Wednesday, February 18, 2009

MVD Paper available online

Elsevier have published the paper I wrote with Bob Colomb about Multi-Version Documents online. The Greek text has dropped out of Figure 16, but the rest is good. I hope this has an impact, and it is certainly something I will be referring to in future. It represents everything I knew about the MVD idea and its implications as of December 2008.

Thesis Complete

This morning I submitted a near-final draft of my thesis 'Multiple Versions and Overlap in Digital Text' to my two supervisors. The last chapter describes some new work on aligning multi-version texts automatically. Here's a table taken from the thesis which summarises its performance on a variety of multi-version texts.

The SZ column is the average version size in kilobytes, and NV is the number of versions. TT is the total time taken to merge all versions, and AT is the average time to merge one version after the first; both are in seconds. The test machine had a 1.66GHz Core Duo processor, and the tests used only one core. The Romulo doesn't merge properly at the moment because there is almost nothing in common between the versions, so the merge times don't mean much in this case.
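To make the relationship between the timing columns concrete, here is a minimal sketch of how TT and AT could be derived from per-version merge times. The function name `merge_stats` and the sample timings are hypothetical and not taken from the thesis; the sketch also assumes that TT includes the time to load the first version, while AT averages only over the versions merged after it.

```python
def merge_stats(times_s):
    """Summarise merge timings for one multi-version text.

    times_s[i] is the (hypothetical) time in seconds taken to add
    version i+1 to the document; times_s[0] is the first version.
    Returns (NV, TT, AT).
    """
    nv = len(times_s)
    if nv < 2:
        raise ValueError("need at least two versions to compute AT")
    tt = sum(times_s)                # assumed: total over all versions
    at = sum(times_s[1:]) / (nv - 1) # average over versions after the first
    return nv, tt, at


# Made-up timings for a three-version text, purely for illustration.
nv, tt, at = merge_stats([0.5, 1.0, 2.0])
print(f"NV={nv} TT={tt:.2f}s AT={at:.2f}s")
```

The first version is left out of AT because nothing exists to merge it against; it is the later versions that correspond to saving an edit back into an existing document.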

The key is the AT column, which shows how long it takes to 'save' an edited version back into the document. As you can see, it's pretty fast, considering that this is a hard problem. As far as quality goes, I can't see any bad alignments or false transpositions, except in the Malvezzi case. Once I can coerce that input into a sensible format, it should work too.


It looks as if I will be going to Balisage this year. I will be presenting a boiled-down version of Chapter 5 of the thesis, which is all new work. I'll be very interested to hear the attendees' reactions, especially as I can now demonstrate the theory. (The conference motto is 'There is nothing so practical as a good theory'.)