Tuesday, January 8, 2013

Hritserver 0.2.0 released

I've made an early release of hritserver 0.2.0. The version number reflects my rough feeling that this is about 20% finished. That doesn't sound like much, but most of the unfinished part lies in the services it will eventually perform. The basic infrastructure of hritserver (the merging, formatting and importing facilities) is closer to 90% complete.

There's a mixed import dialog in this release that should be able to import any TEI-Lite document. The import process can be configured in several ways, or you can just follow the defaults. The result is always supposed to "work". The stages for XML files are:

  1. XSLT transform of XML sources. The default transform fixes some anomalies in the TEI data model that make it hard to convert into HTML. Or you can substitute your own stylesheet to do anything you like. In TEI-Lite this step also splits the input into the main text and any embedded <note>s and <interp>s.
  2. The versions within each XML file (add/del, sic/corr, abbr/expan, app/rdg etc.) are all split into separate files. This is done safely by first splitting the document into a variant graph wherever a valid splittable tag is found, and then writing the graph out as N separate files.
  3. Each individual file is then separated into its remaining markup and its plain text. The markup may be in several files, such as a separate one for page divisions.
  4. The markup files and the plain text files are merged into CorCode and CorText multi-version documents.
  5. The CorCodes and the CorTexts are then stored in the database.
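
To make stages 2 and 3 more concrete, here is a toy sketch in shell. It is not how hritserver does it: the real importer builds a variant graph, whereas these hypothetical sed functions (split_before, split_after and strip_markup are names made up for this example) only cope with non-nested add/del pairs on a single line.

```shell
#!/bin/sh
# Toy illustration only; hritserver itself uses a variant graph.
# All function names here are hypothetical.

# Version before revision: drop added text, keep deleted text.
split_before() {
  sed -E 's#<add>[^<]*</add>##g; s#</?del>##g'
}

# Version after revision: drop deleted text, keep added text.
split_after() {
  sed -E 's#<del>[^<]*</del>##g; s#</?add>##g'
}

# Stage 3, crudely: remove every remaining tag, leaving plain text.
strip_markup() {
  sed -E 's#<[^>]*>##g'
}

line='<p>The <del>quick</del><add>slow</add> fox</p>'
printf '%s\n' "$line" | split_before | strip_markup   # The quick fox
printf '%s\n' "$line" | split_after  | strip_markup   # The slow fox
```

One input line yields two versions, each of which is then reduced to plain text. The real stage 3 keeps the removed markup in separate files rather than throwing it away.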

Imported files should then appear in the Home tab of the Test interface.

Installation

You can try downloading version 0.2.0 if you use a Mac. I'll get around to supporting other platforms presently. To download version 0.2.0 you should use git. If you don't have it you can install it easily via Homebrew:

brew install git

Then download the latest hritserver code:

git clone https://github.com/HRIT-Infrastructure/hritserver.git

That creates a folder "hritserver" in the current directory. Then you should run the installer:

cd hritserver
sudo ./install-macosx.sh

And it should work. This version will be tested and gradually improved. The advantage of using git is that you can easily update to the latest version by typing

git pull

in the hritserver directory.
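
If you'd rather not remember whether you have already cloned the repository, the clone and update commands above can be wrapped in one small script. This is only a sketch: the function names next_step and fetch_hritserver are invented for this example and are not part of hritserver.

```shell
#!/bin/sh
# Sketch only: next_step and fetch_hritserver are hypothetical helper names.
REPO_URL="https://github.com/HRIT-Infrastructure/hritserver.git"

# Print "pull" if $1 already holds a git checkout, otherwise "clone".
next_step() {
  if [ -d "$1/.git" ]; then
    echo pull
  else
    echo clone
  fi
}

# Clone on first use, pull on every later run.
fetch_hritserver() {
  dir=${1:-hritserver}
  case $(next_step "$dir") in
    pull)  (cd "$dir" && git pull) ;;
    clone) git clone "$REPO_URL" "$dir" ;;
  esac
}
```

Calling fetch_hritserver with no arguments works on the "hritserver" folder in the current directory. You would still run the installer yourself afterwards.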