Friday, December 23, 2011


Questions from my colleagues at Loyola have inspired me to turn nmerge into a service. The idea of HritServer (Humanities Resources, Infrastructure and Tools) is to build a self-contained application that runs as a service from the command line and provides the infrastructure to build any digital humanities website. It is a collection of the tools I have written over the past few years, including nmerge, the XML import/export tools, the formatter and the GUIs I developed for Digital Variants. The structure will be a little like Tomcat:

  1. A back-end administrative interface allowing the admin user to add or import new texts and edit existing ones.
  2. Example GUIs in Java and PHP that exercise each facility provided by HritServer: compare, view variants, indexed search, tree-view.

The service will be strictly RESTful; everything will be done via HTTP. The database at the back end will be a modern key-value store rather than an old-fashioned relational design. Each resource will be accessible via a simple URL, with no complex access mechanism needed. For example, to get the formatted HTML of act 1, scene 1 of the First Folio text of Shakespeare's King Lear, one would only need to fetch the URL:

Anything can be stored at a similar URL in a simple hierarchical structure. The HTML is generated on the server from plain text and overlapping markup sets, and is never stored. Passing different parameters to the same URL produces differently formatted versions of the same text. Different encodings of the same text can likewise be realised by specifying a different collection of markup sets. The idea is to take the complexity out of building such websites, and to maximise automation by providing a powerful base infrastructure that will work for any set of texts. It should be achievable within a reasonable time, because almost everything already exists (although some tools are still incomplete). All I have to do is stitch it together and test it.
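To make the URL idea a bit more concrete, here is a rough Python sketch of how a hierarchical path plus parameters might map onto a key-value lookup and on-the-fly HTML generation. It is only an illustration, not HritServer's actual design: the path, the `markup` parameter name, the layout of the store, the sample text and the function names are all made up for the example.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical key-value store: hierarchical path -> plain text plus
# named markup sets. Each set is a list of (start, end, tag) ranges
# over the text; ranges from different sets are allowed to overlap.
STORE = {
    "/example/play/act1/scene1": {
        "text": "Enter Kent and Gloucester.",  # placeholder sample text
        "markup": {
            "stage": [(0, 26, "i")],
            "names": [(6, 10, "b"), (15, 25, "b")],
        },
    },
}


def render_html(text, ranges):
    """Render possibly overlapping (start, end, tag) ranges as well-nested HTML."""
    # Cut the text at every range boundary, then wrap each segment
    # in all the tags whose range covers it.
    cuts = sorted({0, len(text)} | {p for s, e, _ in ranges for p in (s, e)})
    out = []
    for a, b in zip(cuts, cuts[1:]):
        tags = [t for s, e, t in ranges if s <= a and b <= e]
        opening = "".join(f"<{t}>" for t in tags)
        closing = "".join(f"</{t}>" for t in reversed(tags))
        out.append(opening + escape(text[a:b]) + closing)
    return "".join(out)


def fetch(url):
    """Resolve a URL to a stored resource and format it on demand."""
    parts = urlsplit(url)
    resource = STORE[parts.path]  # simple key lookup, no query language
    # Hypothetical parameter: ?markup=<set name>, repeatable.
    wanted = parse_qs(parts.query).get("markup", [])
    ranges = [r for name in wanted for r in resource["markup"].get(name, [])]
    return render_html(resource["text"], ranges)  # HTML is never stored


if __name__ == "__main__":
    print(fetch("/example/play/act1/scene1?markup=names"))
    print(fetch("/example/play/act1/scene1?markup=stage&markup=names"))
```

The same path yields a different rendering depending on which markup sets are requested, and leaving the parameter off simply returns the escaped plain text.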

I'll have to write a formal software specification, but I've already made a good start on coding it.