Saturday, October 6, 2012

Text and Facsimile Side-by-Side

One of the most common requests you hear about in digital humanities is to display a facsimile next to its transcription. The TILE project is one example, another was developed some years back, perhaps for the first time, by some guys at the University of Kentucky, for their electronic Beowulf edition. As soon as you think about putting a facsimile next to its transcription you see the need to link areas of the image to spans of text in the transcription. Sometimes the facsimile is hard to read, and it only becomes useful if someone has already gone over it to link those areas and spans to help the reader to work out what corresponds to what. This gives rise to two main problems:

  1. How to provide an environment to define those links and save them
  2. How to display this information and update it as the user scrolls through the text and zooms in on the image

Both solutions must work over the Web, because that is the medium everyone wants to work in. TILE dealt with problem 1. The Beowulf edition was mostly about 2, I think (you have to buy the DVD to find out). We are tackling both problems as part of our AustESE project - to provide an editing environment for electronic scholarly editions. Since someone else is tackling problem 1 I'll be talking about my solution to problem 2. I'd be curious to know if anyone else has already developed something similar.

How it works

This is an screen dump of the live demo on AustESE. The user has already zoomed into an area of interest and has just moved the mouse over a polygonal area. The correct span of text on the right has been highlighted. This seems disarmingly simple but is quite hard to achieve in the medium of the Web. Years ago, when HTML was first developed, imagemaps were invented that allowed rectangular, circular and polygonal AREAs to be defined for an image. They can be defined for example in a tool like Gimp. Mouse movements and mouse clicks could be detected through the maps to fire events that might, for example, be used to highlight parts of the transcription. What was not provided, however, was any way to style or graphically affect the image map. So to make the map more interactive we need to employ some fancy new Web tricks.

The first thing I found that highlights areas on an image map is a thing called maphighlight. Unfortunately it doesn't do zooming, scaling or scrolling. Scaling is required to fit the image and its map onto the user's screen size. Zooming is used to drill down to details on an image that is unlikely to be readable on many devices otherwise. And scrolling is needed to move around the zoomed image. But the maphighlight designers did hit upon a cool workaround for the lame functionality of MAP, AREA and IMG in HTML:

Their display is broken into three panes layered on top of each other. The top layer is the original image and its image map. The image is set to be invisible, so only the defined regions on the image are "visible" to the mouse and are activated as it moves over them. This is important because if the imagemap was behind the canvas the mouse events wouldn't fire and you'd get no interactivity*. Looking through the image and its map the user can see a transparent canvas, which gets drawn to in response to the mouse movements. Behind it, in the background is the scaled and panned image itself, this time visible. What this needs to work is support for HTML5. Fortunately all modern browsers have the required features. To get the zooming and scrolling to work I had to modify and augment the existing maphighlight jQuery plugin, which I renamed maphilite. There are some more features I want to add, but it basically works. I've tested it on the latest versions of Chrome, Firefox, Safari and Opera. I'm pretty sure it will work also in IE9.

Why not support older browsers? Firstly, it's just too hard. Secondly, anyone can install a free browser on their computer even if they currently use IE8 or less. Thirdly, in a few more months or years everyone will be using HTML5, so why bother?

I'd like to stress that this is just a demo at this stage. The next stage will be to create a bunch of image maps and define them in the HRIT system so they can be applied to images and transcriptions on the fly as the user scrolls through the text.

* I imagine some techies are asking 'Why not just use CSS z-indicies to stack the three panes?' Because MAPs are bound to their images and are not separately stackable, that's why. Either you have the canvas in front of the image and you get no detection of mouse movements or you can't see the canvas behind the image. Pointer-events might get around that but they don't work on IE or Opera. As a CSS-defined background the image can be scrolled by use of background-position and scaled with background-size. Otherwise I don't see how you can get it to work.

No comments: