Monday, January 24, 2011

Fear and XML

I still have strong memories of my first academic position. For the first time, I would have my own office and desktop computer. It was still the days of Win3, and for the first time I was going to use the cutting-edge word processor, MS Word 2.0! Having previously been a WordPerfect user, this was a strange moment for me, and it was even more jarring because the prior user of the desktop had turned on the "reveal codes" option in Word: there were no white spaces in the document; spaces were represented by black bullet points and carriage returns by the ¶ symbol. It was incredibly annoying and it made it difficult to read my texts. So why did Microsoft even have this option? It was their imperfect attempt to help authors with typesetting, especially if they followed the rule that there have to be two spaces after a period (which I recently discovered is a typographical sin!). As useful as it was, this type of "markup" was intrusive and I was very relieved when I finally figured out how to turn it off. I did from time to time turn it back on, but for the most part I was happy enough to have this type of markup sitting under the hood.

I was reminded of this experience as I was reading some recent exchanges on Twitter about how some users want XML elements to be in the background in an editor. Hugh Cayless, on his blog, recently mused that this demand was grounded in fear. I think he is right: some scholars look at the predominant way in which digital humanists manage and process texts and become fearful that they would have to do the same. This is not so much fear about XML per se as it is fear about change. For all our boasting of being trend-setting and critical thinkers, we academics are conservative at heart. Cayless has gone even further to argue that any XML editor that permits hidden tags will ultimately be a losing proposition. This is because any schema (but Cayless has TEI and its minions in mind here) is ultimately a text model and to hide it from the author/reader is to misunderstand its functionality.

So far I completely agree, but I would want to make a distinction here. As a text editor, sometimes I want to simply read the text, especially when I am transcribing an unpublished manuscript in Latin. I want to examine the flow of the words, the sentences and even paragraphs, without tripping over an XML element that is describing part of the text's content or structure. Now I am the first to admit that the semantic units of words, sentences and paragraphs are already marked up, or encoded, by using white spaces and/or punctuation. And yet, that kind of markup is so ingrained in my reading skill that it does not diminish the usability of the text displayed on screen. Add a few angle brackets, however, and my reading strategy has to change.

For the T-PEN project, exposing and hiding XML tags is an important part of its development. We have two central goals in our mission: (1) to create a digital tool that paleographers and text editors will use when working with digital images of manuscripts; and (2) to integrate transcription and XML encoding; in other words, to bring together the acts of composition and encoding. The second goal is particularly important since it means that decisions about representing structure are made as you create the text with the manuscript page before you. But this integration can present a challenge to those scholars who may not yet feel fully comfortable with XML encoding. And even those scholars who are unfazed by XML may still want to read the text qua text. We thus want to give users the choice of how their transcriptions are displayed. Users can certainly draw upon a palette of tags (which can be drawn from any schema including TEI) to insert as they transcribe, but they can also banish those tags to leave a naked text. We are even considering allowing users to select a color code for each element, so that they could have a third option; but that would depend greatly upon how many tags are in operation, and whether that might lead to rather unsightly text decoration.
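As a rough illustration of that choice between tagged and naked text, here is a minimal Python sketch. It is not T-PEN's actual code: the `render` function, the `show_tags` switch, and the sample line with its element names are all hypothetical, chosen only to show that the same stored transcription can be displayed either way.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription line; the element names are illustrative only
# and are not taken from T-PEN or from any particular schema.
TAGGED = "<line>In <expan>principio</expan> creavit <add>Deus</add> caelum</line>"


def render(tagged: str, show_tags: bool) -> str:
    """Return the line as stored, or the same line with every tag hidden."""
    if show_tags:
        return tagged
    # Parse the fragment (this raises ParseError if it is not well formed)
    # and join the text nodes in document order, dropping the markup.
    root = ET.fromstring(tagged)
    return "".join(root.itertext())


if __name__ == "__main__":
    print(render(TAGGED, show_tags=True))
    print(render(TAGGED, show_tags=False))
```

The point of the sketch is that hiding the tags is only a change of view: the encoded line is what is stored, and the naked text is derived from it on demand.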

This strategy also reflects our concern for usability. Usability is not just about ease of use, since ease is directly related to comfort. Cayless is quite correct: there is a good deal of irrational fear about XML amongst humanist scholars. However, if we want to see the number of practitioners of digital humanities grow, then there have to be ways to overcome that fear. Allowing T-PEN users to hide and expose XML tags may increase the comfort level, and this can lead to better use. Ultimately, it will provide a good experience for the fearful, who may finally see that XML tags are the essential components of a usable text model.


  1. OK, I should start by saying that Hugh's comment, "we've all seen weirdly mis-formatted documents, where...the writer just made it bold, with a bigger font, and maybe put a couple of newlines after it. Maybe you've done this yourself, when you couldn't figure out the 'right' way to do it," convinces me I'm not even qualified to enter into this discussion.

    Which won't stop me.

    What I'm wondering is whether we could have a discussion about what precisely the intellectual work is that goes on in the throes of first-pass transcription. Partly it's reading. Lots of it is parsing, and re-parsing, and trying again to parse - which is, of course, a major target of encoding, but is also one reason some of us are wary of adding an additional level of stuff that seems to the novice to be hard to parse itself. But a whole lot of what goes on in transcription is VISUAL: what the heck am I seeing? Can I clarify for myself what I'm seeing by making a visual model of it – writing or typing in close imitation? At that level of transcription, which in some projects is likely to occur every few words or every few lines, you are nowhere near the stage of making a mental model of the text; you are instead doing precisely that kind of reproduction of the APPEARANCE of a text that TEI is NOT about. It's that recursive process of visualizing, parsing, and reading that is accommodated by an interface that lets you switch back and forth between seeing and not seeing tags.

    I wonder if it would be helpful in these discussions to have people share what their first-pass transcriptions actually look like. I have the feeling that when we argue about mental models we have a qualia problem we won't get over without some real-world examples of how people actually work. I'll try to post some of my own editorial sausage-making in the next few days, embarrassed as I'd be to show it in public.

  2. Carin:

    You have made a very important point, something which I think the T-PEN project may be unearthing slowly. Since we are trying to integrate composition/authoring and encoding into one place, this means that the "base" text is highly undeveloped. It could be that it takes several pages of transcription before you realize you are seeing structural patterns that need to be encoded as such.

    And transcriptions (at least in my experience) are messy and inchoate for a very long time, until I have established a rhythm that matches the scribe, or until I have a larger amount of text that can help me decode some obscure abbreviation I keep tripping over. All this certainly is very different from, say, taking a stable transcription and loading it up in something like Oxygen to mark it up.

    I welcome your perspective on this, but I doubt your first-pass transcriptions are any worse than my terrible ones. I hope you can show us some examples, as I think this would be very helpful.

  3. Q, you've stated clearly my experience of the long contingency of the transcription process. It is, indeed, like a dance with a scribe, and sometimes you don't even know at the outset how many people you're dancing with – especially when there are glosses involved.

    I will try to work up some examples that I can narrate to try to elucidate my thinking process.

  4. I like the attention to visuality in this discussion of the ways in which XML tags might be a kind of visual noise in the transcription-in-process. I also wonder about the extent to which our tag sets determine what we see, what we pay attention to in a manuscript, and what we (necessarily) overlook.

    When I work on a transcription (for the Blake Archive), I have the MS image file open and I toggle back and forth between that and my transcription, which is an XML file in Oxygen. I sometimes also have a copy of our MS tag set open (another Word doc). I encode as I go, stopping once in a while to upload the transcription to our project's testing site to check out how things are displaying (without the noise of the tags). Others on our project, however, write out their transcriptions longhand first (or in Word), and then encode in Oxygen. These small variations in the process of reading, transcribing, and encoding suggest, I think, the extent to which the process of encoding--using tags to describe text--feels comfortable, and has (or has not) become naturalized. Our tag set is pretty "straightforward;" that is, we only encode authorial changes to the text. Of course, we often have questions about whether we see a revision in the MS, or what kind of revision we see, but we can usually work around those small questions, and come back to them if/when we need to.