I still have strong memories of my first academic position. For the first time, I would have my own office and desktop computer. It was still the days of Win3, and I was about to use the cutting-edge word processor of the time, MS Word 2.0! As a former WordPerfect user, I found this a strange moment, and it was even more jarring because the prior user of the desktop had turned on the "reveal codes" option in Word: there were no white spaces in the document, since spaces were represented by black bullet points and carriage returns by the ¶ symbol. It was incredibly annoying and it made my texts difficult to read. So why did Microsoft even have this option? It was their imperfect attempt to help authors with typesetting, especially if they followed the rule that there have to be two spaces after a period (which I recently discovered is a typographical sin!). As useful as it was, this type of "markup" was intrusive and I was very relieved when I finally figured out how to turn it off. I did turn it back on from time to time, but for the most part I was happy enough to have this type of markup sitting under the hood.
I was reminded of this experience as I was reading some recent exchanges on Twitter about how some users want XML elements to be in the background in an editor. Hugh Cayless, on his blog, recently mused that this demand was grounded in fear. I think he is right: some scholars look at the predominant way in which digital humanists manage and process texts and become fearful that they would have to do the same. This is not so much fear about XML per se as it is fear about change. For all our boasting of being trend-setting and critical thinkers, we academics are conservative at heart. Cayless has gone even further and argued that any XML editor that permits hidden tags will ultimately be a losing proposition. This is because any schema (though Cayless has TEI and its minions in mind here) is ultimately a text model, and to hide it from the author/reader is to misunderstand its functionality.
So far I completely agree, but I would want to make a distinction here. As a text editor, sometimes I simply want to read the text, especially when I am transcribing an unpublished manuscript in Latin. I want to examine the flow of the words, the sentences and even the paragraphs, without tripping over an XML element that describes part of the text's content or structure. Now I am the first to admit that the semantic units of words, sentences and paragraphs are already marked up, or encoded, by white spaces and/or punctuation. And yet that kind of markup is so ingrained in my reading skill that it does not diminish the usability of the text displayed on screen. Add a few angle brackets, however, and my reading strategy has to change.
For the T-PEN project, exposing and hiding XML tags is an important part of its development. We have two central goals in our mission: (1) to create a digital tool that paleographers and text editors will use when working with digital images of manuscripts; and (2) to integrate transcription and XML encoding; in other words, to bring together the acts of composition and encoding. The second goal is particularly important since it means that decisions about representing structure are made as you create the text with the manuscript page before you. But this integration can present a challenge to those scholars who may not yet feel fully comfortable with XML encoding. And even scholars who are unfazed by XML may still want to read the text qua text. We thus want to give users the choice about how their transcriptions are displayed. Users can certainly draw upon a palette of tags (which can be drawn from any schema, including TEI) to insert as they transcribe, but they can also banish those tags to leave a naked text. We are even considering allowing users to select a color code for each element, so that they would have a third option; but that would depend greatly upon how many tags are in operation, and whether that might lead to rather unsightly text decoration.
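To make the idea of exposing and hiding tags a little more concrete, here is a minimal sketch in Python. It is not T-PEN's actual code: the function name `render`, the `show_tags` switch and the sample line are all invented for illustration, and it assumes the transcribed line is a well-formed XML fragment.

```python
import xml.etree.ElementTree as ET


def render(transcription: str, show_tags: bool = True) -> str:
    """Return a transcribed line with its XML tags exposed or hidden.

    Hypothetical helper for illustration only; assumes the line is a
    well-formed XML fragment.
    """
    if show_tags:
        # Exposed view: the encoded line exactly as the transcriber typed it.
        return transcription
    # Wrap the fragment in a dummy root so that a line containing several
    # top-level elements mixed with plain text still parses as one document.
    root = ET.fromstring(f"<wrapper>{transcription}</wrapper>")
    # itertext() walks the tree in document order and yields only the text,
    # which leaves the "naked" reading text.
    return "".join(root.itertext())


# Invented sample line using two TEI element names.
line = "In <abbr>principio</abbr> creavit <persName>Deus</persName> caelum"

print(render(line, show_tags=True))
print(render(line, show_tags=False))
```

The point of the sketch is that the encoded line stays the single source: the naked text is only a view derived from it, so hiding the tags throws nothing away.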
This strategy also reflects our concern for usability. Usability is not just about ease of use, since ease is directly related to comfort. Cayless is quite correct: there is a good deal of irrational fear about XML amongst humanist scholars. However, if we want to see the number of practitioners of digital humanities grow, then there have to be ways to overcome that fear. Allowing T-PEN users to hide and expose XML tags may increase their comfort level, and this can lead to better use. Ultimately, it will provide a good experience for the fearful, who may finally see that XML tags are the essential components of a usable text model.