Monday, January 24, 2011

Fear and XML

I still have strong memories of my first academic position. For the first time, I would have my own office and desktop computer. It was still the days of Win3, and I was about to use the cutting-edge word processor of the day, MS Word 2.0! Having previously been a WordPerfect user, I found this a strange moment, and it was even more jarring because the prior user of the desktop had turned on the "reveal codes" option in Word: there were no white spaces in the document; spaces were represented by black bullet points and carriage returns by the ¶ symbol. It was incredibly annoying and it made my texts difficult to read. So why did Microsoft even have this option? It was their imperfect attempt to help authors with typesetting, especially if they followed the rule that there have to be two spaces after a period (which I recently discovered is a typographical sin!). As useful as it was, this type of "markup" was intrusive and I was very relieved when I finally figured out how to turn it off. I did turn it back on from time to time, but for the most part I was happy enough to have this type of markup sitting under the hood.

I was reminded of this experience as I was reading some recent exchanges on Twitter about how some users want XML elements to be in the background in an editor. Hugh Cayless, on his blog, recently mused that this demand was grounded in fear. I think he is right: some scholars look at the predominant way in which digital humanists manage and process texts and become fearful that they would have to do the same. This is not so much fear of XML per se as it is fear of change. For all our boasting of being trend-setting and critical thinkers, we academics are conservative at heart. Cayless has gone even further, arguing that any XML editor that permits hidden tags will ultimately be a losing proposition. This is because any schema (but Cayless has TEI and its minions in mind here) is ultimately a text model, and to hide it from the author/reader is to misunderstand its functionality.

So far I completely agree, but I would want to make a distinction here. As a text editor, sometimes I want to simply read the text, especially when I am transcribing an unpublished manuscript in Latin. I want to examine the flow of the words, the sentences and even paragraphs, without tripping over an XML element that is describing part of the text's content or structure. Now I am the first to admit that the semantic units of words, sentences and paragraphs are already marked up, or encoded, by using white spaces and/or punctuation. And yet, that kind of markup is so ingrained in my reading skill that it does not diminish the usability of the text displayed on screen. Add a few angle brackets, however, and my reading strategy has to change.

For the T-PEN project, exposing and hiding XML tags is an important part of its development. We have two central goals in our mission: (1) to create a digital tool that paleographers and text editors will use when working with digital images of manuscripts; and (2) to integrate transcription and XML encoding; in other words, to bring together the acts of composition and encoding. The second goal is particularly important since it means that decisions about representing structure are made as you create the text with the manuscript page before you. But this integration can present a challenge to those scholars who may not yet feel fully comfortable with XML encoding. And even those scholars who are unfazed by XML may still want to read the text qua text. We thus want to give users the choice of how their transcriptions are displayed. Users can certainly draw upon a palette of tags (which can be drawn from any schema including TEI) to insert as they transcribe, but they can also banish those tags to leave a naked text. We are even considering allowing users to select a color code for each element, so that they could have a third option; but that would depend greatly upon how many tags are in operation, and whether that might lead to rather unsightly text decoration.
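To make the display options concrete, here is a minimal sketch in Python of what hiding tags and color coding them could look like. It is only an illustration and not T-PEN's implementation: the function names `hide_tags` and `color_tags`, the sample line, the palette, and the choice of HTML `span` elements for the colored view are all assumptions, and a simple regular expression stands in for a real XML parser because a transcription in progress may not yet be well formed.

```python
import html
import re

# Matches an opening, closing, or self-closing tag and captures the element name.
# A rough pattern for illustration only; it does not handle comments or CDATA.
TAG_PATTERN = re.compile(r"</?([A-Za-z_][\w.\-]*)[^<>]*>")


def hide_tags(transcription: str) -> str:
    """Return the 'naked' text: every tag removed, the text content kept."""
    return TAG_PATTERN.sub("", transcription)


def color_tags(transcription: str, palette: dict[str, str], default: str = "gray") -> str:
    """Return HTML in which each tag is shown in the color chosen for its element."""
    parts = []
    position = 0
    for match in TAG_PATTERN.finditer(transcription):
        # Plain text between tags is escaped and passed through unchanged.
        parts.append(html.escape(transcription[position:match.start()]))
        color = palette.get(match.group(1), default)
        parts.append(
            f'<span style="color:{color}">{html.escape(match.group(0))}</span>'
        )
        position = match.end()
    parts.append(html.escape(transcription[position:]))
    return "".join(parts)


# Hypothetical transcription line with two hypothetical elements.
line = "In <abbr>princ</abbr> erat <name>Verbum</name>"
print(hide_tags(line))                      # the text without any tags
print(color_tags(line, {"abbr": "teal"}))   # tags kept, but colored per element
```

In this sketch the stored transcription is never altered; each view is derived from it on demand, so a user could switch between exposed, hidden, and colored tags without losing any encoding.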

This strategy also reflects our concern for usability. Usability is not just about ease of use; ease is directly related to comfort. Cayless is quite correct: there is a good deal of irrational fear about XML amongst humanist scholars. However, if we want to see the number of practitioners of digital humanities grow, then there have to be ways to overcome that fear. Allowing T-PEN users to hide and expose XML tags may increase the comfort level, and this can lead to better use. Ultimately, it will provide a good experience for the fearful, who may finally see that XML tags are the essential components of a usable text model.

Friday, January 21, 2011

Patrick joins the team

Greetings! My name is Patrick Cuba, the GUI Web Developer on the T-PEN project. As a member of the team, I am responsible for the graphical design and interface. This project is an exciting challenge: to enhance the experience of transcription with deference to the analog nature of the manuscripts and the established conventions and habits of the scholars who will be using it. T-PEN should be a tool that feels obvious, not something else to learn. My task is to create a flexible and simple interface that optimizes usability, intuitiveness, and comfort. Your feedback and input about the process of transcription (such as comments on entries like "How Do You Transcribe?") are invaluable to my work.

I have been employed by Saint Louis University for nearly a decade. My work in Student Development had been instructive, engaging, and fulfilling. When the position for GUI Web Developer was posted, however, I saw an incredible opportunity to couple my undergraduate degrees in English and Philosophy & Religion (Truman State University) with my passion for the application of technology to the sincere advancement of knowledge and culture. As I learned about T-PEN and previous projects of the Center for Digital Theology at Saint Louis University, I found a door cracked open for myself in digital humanities, which had been previously unknown to me. To have entered into such a project with a competent and welcoming team already hard at work is both intimidating and inspiring.

The first week is often too full of orientations about retirement planning, wrestling with the IT Department to get appropriate permissions, and finding the best lunch spots. Nevertheless, I will be working diligently to offer quality updates and adjustments to T-PEN that increase usability, expand compatibility, and demonstrate the possibilities of technology intelligently applied.

Friday, January 14, 2011

T-PEN Presentation at the NEH PI Meeting in September 2010

In September 2010, Jim Ginther and Abigail Firey attended a meeting for all PIs of projects funded by the NEH's Office of Digital Humanities. One of our tasks was to present the T-PEN project in two minutes. These "lightning rounds" were taped and are now on YouTube. Abigail insisted on being the "silent partner" so viewers have to put up with Ginther's blather!