NEHETG2006

From Digital Scholarship Group

WWP Introductory TEI Seminars (2006-2009)

In 2006 the WWP was awarded funding for a two-year series of 12 introductory seminars in scholarly text encoding, to be hosted by humanities centers and digital projects. The grant also provided seminar participants with access to advice and consulting by WWP staff (e.g., assistance with grant proposal development, schema customization, and encoding). As part of the grant the WWP set up a discussion list, WWP-ENCODING, for discussions of text encoding by participants in this seminar series and other WWP workshops. The series was originally scheduled to run from January 2007 through December 2008, but was extended through June 2009.

The full schedule of seminars is available at the WWP web site.

Participant support

We provided significant follow-up support to the following participants.

Susan Cole

Professor Emeritus Cole had a significant corpus (~3.5 MiB in almost 250 files arranged in just under 50 directories) of ancient Greek inscriptions. However, she had lost access to this corpus sometime after it had been moved to another institution for some processing. The WWP facilitated the recovery of a copy from the archives at the other institution (with great thanks to Dot Porter); because of the illness and untimely death of a key party, this took months.

Once we were able to examine these texts, we discovered that they were encoded as TEI P4 files using Beta Code for the Greek. They had many serious, ubiquitous, and (luckily) mostly consistent encoding errors. In addition, several files contained large swaths of text that had been inadvertently duplicated, for no apparent reason. During 2008 the WWP set up a shared repository so that both Professor Cole and WWP staff could make changes to the files. The WWP, in frequent contact with Professor Cole, fixed the well-formedness and validity errors in the files, while Professor Cole found and removed the duplicate chunks of text.
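As an illustration of the kind of check involved in the well-formedness pass, here is a minimal sketch in Python. It is not the tooling the WWP actually used, and the function name well_formedness_error is hypothetical. It uses only the standard library's xml.etree.ElementTree parser, which tests well-formedness but not validity against the TEI DTD. Files that rely on entities declared in an external DTD may be reported as errors by this parser.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed XML, else the parser's message.

    Hypothetical helper for illustration only: it checks well-formedness,
    not validity against any TEI DTD or schema.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# A correctly nested fragment passes; a mismatched tag is reported.
good = "<text><body><p>ok</p></body></text>"
bad = "<p><hi>unclosed</p>"
print(well_formedness_error(good))
print(well_formedness_error(bad))
```

Run over every file in a corpus, a check like this would give a list of files still needing repair before any validity checking is attempted.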

The WWP is now assisting Professor Cole in upgrading her home-grown encoding conventions to proper TEI encoding (e.g., she had used "vv." to indicate a small horizontal space, which has been changed to <space dim="horizontal" extent="small"/>). Very soon we will be converting the Beta Code to Unicode using Hugh Cayless's transcoder program. For the past several months we have been testing the transcoder and going back and forth with Hugh to fix bugs and add the features needed to handle Professor Cole's files.
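The "vv." change is the sort of global substitution that can be scripted. The following Python sketch shows one way it might be done; it is an assumption for illustration, not the method the WWP used, and the names SPACE_TAG, VV_PATTERN, and replace_vv are hypothetical. It treats the file content as plain text and assumes "vv." occurs only in text content, never inside a tag or attribute value.

```python
import re

# The replacement element quoted in the text above.
SPACE_TAG = '<space dim="horizontal" extent="small"/>'

# Match "vv." only where it starts at a word boundary, so that a longer
# run of letters ending in "vv." is left alone. (Assumed convention.)
VV_PATTERN = re.compile(r"\bvv\.")


def replace_vv(text):
    """Replace each stand-alone "vv." marker with the TEI space element."""
    return VV_PATTERN.sub(SPACE_TAG, text)


print(replace_vv("abc vv. def"))
```

Because the pattern is applied to raw text rather than to a parsed tree, the result would still need to be checked for well-formedness and validity afterwards.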

Once this is accomplished, we are hoping to be able to get these files into P5 (which is likely to be quite easy) and then assist Professor Cole in getting them into an EpiDoc-like format (which is likely to be somewhat more difficult).

Amelia Wong

Kent Hooper

Professor Hooper has created a bibliographical listing of secondary literature relating to the life and work of the early-20th-century German writer and artist Ernst Barlach, from which he would like to generate a web site that would be useful to up-and-coming Barlach scholars, as well as a print publication. The bibliography of over 3,600 entries was originally created in a word processor. Professor Hooper is almost completely without technical support in this arena at his home institution, so a friend of his from a different institution, Rick, has been helping him. Rick has managed to get the majority of the bibliography out of the word processor format and into an XML-like format.

At the Wheaton seminar itself, WWP staff helped Professor Hooper change this tagged file into well-formed XML (although not yet valid TEI). Furthermore, WWP staff merged Professor Hooper's two files into one, making them easier for him to manage.

Since that time WWP staff have been performing the various global changes needed to make Professor Hooper's file valid TEI, and Professor Hooper has been making the (thousands of) individual changes needed.

In November WWP staff and Professor Hooper's friend Rick began collaborating on a framework for generating XHTML from the input TEI. However, it now seems that personnel from the Bamboo project may be doing most of this work for the Barlach Bibliography project.

Catherine Goebel

Sandra Petrulionis and Noelle Baker