NEHDHS2007

From Digital Scholarship Group

NEH Digital Humanities Startup Grant

Grant information here

This is an 18-month project to theorize and develop an approach to encoding detailed information about persons in the metadata and content of the WWP textbase, and to develop a pilot user interface that will exploit this information. The project begins on July 1, 2008 and runs through December 30, 2009.

The grant budget funds approximately 15% of Syd's time and 15% of John's time, plus about $11,000 for student programmers and encoders. The overall goals of the grant are as follows:

  • Create the framework for a TEI personography and populate it with the data from our FileMaker names database
  • Regularize, fix, and groom the data in the personography to a basic level of accuracy (eliminate duplicate records, etc.)
  • Think through the details of name encoding, including the challenges of metaphorical reference, etc.
  • Apply name references to the textbase: in metadata for all files, and in content for a selected subset of files
  • Develop some prototype interface feature or tool that exploits name markup and reference in the WWO collection
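
As a rough illustration of the first goal, the sketch below turns one flat record, of the kind a names database export might produce, into a TEI person entry inside a listPerson. The field names (key, forename, surname, birth_year, death_year), the sample record, and the use of Python are all assumptions made for illustration; they do not describe the actual FileMaker schema or the encoding the project will settle on.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("", TEI_NS)  # serialize TEI elements without a prefix


def tei(tag):
    """Return the namespace-qualified form of a TEI element name."""
    return f"{{{TEI_NS}}}{tag}"


def person_element(record):
    """Build a TEI <person> from one flat record (a dict with hypothetical field names)."""
    person = ET.Element(tei("person"), {f"{{{XML_NS}}}id": record["key"]})
    name = ET.SubElement(person, tei("persName"))
    for tag in ("forename", "surname"):
        if record.get(tag):
            ET.SubElement(name, tei(tag)).text = record[tag]
    # Dates are recorded only when the source record has them.
    for tag, field in (("birth", "birth_year"), ("death", "death_year")):
        if record.get(field):
            ET.SubElement(person, tei(tag), {"when": record[field]})
    return person


def build_personography(records):
    """Wrap one <person> per record in a TEI <listPerson>."""
    list_person = ET.Element(tei("listPerson"))
    for record in records:
        list_person.append(person_element(record))
    return list_person


# Invented sample record, not taken from the names database.
sample_records = [
    {"key": "jexample.aaa", "forename": "Jane", "surname": "Example",
     "birth_year": "1700", "death_year": ""},
]

if __name__ == "__main__":
    print(ET.tostring(build_personography(sample_records), encoding="unicode"))
```

Empty fields are skipped rather than written out as empty elements, so that missing data stays distinguishable from data that has been checked and found unknown; whether that distinction matters is one of the questions the encoding specification would need to answer.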


Basic shape of the project

1. Theorize

  • Theorize the encoding of names and other person references, including problems of fictionality, historical change, gender attribution, and the semantics of the claims made by the document and the encoding
  • Identify and justify a testbed (subset of WWO texts) for the representation of names in content
  • Examine the use of TEI personography encoding to address the specifics of this project, and identify areas where it would need to be extended to address the issues identified above
  • Examine the options for name authority data (LCNA, others)

2. Overhaul and build infrastructure

  • Build tools for assisted encoding of names, name keys, and other data in content
  • Review the name authority data we already have and bring it to the necessary level of consistency and completeness to support the encoding of the testbed
  • Develop an encoding specification for handling persons and names in both metadata and content
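
One way to approach the review of existing name data is to flag likely duplicate records for a human to check, rather than merging anything automatically. The sketch below groups records whose names match after normalization and whose birth years agree. The record layout (key, name, birth_year) and the matching rule are assumptions for illustration only, not a description of the names database or of the project's actual procedure.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Reduce a name to a comparison key: no accents, case, punctuation, or extra spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return " ".join(kept.lower().split())


def duplicate_candidates(records):
    """Return lists of record keys that share a normalized name and birth year.

    Each record is a dict with hypothetical fields: key, name, birth_year.
    The result is a list of candidate groups for manual review.
    """
    groups = defaultdict(list)
    for record in records:
        group_key = (normalize_name(record["name"]), record.get("birth_year") or "")
        groups[group_key].append(record["key"])
    return [keys for keys in groups.values() if len(keys) > 1]


# Invented sample data, not taken from the names database.
sample = [
    {"key": "r1", "name": "Example, Jane", "birth_year": "1700"},
    {"key": "r2", "name": "EXAMPLE,  Jane.", "birth_year": "1700"},
    {"key": "r3", "name": "Example, Jane", "birth_year": "1755"},
]

if __name__ == "__main__":
    print(duplicate_candidates(sample))
```

This rule is deliberately conservative: two records for the same person will not be grouped if one of them lacks a birth year or uses a variant spelling, so it would catch only the easiest cases.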

3. Build

  • Encode the metadata: develop personography encoding for all WWO texts (including data about relationships to other persons and places)
  • Add appropriate name encoding to content in testbed
  • Develop additional name authority data as needed (and, if we can get help from CDI, handle links out to external authority files)
  • Build pilot interface layer
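
Once names in content point at personography entries, a simple consistency check becomes possible: every reference should resolve to an existing person record. The sketch below assumes that name references are encoded as persName elements with a ref attribute of the form "#key", and that person records carry a matching xml:id. That linking convention, the function name, and the sample XML used to exercise it are assumptions for illustration; the project's encoding specification may choose a different mechanism.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def unresolved_references(personography_xml, text_xml):
    """List (key, name text) pairs for persName references with no matching person.

    Both arguments are XML strings. persName elements without a ref attribute
    are ignored, since they make no claim about identity.
    """
    personography = ET.fromstring(personography_xml)
    known = {p.get(XML_ID) for p in personography.iter(f"{{{TEI_NS}}}person")}

    missing = []
    text = ET.fromstring(text_xml)
    for name in text.iter(f"{{{TEI_NS}}}persName"):
        ref = name.get("ref")
        if ref is None:
            continue
        # Keep only the fragment, so "#key" and "file.xml#key" are treated alike.
        target = ref.rsplit("#", 1)[-1]
        if target not in known:
            missing.append((target, "".join(name.itertext()).strip()))
    return missing
```

A report like this could feed the tracking page for the subset texts, since each unresolved key is either a typing error in the content or a person who still needs a personography entry.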

4. Reflect and report

  • Write documentation for name encoding and include in Guide
  • Write white paper on name encoding issues


Notes, tracking pages, and documentation

WWO subset page: Lists the current subset of WWO texts and documents the criteria used in their selection.

WWO subset tracking page: Central location for tracking the encoding and disambiguation of names in the content of WWO subset texts.

Meeting notes: Notes from meetings where names grant issues were discussed; documentation of decisions from those meetings.

Outstanding questions and problems: Running list of questions, problems, and issues pertaining to our personography specification, prototyping, etc.

Research materials: List of reference materials that may be useful in conducting research on people in the WWP's names universe.

Tasks and tracking: General list of grant-related tasks.

FileMaker names database: Description and documentation of the WWP's FileMaker database for recording name and biographical information.

White paper draft: Notes and drafting of final project white paper.

General timeline

Date                        Item
July 1, 2008                grant begins
July - November 2008        develop classification system for names
July - November 2008        establish encoding specification for personography
July - December 2008        overhaul and update existing names data
September - October 2008    develop and apply criteria for creating WWO subset for name encoding in content
January - May 2009          continue in-depth research to expand and correct existing names data
January - May 2009          encode names in content for all WWO subset texts
March - June 2009           develop and test prototype interface tools
May - June 2009             initial draft of white paper
July 2009                   deploy prototype tools in WWO sandbox; complete white paper