WWPmarkupdoc (old)

From Digital Scholarship Group

Important: This page is about the current P4-based internal markup documentation, derived from the old Markup Documentation FileMaker Pro database. For discussion of the new internal markup documentation (development started 2007-12 and is not nearly finished), which will cover both P4 and P5 and is based on the Guide architecture, see Internal Documentation Update.

Location

The WWP internal markup documentation is part of the WWP website and therefore under Subversion control. The path on either golf or papa is:

/var/www/htdocs/WWP/encoding/documentation/markup/

In general, the best practice is to perform initial development on your local machine, committing changes frequently. Once you need to see how it looks on the web, either update golf after each commit from your machine in order to see your changes, or revise it directly on golf. Remember to commit to the repository from golf occasionally, too. Then, when you're happy, update it on papa.

How it works

  • Julia exports data from the WWP's documentation FileMaker database in the form of a single large file called records.xml
  • then we need to clean up and prepare records.xml before XSLT stylesheets will work on it:
    • change "ROW" -> "row"
    • eliminate a couple of line breaks in the first row element
    • take out the namespace declaration in the <FMPDSORESULT> element
    • change ␣ to  
    • replace @ signs with pointy brackets for (p), (eg), (lb)
    • deal with some non-XML bytes: examples are in 042 (where an apostrophe should be); 008 (double quotes); 060 (ellipsis); 144 (four ligatures: oe = 0153; OE = 0152; ae = 00E6; AE = 00C6); 092 (hyphen in a date range)
  • run split-records.xsl against records.xml to make a separate XML file out of each <row> element; store the split-out files in the splitRecords/ directory
  • run trans-records.xsl stylesheet against the files in splitRecords/ to transform them into display-ready XHTML; store the transformed files in the records/ directory.
  • run make-listall.xsl against records.xml to create listall.html
  • for index.html there has to be an alphabetically-sorted list of keywords in <option> elements for the drop down search menu. The way we get this list is rather a hack, but it works.
    • first, we run extract-keywords.xsl against records.xml to create keywordslist.xml. We make some global changes to separate multiple keyword items, put each keyword item on a separate line as the content of a <keyword> element, and delete unwanted whitespace, newlines, commas, and semicolons.
    • then we run sort-keywords.xsl against this cleaned-up keywordslist.xml to create sortedoptions.xml; we now have a list of <option> elements sorted alphabetically, but the list contains many duplicates so...
    • finally we run unique-options.xsl against sortedoptions.xml to create uniquedoptions.xml in which all the <option> elements are unique. This gives us a list that we can now cut and paste into the <form> in index.html.
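
The keyword steps above (separate the items, sort them, remove duplicates, wrap each in an <option> element) can be sketched in a few lines of Python. This is only an illustration of the logic, not the XSLT stylesheets themselves: the function names and the sample keyword fields are invented for the example, and the real <option> elements may carry attributes not shown here.

```python
import re
from xml.sax.saxutils import escape


def split_keywords(raw_fields):
    """Split raw keyword fields on commas, semicolons and newlines."""
    items = []
    for field in raw_fields:
        for piece in re.split(r"[,;\n]+", field):
            piece = " ".join(piece.split())  # trim and collapse whitespace
            if piece:
                items.append(piece)
    return items


def unique_sorted(items):
    """Sort alphabetically (ignoring case), then drop exact duplicates."""
    seen = set()
    result = []
    for item in sorted(items, key=lambda s: (s.lower(), s)):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def to_options(keywords):
    """Wrap each keyword in a bare <option> element (attributes omitted)."""
    return [f"<option>{escape(k)}</option>" for k in keywords]


# Hypothetical keyword fields, standing in for data taken from records.xml
raw = ["abbreviation; expansion", "expansion, line break", "  abbreviation\n"]
options = to_options(unique_sorted(split_keywords(raw)))
print("\n".join(options))
```

Doing the three steps in one pass would remove the need for the intermediate keywordslist.xml and sortedoptions.xml files, though the resulting list would still have to be pasted into index.html by hand.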

Fall 2007 Documentation Review

Below are summarized the findings of a review of both the internal WWP encoding documentation and the Guide for Scholarly Encoding, which was conducted in October 2007. Details of particular problems are not enumerated here out of consideration for space; for a more complete account or for further details, see Jacque.

Overall recommendation: The internal documentation is seriously flawed. The tools have limited usefulness because they either don't work or the amount of erroneous or irrelevant information is too high. While the Guide replicates some of these errors and produces others, its format is far more up to date than that of the internal documentation. The drawback to using the infrastructure of the Guide is the need to rewrite the prose so that it addresses the concerns of the WWP encoding community rather than the general public. I recommend either building a new internal guide or discovering ways to redirect a copy of the Guide for internal use without importing its errors. The determining factor seems to me to be the time needed to develop the structure of the internal documentation, given that much of the content will need to be updated, revised, or created in either scenario.

Summarized particulars

The search tool has limited usability. It appears to work with short terms such as 'a' or 'an' but does not work for the longer terms that would be used to search the documentation. The Guide search tool cannot handle white space, and this needs to be fixed. I am not sure whether the documentation search tool has a similar problem.

Currently the keyword function produces an unacceptable number of "false hits" which pull up material not relevant to the search term. This happens most frequently with abbreviated terms, like "ack", which pulls up all words in which that letter string appears, although it also happens with short full-word keywords.
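
The difference between matching a letter string anywhere and matching a whole word can be illustrated with a short Python sketch. How the actual keyword function is implemented is not documented here, so this is an assumption-laden illustration of the behaviour described, not a description of the tool; the function names and the sample keyword list are invented.

```python
import re


def substring_hits(term, keywords):
    """Loose matching: any keyword containing the letter string is a hit."""
    return [k for k in keywords if term.lower() in k.lower()]


def whole_word_hits(term, keywords):
    """Stricter matching: the term must appear as a complete word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [k for k in keywords if pattern.search(k)]


# Hypothetical keywords, chosen because they all contain the string "ack"
keywords = ["ack", "background", "acknowledgment", "brackets", "back matter"]

loose = substring_hits("ack", keywords)    # every keyword in the list
strict = whole_word_hits("ack", keywords)  # only the keyword "ack" itself
print(loose)
print(strict)
```

Whole-word matching would remove this class of false hit, at the cost of no longer finding a term inside a longer compound.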

There is significant duplication in the internal documentation keyword list. This includes duplication of singular and plural forms of a single term, past tense forms of terms, and irregular punctuation and capitalization. This problem is duplicated in the Guide and exacerbated by the inclusion of several misspellings as unique keywords. The keyword list is simply too long to be readily used, even once the errors have been addressed. I would support considering keyword clustering, and ensuring that keywords we do not actually use or have changed are updated in the keyword list. The keyword list should be regularized to prevent spelling errors and variations, and a style guide or protocol should be determined for future keyword development in order to avoid recreating this problem over time.
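
As a starting point for regularizing the list, variants that differ only in capitalization, punctuation, or spacing can be found mechanically. The Python sketch below groups such variants under a normalized form. The function names and sample keywords are invented for illustration, and it deliberately does not attempt singular/plural or tense variants or misspellings, which would need stemming or manual review.

```python
import string
from collections import defaultdict


def normalize(keyword):
    """Fold case, strip surrounding punctuation, collapse whitespace."""
    cleaned = keyword.strip().strip(string.punctuation).lower()
    return " ".join(cleaned.split())


def variant_groups(keywords):
    """Map each normalized form to its distinct spellings, if more than one."""
    groups = defaultdict(set)
    for k in keywords:
        groups[normalize(k)].add(k)
    return {key: sorted(v) for key, v in groups.items() if len(v) > 1}


# Hypothetical keyword list containing case, punctuation and spacing variants
sample = ["Abbreviation", "abbreviation", "abbreviation.",
          "line  break", "Line break", "epigraph"]

groups = variant_groups(sample)
for form, spellings in groups.items():
    print(form, "<-", spellings)
```

Each reported group is a candidate for collapsing into a single keyword once a preferred spelling has been agreed.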

Currently WWP-specific practices are not uniformly emphasized in the documentation. In some cases it is necessary to read through an entry to discover that we don't use a term or have changed it in some way. For the purposes of the internal documentation, current WWP practices should be addressed first in all instances.

All of the sidebar buttons except "WWP" are non-functional.

We would need to change the process-oriented sections of the Guide or eliminate them for the internal documentation. This might be a nice opportunity to gather all of our materials into a single space, including, for example, our workflow description and sample encoded pages.

Similarly, we could change the content of the "Magic" section to include WWP-specific shortcuts and other tricks.

In addition to the shared problems and those that were documented by W. Gui, the Guide has some particular issues that need to be addressed. In particular, the link to "Tech advise and magic" is broken, as is "Project Strategy", and the "Text Encoding Concepts" area has several broken links. W. Gui's summary of editorial errors is included in the attachment section.

Our bibliography area is currently sorely out of date. When new texts are added to bring it up to date, consider breaking the different sections into separate pages to make it easier to use.