Women Writers in Context publication system

From Digital Scholarship Group
Latest revision as of 14:28, 27 March 2024

Info: This page provides an overview of our Women Writers in Context (a.k.a. "Exhibits") series, including information about the publication system and workflow/procedures.


Encoding

Exhibits are encoded as XML documents using a custom schema (/var/www/html/dev/exhibits/schema/) created by Syd. For the most part it's basic TEI stuff, with a few customizations under the "wex" namespace (mostly to allow some of the metadata we wanted to store in the <teiHeader>). Exhibits are marked up more lightly than WWP textbase texts and you may find that some WWP textbase elements are not in the Exhibits schema; for example Exhibits encoding does not use <rs> for proper adjectives. Foreign-language words that should be italicized are encoded simply with <hi> and @rend of "italic".

Encoding guidelines follow standard TEI approaches fairly closely. The following sections list the specific features that are handled in ways that may vary from vanilla TEI.

Comparison with WWO encoding

If you are familiar with WWO encoding, you will need to switch gears to encode exhibits. For example, the values used on @rend are completely different. One of the largest changes is that WWiC has far fewer elements than WWO. This is a list of the elements most typically used in WWiC encoding:

  • Names and dates: <persName>, <placeName>, <name>, <orgName>, <date>
  • Structural encoding: <head>, <p>, <div>, <ab>, <argument>, <list>, <item>, <label>
  • Bibliographic encoding: <title>, <quote>, <q>, <bibl>, <cit>, <author>, <editor>, <ref>, <idno>, <listBibl>, <ptr>
  • Annotations: <note>, <seg>
  • Generic encoding: <lg>, <l>, <sp>, <speaker>
  • Renditional encoding: <hi>
  • Image encoding: <figure>, <figDesc>, <graphic>

Prefixes

For pointing to our 'ographies and contextual information, we use prefixes. These are: "ep:" for the exhibits personography, "ee:" for the exhibits eventography, and "ea:" for the exhibits editorial file (where keywords are defined). Here are examples of each:

<text ana="ea:introduction ea:poetry ea:coloniality ea:aesthetics">

<event wex:ref="ee:puritans.gp"/>

<persName ref="ep:jwoodbrid.cnd">John Woodbridge</persName>

Encoding of persons, events, and keywords is discussed below.

Figures

Generally, we try not to include photographs unless they are very compelling or there aren't any alternatives; images should be public domain and relevant to the exhibit. Try to make sure we have images of women wherever possible; even with the difficulty of finding images, we don't want to publish exhibits where the only images are those of men.

Figures are encoded using <figure>. Every figure should have the following three components:

  • The graphic itself, encoded with a <graphic url="assets/gfx/image_name.jpg"/> that points to the location of the image file.
  • A caption, encoded with <ab type="caption">.
  • A credit statement that names the person/institution that provided the image, encoded with <ab type="credit">. These often take the form of something like "The Trustees of the British Museum" or "Wikimedia Commons" or even just "Public domain". If possible, include a link to the original image source, e.g. <ref target="https://commons.wikimedia.org/wiki/File:Emerson_seated.jpg">Wikimedia Commons</ref>.

Figures can appear pretty much anywhere in the encoded document: inside of a <p>, between <p> elements, or within a special <div type="figGrp"> located between <div> elements.

Figures that are encoded within a <p> should typically be designated as either "inset-left" or "inset-right" using @rend on the <figure> element. For instance: <figure rend="inset-left">...</figure>. In the published output, these will appear as small thumbnail images inset within the paragraph.

Figures that are intended for display at larger sizes should be encoded between <p> elements.

If you have a group of figures that you would like to appear together, use <div type="figGrp"> and encode each individual <figure> within it.
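
As a rough sketch of the placements described above, the fragment below shows an inset figure inside a <p>, a larger figure between <p> elements, and a figure group. Only assets/gfx/image_name.jpg and the Wikimedia link come from this page; the other file names, the captions, and the paragraph text are placeholders invented for illustration, so check the schema for what is actually permitted.

```xml
<body>
  <div>
    <p>Paragraph text before the inset.
      <figure rend="inset-left">
        <graphic url="assets/gfx/image_name.jpg"/>
        <ab type="caption">Placeholder caption for an inset figure</ab>
        <ab type="credit"><ref target="https://commons.wikimedia.org/wiki/File:Emerson_seated.jpg">Wikimedia Commons</ref></ab>
      </figure>
      Paragraph text continues after the inset.</p>
    <!-- A figure intended for display at a larger size sits between paragraphs -->
    <figure>
      <graphic url="assets/gfx/placeholder_large.jpg"/>
      <ab type="caption">Placeholder caption for a larger figure</ab>
      <ab type="credit">Public domain</ab>
    </figure>
    <p>Next paragraph.</p>
  </div>
  <!-- A figure group sits between div elements, never inside a section -->
  <div type="figGrp">
    <figure>
      <graphic url="assets/gfx/placeholder_one.jpg"/>
      <ab type="caption">Placeholder caption, first grouped figure</ab>
      <ab type="credit">Public domain</ab>
    </figure>
    <figure>
      <graphic url="assets/gfx/placeholder_two.jpg"/>
      <ab type="caption">Placeholder caption, second grouped figure</ab>
      <ab type="credit">Public domain</ab>
    </figure>
  </div>
</body>
```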

See more on preparing and cataloging images below.

Background image

In the publication interface, every exhibit has a full-screen background image that appears behind the title/author/abstract information. This should be encoded directly in the <text> element (i.e., before the <body>) and it will have a @rend of "banner"; otherwise, banner images are encoded just as other figures.

In the fall of 2016, we changed the way we handle banner images, so you may see some published exhibits with a different approach to banners, encoding them as renditional information on the <text> element, using standard CSS. We plan to update these soon.

For best results, the image you choose to use for the background should be at least 1200px wide, and preferably up to 1600px. All of the background images created so far have been processed to be square and larger than 1200 x 1200. To avoid getting monstrously large file sizes, see the “Preparing images” section below.

Background images must have small-sized versions suitable for use as thumbnails in the Table of Contents. These should follow the naming conventions set out in the "Naming conventions" section under "Preparing images". For example, a background image named "sample-background.jpg" must have a smaller version named "sample-background_s.jpg".
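
A minimal sketch of the current banner encoding follows. It assumes the "banner" value goes on the @rend of <figure> (the text above does not name the element explicitly) and reuses the sample file name from the paragraph above. The keyword, caption, and credit wording are placeholders.

```xml
<text ana="ea:introduction">
  <!-- Banner figure: a child of text, placed before body -->
  <figure rend="banner">
    <graphic url="assets/gfx/sample-background.jpg"/>
    <ab type="caption">Placeholder caption for the banner image</ab>
    <ab type="credit">Public domain</ab>
  </figure>
  <body>
    <!-- exhibit content goes here -->
  </body>
</text>
```

The thumbnail (sample-background_s.jpg) is not referenced anywhere in the encoding; as noted under "Header Images", its URL is generated automatically, which is why the file name has to follow the convention.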

Events

Every exhibit can (and should, whenever possible) have a set of events associated with it. These are currently used to generate the exhibit timeline, and in the future they can be used to create other features (e.g. "find other exhibits that refer to this event", "show all exhibits that contain events between 1600 and 1650").

For purposes of the timeline, include any significant events included in the exhibit, as well as any relevant events such as the reigns of kings/queens, major political events during the time period, similar publications, or other publications by the same author.

Do not include birth and death dates for persons named in the exhibit in the events database; all personographic information should be handled in the Names database. To add birth and death dates to the timeline, use <listPerson> and <person> inside of <wex:contextDesc>. Prefix each person's key with "ep:" as in: <person wex:ref="ep:mfell.nep"/>.

FileMaker Database

For now, we're storing event information in a FileMaker database (WWP_Events). The various pieces of event-related information are fairly well documented in the database's input layout, but here's a quick overview as well.

Every event has the following required fields:

  • event_id: unique identifier in the form of a brief descriptive string (e.g. "fireoflondon") and a two-letter suffix that serves as a disambiguator
  • label: This is the brief label that is used to describe the event (in the timeline or, in the future, in other displays)
  • desc: The full prose description of the event. Use an underscore character (_) around titles that should be italicized in the published output (FileMaker's italics formatting does not survive export)
  • date: a single date or date range for the event, expressed in standard WWP FileMaker format.
  • type: a keyword/keyphrase used to categorize the event type.

In addition, the following optional fields may be used:

  • Wikipedia: the full URL of a Wikipedia page that describes the event in question
  • Google map: the full URL of a Google Map page displaying the event's location. Note: Make sure this does NOT use "https://" but rather plain old "http://". Our present timeline plugin doesn't appear capable of supporting encrypted HTTPS. Google defaults to HTTPS these days, though, so you may have to manually edit this when you paste in a URL.
  • Map caption: If you provide a link to a Google map, you may also provide a map caption. This is brief explanatory or descriptive text that identifies the map in some meaningful way.
  • Image: Link to an image that can be displayed with the event description. This might be a depiction of that event, a depiction of something else associated with the event, or an image derived from a visualization, etc. These images should also be added to the FileMaker Image database to keep track of their use and for ease of potential future use.
  • Image caption: The caption or brief description that should be associated with the image
  • Image credit: An image credit and/or rights information that must be displayed with the image. All images should have these, even if it only reads "Public domain" (even then, though, there will typically be a source of some sort: e.g., "Public domain; Library of Congress")
  • Quotation: A brief quotation that is related to the event in some way. This shouldn't be very long because the timeline layout has limited space to display it, especially at smaller window sizes. Something on the order of a couple lines of poetry, or a sentence.
  • Quotation credit: The reference or citation for the quotation
  • person_id: the WWP name key for any person who is directly connected to the event. This is a repeating field, so you can add as many as you like. Make sure the key actually corresponds to an existing record in the Names database (WWP_Names). At present, we're not acting on this information but we might in the future (e.g. to permit navigation to other exhibits that talk about the person in question).
  • notes: a general purpose field for any questions/problems about this event.

If the title of a work appears in the label or desc field, encode it with <title> for proper formatting when exported.

Eventography

Events are exported from the WWP_Events database in FileMaker’s “fmpxmlresult” XML format and then transformed to TEI using [WEB]/research/projects/exhibits/utils/events_database2listEvent.xslt.

The output file should be named “events.xml” and then stored in eXist (in /db/exhibits/xml/).

The @type attribute on <text> must have a value of "ography".

Persons

All significant persons mentioned in the text of an exhibit should have an @ref attribute on <persName> which points to the ID of the <person> as listed in persons.xml, prefixed with "ep:". For example: <persName ref="ep:mcavendis.neu">Margaret Cavendish’s</persName>. The first time a person is mentioned in a given exhibit this convention should be used. For subsequent mentions of their name, the @ref can be left off. As with standard WWP encoding, possessive endings should be encoded within the <persName> element.
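
To illustrate the first-mention rule, here is a short sketch that reuses the key from the example above. The surrounding sentence text is a placeholder.

```xml
<p>Placeholder sentence about <persName ref="ep:mcavendis.neu">Margaret Cavendish’s</persName> plays.
  <!-- Later mentions in the same exhibit may omit @ref -->
  Placeholder sentence in which <persName>Cavendish</persName> is named again.</p>
```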

FileMaker Database

The persons.xml file is exported from the personography FileMaker database, so each person mentioned in an exhibit should have an associated entry in the database. If there is no record for a person in the database already, create one, making sure that "exhibit" is marked in the Context field.

If an entry already exists, make sure that in the person_role field "exhibit" is already marked; if not, add it. If "exhibit" was not marked, this probably means that the content of the record needs to be edited before being made public. Check the "notes" field especially to make sure that all information is accurate and presented in a few short informative sentences.

Argument

The argument is a short abstract, a few sentences at most, which describes the subject of the exhibit. It is encoded within <argument> which appears at the beginning of the <body>. It is similar to the overview, but shorter; it appears within the banner image, rather than the body of the exhibit itself.
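
A minimal sketch of where the argument sits. Wrapping the abstract in a <p> inside <argument> is an assumption based on ordinary TEI practice rather than something this page specifies, and all text content is a placeholder.

```xml
<body>
  <argument>
    <p>Placeholder abstract: two or three sentences describing the subject of the exhibit.</p>
  </argument>
  <div>
    <head>Placeholder section heading</head>
    <p>Main text of the exhibit.</p>
  </div>
</body>
```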

Keywords

Keywords for the exhibit are encoded on the @ana attribute of <text>, prefixed with "ea:" (for example <text ana="ea:introduction ea:religion">). The @ana points to the ID of the keyword, as listed in editorial.xml. If adding multiple keywords, they are separated by spaces. A list of IDs for keywords can be found under distribution/editorial.xml.

If an exhibit contains a section such as "connected topics," which may be formatted as a list, this should be merged with the keywords as best as possible for the sake of consistency. This does not include sections such as "connected works/authors."

Pull Quotes

Many exhibits have pull quotes encoded: an especially quotable phrase or passage that is suitable for pulling out and highlighting in the display. These quotes are encoded using @type="pullquote" on <seg>. Currently (2017), we are not displaying pull quotes in the web interface, so you may encounter them in your encoding but you don't need to worry about either adding them yourself or removing them if they are already encoded.

Connected Texts

The format of connected texts sections varies depending on the author, and can be written as lists or as prose paragraphs. This section's header should be "Connections with Other Works" and should appear in its own <div> between the main text of the exhibit and the sources. This overall variation in format is not a problem, but each text should be internally consistent and readable. In particular, make sure that lists are structured consistently.

Regardless of the formatting, use normal encoding practices. If texts are mentioned which are in the WWO collection, a link to the text can be encoded using @ref on <title>. If authors or other important figures appear in the Connections section, they can also be glossed using the personography with a @ref on <persName>.

Notes

Notes are divided into two categories: "annotations" and "contextual information". The former is akin to a typical explanatory note or annotation, elaborating some particular point or briefly providing additional information related to a particular section of the main text. The latter is more amorphous, encompassing everything from a more complete quotation from a text mentioned only in passing in the main text to, potentially, a list of related documents or images, a biographical description of a person named in the text, or a set of links to other resources.

The distinction is important (because it has a dramatic effect on how the published content will be displayed), and must be captured in the encoded document as follows.

Because notes have different display requirements than the main text, verse that appears within a <note> (regardless of the @type value) should be encoded with the simplest possible combination of <lg> and <l>. In most cases, this will mean using a single <lg> element to enclose the whole verse selection, and internal <l> elements for the individual lines.

Annotations

"Annotations" (more or less equivalent to good old footnotes/endnotes) are encoded using <note>, following standard WWP encoding practice (e.g. every <note> has an @xml:id and a @target that points to the anchor in the main text; every anchoring element should be given an @xml:id and a @corresp that points back to the note).

Unlike typical WWP notes, however, annotations must be given a @type with the value "annotation". This is crucial, since without the correct value the note will not be processed and displayed properly.

Inside <note> all of the usual WWP content is permitted. In general, though, if you find that you're encoding a huge <lg> or a <figure> or something like that you should probably treat it as "contextual information" and encode it as described below.

Another difference from typical WWP encoding is that notes are not encoded in a separate division; rather, they should be in the same <p> they are pointing to. This is important because otherwise the notes will not display.
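
The pieces above might fit together as in the following sketch. The use of <seg> as the anchoring element and the ID values "a001" and "n001" are assumptions made for illustration; follow whatever anchoring element and ID pattern standard WWP practice prescribes.

```xml
<p>The main text makes <seg xml:id="a001" corresp="#n001">a claim that needs explaining</seg>,
  and then the paragraph continues.
  <!-- The note lives in the same paragraph as its anchor and must carry type="annotation" -->
  <note xml:id="n001" target="#a001" type="annotation">Placeholder text of the annotation.</note></p>
```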

Contextual Information

"Contextual information" is what we are calling notes that provide some more general context for something mentioned in the exhibit—in other words, for something that expands upon it or points readers to other topics/ideas/resources/texts in a more complete way than a typical footnote or endnote might.

It, too, is encoded as a <note> following standard WWP practice, but it is given a @type value of "context". It should also have a @subtype attribute, with one of the following values:

  • text: the content is primarily textual—a quotation, a series of prose paragraphs, etc.
  • media: the content is primarily some medium other than text—a group of <figure> elements, an embedded video, etc. (we don't have anything like this yet, and content of this sort will probably require some minor tweaking of the display CSS and JS—though nothing too major)
  • visualization: the content is a dynamic visualization—NOT an image or video, but rather something like a JavaScript visualization or HTML5 animation  (we definitely don't have anything along these lines yet, and some more major work would need to be done in figuring this out)

While contextual notes can be anchored to paragraphs, it is best to anchor these notes to a short phrase—such as the title of a sonnet excerpted in a note. These anchoring phrases should not contain <persName>s with personography data.
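
A hedged sketch of a textual contextual note anchored to a short phrase, with verse encoded as simply as possible. As with annotations, the <seg> anchor, the IDs, and the placement of the note inside the same <p> are assumptions, and the title and verse lines are placeholders.

```xml
<p>She returns to this theme in
  <seg xml:id="a002" corresp="#n002">Placeholder Sonnet Title</seg>,
  written some years later.
  <note xml:id="n002" target="#a002" type="context" subtype="text">
    <!-- One lg for the whole excerpt, one l per line -->
    <lg>
      <l>Placeholder first line of the excerpt,</l>
      <l>Placeholder second line of the excerpt.</l>
    </lg>
  </note></p>
```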

Quotations

Quotations are encoded with the standard <quote> element.

Unlike standard WWP texts, we don't capture much internal structure within <quote>. If you are encoding, for instance, two lines of poetry as an inline quote, use a solidus character ("/") to separate the lines following standard prose conventions.

Personal names, place names, etc. should still be encoded at the phrase level as usual.

Inline quotations

Brief inline quotations from other texts should just be encoded with <quote>. Quotation marks will be generated by the publication system, so they should be omitted in the encoded file (don't type them in directly and don't capture them as renditional information either).

If a quotation is from one of the works in the bibliography, you can add a pointer to the bibliography entry using @corresp: e.g., <quote corresp="#bibl_beilin_1987">...</quote>.
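
For instance, two lines of verse quoted inline could be encoded roughly as follows. The quoted words and the surrounding sentence are placeholders, and the pointer reuses a bibliography ID that appears elsewhere on this page purely as an illustration.

```xml
<!-- No quotation marks are typed; the publication system generates them -->
<p>The speaker insists that <quote corresp="#bibl_cavendish_1668">placeholder first line of verse / placeholder second line of verse</quote>
  before turning to a new subject.</p>
```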

Block quotations

Block quotations (those of four or more lines) are encoded with <quote type="block">. If you are quoting drama, or poetry of four or more lines (poetry of three or fewer lines can be encoded as described above with a solidus character), you can include <l>, <sp>, and <speaker> so that these will display appropriately. If the quotation has a citation, you should wrap the whole thing in <cit> as below:

<cit><quote type="block"><sp><speaker>Jane</speaker><p>I believe Usurers and Lawyers may be very rich, for the Civil War hath made those sorts of Men like as Vultures, after a Battel, that feed on the Dead, or dying Corps.</p></sp></quote><bibl>(<ref target="#bibl_cavendish_1668">II.iv.</ref>)</bibl></cit>

Verse

Citations and bibliographies

Inline citations

Inline citations should be in standard MLA parenthetical format. The parentheses should be transcribed as literal characters. The citation itself (inside the parentheses) is encoded using <ref> and given a @target whose value is a pointer to the bibliographic entry for the work being cited (see the next subsection for our encoding of bibliographies).

There shouldn't be any internal encoding within <ref>, even for personal names.
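
A small sketch of an inline citation under these rules. The page number, the quoted phrase, and the sentence are placeholders; the target reuses the ID from the bibliographic example in the next subsection.

```xml
<!-- Parentheses are literal characters; the name inside ref gets no persName -->
<p>Placeholder sentence ending in <quote corresp="#bibl_keeble_1987">placeholder quoted phrase</quote>
  (<ref target="#bibl_keeble_1987">Keeble 25</ref>).</p>
```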

Bibliographies

Bibliographies and works cited are encoded minimally. Usually there should be an enclosing <div>, which will typically be at the end of the document. It contains a <listBibl> element, whose children are one or more <bibl> elements. Each <bibl> should be assigned a unique @xml:id (in the form of "authorLastName_publicationDate"), to be used as a target by inline citations in the exhibit body.

Within <bibl>, markup should be fairly light: <author>, <title>, and <date>. Where needed, <editor> or <ptr> (the latter for standalone links) may be used. The last element in the <bibl> should be the work's OCLC number, encoded with <idno type="OCLC">. The OCLC number for a work can be found in its WorldCat entry under “Details.”

If a work in the bibliography is also in the WWO textbase, a link to it can be provided by including an @ref on the <title> of the work which points to the short url of the work in the WWO.

If possible, the bibliography can be separated into two sections, “Primary Sources” and “Secondary Sources”. In exhibits where there is no clear distinction, or only one or the other type, simply title it “Sources”. The Sources section should consist of works treated as sources; you don't need to add a work to this section if it's being named in a different way (for example, if there is a list of editions or translations, particularly if those already contain full bibliographic details). We only need a Sources section if there are any sources being cited in the text.

If you can find the correct OCLC number for primary sources, even those that are linked to the textbase as well, please include it.

In some cases, you may encounter links to the Feminist Companion; we no longer have access to this text so just comment out any citations to it, in case we can recover it in the future.

A fairly standard bibliographic example follows. Note that when an author's name ends with a period (for example, after an initial), the period goes outside the <author> element, since it functions primarily as part of the citation's punctuation rather than as part of the name; this also ensures that spacing displays correctly in the output.

            <bibl xml:id="bibl_keeble_1987">
                 <author>Keeble, N. H</author>. <title rend="italic">The Literary Culture of Non-Conformity in Later
                 Seventeenth-Century England</title>. <placeName>Athens, GA</placeName>: <orgName>U of Georgia Press</orgName>,
                 <date when="1987">1987</date>. <idno type="OCLC">15164788</idno>
            </bibl>

@type on <div>

For the most part we aren't using @type on any <div> elements. The one exception at present is when multiple figures are grouped together as a single section in the main exhibit. In this case, you can assign a @type of "figGrp" to the <div>, and then encode as many individual <figure> elements within it as you'd like.

One limitation to this approach is that it requires all figure groups to appear between other <div> elements (well, before or after them)—you can't have a figure group in the middle of a section. That doesn't seem like a huge problem to me at the moment, but if it becomes an issue then we'll probably need to change the encoding.

"See Also" section

For the Sources and Connected Works sections, as well as anything else that would be the "back matter" of the exhibit, enclose all of these individual <div>s in a separate <div> and give it an @xml:id of "see-also", e.g. <div xml:id="see-also">.
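
The back matter might be laid out along these lines. This is only a sketch: the use of <head> for the section titles and the prose content of the Connections section are assumptions, while the <bibl> is the example from the Bibliographies section.

```xml
<div xml:id="see-also">
  <!-- Connected texts come before the sources -->
  <div>
    <head>Connections with Other Works</head>
    <p>Placeholder discussion of connected texts.</p>
  </div>
  <div>
    <head>Sources</head>
    <listBibl>
      <bibl xml:id="bibl_keeble_1987">
        <author>Keeble, N. H</author>. <title rend="italic">The Literary Culture of Non-Conformity in Later Seventeenth-Century England</title>. <placeName>Athens, GA</placeName>: <orgName>U of Georgia Press</orgName>, <date when="1987">1987</date>. <idno type="OCLC">15164788</idno>
      </bibl>
    </listBibl>
  </div>
</div>
```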

Metadata

Metadata in the <teiHeader> is fairly lightweight: author's name, title, publication date.

For republished RWO files, add a @type of "rwo" to the initial publication date. The date for the WWiC publication does not need @type. For example:

<date type="rwo" when="1999-09">September, 1999</date>

<date when="2016-09">September, 2016</date>

Events that are related to the exhibit, and that should be displayed in the exhibit's timeline, are encoded in a special <wex:contextDesc> element inside the <profileDesc>. Within <wex:contextDesc>, a <listEvent> element may contain one or more <event> elements. The @wex:ref attribute for each <event> must point to a single event in the events.xml file and should have the prefix "ee:" as in, for example, <event wex:ref="ee:quakers.ui"/>. Persons whose birth and death dates are relevant for the exhibit are encoded similarly to events, but with <listPerson> and <person>; use the person's key as it is recorded in the Names database and add an "ep:" prefix. For example: <person wex:ref="ep:mfell.nep"/>.
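
Put together, the timeline-related part of the header could look roughly like this. The keys are the ones used as examples on this page. The relative order of <listEvent> and <listPerson> is an assumption, and the "wex" namespace declaration is omitted because its URI is not given here; in a real file it must be declared on an ancestor element.

```xml
<profileDesc>
  <wex:contextDesc>
    <!-- Events shown in the exhibit timeline, each pointing into events.xml -->
    <listEvent>
      <event wex:ref="ee:quakers.ui"/>
      <event wex:ref="ee:puritans.gp"/>
    </listEvent>
    <!-- Persons whose birth and death dates should appear in the timeline -->
    <listPerson>
      <person wex:ref="ep:mfell.nep"/>
    </listPerson>
  </wex:contextDesc>
</profileDesc>
```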

Basic Encoding Checklist

Below is a checklist that includes the basic sections and tasks for encoding an exhibit.

  • Author, title, and publication information
  • Event list
    • Extract events from exhibit content
    • Research events
    • Add new events to FileMaker event database
    • Add events to the encoded exhibit file
  • Keywords
    • Add new keywords to editorial.xml file
    • All introductions should have the "introduction" keyword
  • Write an argument
  • Review overall organization to make sure the document is consistent with published exhibits
    • Make sure the exhibits are titled "Introduction to [First Name Last Name]'s [Title]"
    • The main sections of introductions (often called "Introduction") should be changed to "[Title] in Context"
  • Personography
    • Generate a list of personography entries that need review or creation
    • Create new keys as needed
    • Check personography entries and edit if necessary
    • Add any new individuals to personography
    • Add keys to TEI files with @ref
  • Images
    • Background image
    • Figures in body
    • Add all images to assets/gfx folder
  • Sources
    • Split into primary and secondary if possible
    • Link titles back to WWO if possible
    • Make sure in-text citations are added and properly formatted
    • Make sure that in-text citations are linking to the Sources section
  • Connected texts
    • Link titles back to WWO if possible
  • Proofing steps
    • Spot-check encoding for pervasive issues
    • Publish to wwp-test
    • Full proof of published versions

Basic Revision Checklist

Below is a checklist for preparing previously-published RWO files for publication. In editing these previously-published exhibits, we can go ahead and fix any outright errors. For small infelicities of language, we tend not to worry unless they make the exhibits difficult to understand. Quotations from WWO, or any early text, do not need to be regularized; if something seems as if it might be an error, check the original. Any more major changes to the language of a text should be brought to an exhibits meeting and may need to be confirmed with the author.

  • Encoding
    • Rename file, if needed, and update SVN
    • Review and correct encoding; analyze needs for rehabilitation and make needed fixes
    • Make sure that the overall organization is consistent with published exhibits
    • Review TEI header and metadata
    • Write an argument, if needed
    • Identify and encode pull quotes
    • Perform final review of encoding when other steps are completed
  • Sources
    • Split into primary and secondary if possible
    • Link titles back to WWO if possible
    • Add xml:ids and OCLC info to secondary sources
    • Make sure in-text citations are added and properly formatted
    • Make sure that citations are linked within the text
  • Images
    • Locate images for background and body
    • Resize
    • Add all images to assets/gfx folder
    • Encode
  • Keywords
    • If there are "connected topics," etc, merge with keywords
    • Add new keywords to editorial.xml file
    • Check and populate list of keywords
    • Make sure that intros are tagged with "introduction"
  • Personography
    • Generate list of needed personography revision and entry creation for persons mentioned in the exhibit
    • Create new keys as needed
    • Check personography entries and edit if necessary in FMP
    • Add any new individuals to names database in FMP
    • Add keys to TEI files
  • Connected texts
    • The Connected texts section should be internally consistent, but there is some variation between how these were set up; try to make sure it's readable but don't worry if some are more list-like than others
  • Proofing steps
    • Spot-check encoding for pervasive issues
    • Publish to wwp-test
    • Full proof of published versions

Preparing images

There are several ways in which images are used in exhibits, and slightly different approaches may need to be taken to preparing images for the web depending on their intended use within the exhibit. These are outlined below.

Header Images

The full-page header image that appears as a background at the top of all exhibits needs to be selected with care, to ensure that it is sufficiently large to scale well at larger screen sizes. As a general rule, the original should be at least 1400 x 1400 pixels, with 1600px per side being ideal. In a few cases, I have used images as small as 1200px in both directions, though this really isn't optimal once the image starts scaling up.

Background images must have small-sized versions suitable for use as thumbnails in the Table of Contents. Because the URL to the thumbnail is automatically generated, it is important that the header images follow the naming conventions below.

Body images

Images used in the body of an exhibit are a little more forgiving when it comes to resolution. At most screen sizes, images used in the body of an exhibit do not need to exceed 800px in either direction. That said, larger images are good to begin with because they are more forgiving of cropping/downsizing for the web.

Body images are used in three ways:

  • small insets embedded within a paragraph
  • full-width images that are as wide as the surrounding text column (e.g. the same width as a prose paragraph)
  • tile images that are used in the contents listing on the main page

Resizing for the Web

Whenever possible, it is a good idea to prepare all body images in three sizes (header images will also need to be prepared at a larger size, preferably 1600 x 1600, as noted above, though we can use images that only go up to 800px):

  • large: 1200px on the longest side
  • medium: 800px on the longest side
  • small: 400px on the longest side

Preparing images in three different sizes provides the publication interface with greater flexibility, and ensures that images continue to look reasonably good at a variety of screen sizes and display resolutions.

In Photoshop (or the image manipulation program of your choice), you will need to resize the image to match each of the three sizes above. Make sure to choose a sampling option that will yield the best results for downsampling if you are, in fact, reducing the size of an image.

Preview can also be used to resize images; under the "Tools" menu, select "Adjust size" and then make sure that you're resizing by pixels.

If you find yourself having to increase the size of an image, it is probably a sign that you need to find a larger image to begin with. If there is no other alternative, you can increase the size of the image judiciously, but bear in mind that increases above about 10% will start to look terrible, especially for users with high-resolution screens.

In most cases, images should be saved in JPEG format. For most images that we're working with, a fair amount of image compression can be applied (the JPEG quality level can be set to "low" or < 30%, depending on what application you're using for image processing). Ideally, the following targets should be kept in mind when creating images at various sizes:

image size/format    recommended max. file size
large                < 150 KB
medium               < 75 KB
small                < 40 KB

Obviously these are not hard and fast rules, but rather general suggestions that try to balance image quality against file size.
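If you want to check a batch of files against these targets, a small helper along these lines may be useful. It is a sketch under two assumptions that are not stated in this documentation: that 1 KB means 1024 bytes, and that the size name ("large", "medium", "small") is passed in by the caller. The function name is hypothetical.

```javascript
// Suggested maximum file sizes in KB, from the table above.
const MAX_KB = { large: 150, medium: 75, small: 40 };

// Compare a file size in bytes against the suggested target for its size
// class. Assumes 1 KB = 1024 bytes (an assumption, not stated in the source).
function checkFileSize(sizeName, bytes) {
  const limitKB = MAX_KB[sizeName];
  if (limitKB === undefined) {
    throw new Error("Unknown size name: " + sizeName);
  }
  const kb = bytes / 1024;
  return {
    kb: Math.round(kb * 10) / 10, // rounded to one decimal place for display
    limitKB: limitKB,
    withinTarget: kb < limitKB
  };
}

console.log(checkFileSize("medium", 70 * 1024));
```

Since the targets are suggestions rather than rules, a result with withinTarget set to false is a prompt to try a lower quality setting, not a hard failure.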

Naming conventions

To aid the publication system in dynamically choosing the correct image size for different layouts, the following file-naming convention exists and should be used for all images:

image size/format    filename          maximum length on longest side
large                filename.jpg      1200px
medium               filename_m.jpg    800px
small                filename_s.jpg    400px

In addition, don't use filenames with spaces in them. Links with whitespace characters can and will break once published.
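The naming convention can be expressed as a short function. This is a minimal sketch, not the code the publication system actually uses: the function name and the example base name are hypothetical. It builds the three filenames from a base name and rejects any base name containing whitespace.

```javascript
// Suffix for each size, following the naming convention above.
const SUFFIXES = { large: "", medium: "_m", small: "_s" };

// Build the three expected filenames for a base name (without extension).
// Hypothetical helper for illustration only.
function variantNames(baseName) {
  if (baseName.length === 0 || /\s/.test(baseName)) {
    throw new Error("Base name must be non-empty and contain no whitespace: '" + baseName + "'");
  }
  const names = {};
  for (const [size, suffix] of Object.entries(SUFFIXES)) {
    names[size] = baseName + suffix + ".jpg";
  }
  return names;
}

// "example-portrait" is a made-up base name.
console.log(variantNames("example-portrait"));
```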

Location of processed images

Processed images that are ready for web display should be placed in the assets/gfx/ directory. Images in this directory are under Subversion control.


Publication system

Client-Side Code (HTML/CSS/JavaScript)

HTML

HTML for Women Writers in Context is extremely simple, consisting of a single wrapper file (index.html). This provides a basic container for wrapping page content, which is inserted via JavaScript templates (as part of a Backbone web app) and, at times, HTML content generated by eXist and inserted into the page structure by an AJAX call.

CSS

Two CSS files are required, both referenced in the <head> element of index.html.

  1. timelinejs.css controls the appearance of the TimelineJS plugin component, used for generating timelines of events within individual exhibits. It comes with the TimelineJS plugin, and hasn't been modified in any significant way.
  2. wex.main.css is the main CSS. It is compiled from a LESS file. This defines all the styles needed for the exhibits interface.

JavaScript

Most of the client-side work for the exhibits page/web application is performed by JavaScript. The interface is written as a single-page Backbone (http://backbonejs.org) application.

Several external JS frameworks are used:

  • jQuery --- cross-browser DOM manipulation, event binding, and animations
  • Underscore.js --- utility library, required by Backbone
  • Backbone.js --- the Backbone framework
  • Modernizr.js --- an external framework used for feature detection (we're not really using most of its capabilities, but it's used to add feature-based class attributes to the <html> element)
  • Bootstrap --- Twitter-developed framework; we're using a custom build that includes only a few features (specifically, the tooltip)
  • TimelineJS --- Timeline plugin

The JS code is organized into a number of separate files:

  • wex.config.js --- a configuration file that defines a number of settings used by the application
  • wex.main.js --- includes a number of CodeKit import statements, for grabbing all the other files, and defines the document ready function that initializes the application when the page loads
  • wex.models.js --- defines all of the Backbone models used in the application
  • wex.ns.js --- defines an application "namespace" ("wex")
  • wex.router.js --- defines a single Backbone router for the application
  • wex.templates.js --- defines all of the Underscore templates used to generate HTML fragments from JSON models
  • wex.ui.js --- defines all of the Backbone views
  • wex.xtend.js --- extends the Backbone view prototype and defines a couple of additional functions used throughout the application

Minification

These files are merged into a single minified file, wex.main-ck.js, for deployment. I've been using CodeKit to handle the build and minification process, but there are other methods of doing this as well (see below).

If at some future point you decide not to merge and minify (such a decision would reduce overall page performance, I should note, but there could conceivably be a reason to do this down the road), you would just need to update the main index.html file to include pointers to each of these individual files.

If you choose to deploy unmerged JavaScript, file load order is very important:

  • jQuery
  • Underscore
  • Backbone
  • Bootstrap tooltip
  • TimelineJS
  • wex.xtend.js
  • wex.ns.js
  • wex.config.js
  • wex.templates.js
  • wex.models.js
  • wex.ui.js
  • wex.router.js
  • wex.main.js

If you wish to continue with the current compression/minification scheme, there are two options:

1. You can purchase a $25 WWP license for CodeKit, and use it to manage that process for you (it can be configured to automatically minify all script files in a given project, and you can also teach it about project framework files, such as the JS libraries we use). I like this method, and have been using it for a while, but I can also see the merits of avoiding specialized software.

2. You can manage the process manually: this would involve manually concatenating the necessary JS files (in the correct order), either in a text editor or on the command line, and then compressing them using any one of many free JS compression tools. I have installed the command-line YUI Compressor utility on Teller for this purpose (/opt/local/bin/yuicompressor-2.4.7.jar). Typically, to use it you'll type something like this:

java -jar yuicompressor-2.4.7.jar [options] [input file]

The full set of options can be found at the YUI Compressor site, in the section titled "Using the YUI Compressor from the command line".
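For the manual route, the concatenation step could be scripted rather than done by hand. The following Node sketch is one possible approach, not an existing part of the build: the function name and the output filename are hypothetical, and because this page does not give the filenames of the library files (jQuery, Underscore, Backbone, Bootstrap tooltip, TimelineJS), they are passed in as a parameter, in load order. The wex.* filenames and their order come from the load-order list above.

```javascript
const fs = require("fs");
const path = require("path");

// Application files, in the load order given above.
const APP_FILES = [
  "wex.xtend.js", "wex.ns.js", "wex.config.js", "wex.templates.js",
  "wex.models.js", "wex.ui.js", "wex.router.js", "wex.main.js"
];

// Concatenate the library files (supplied by the caller, already in load
// order) followed by the application files, and write the result to
// outputFile inside dir. Returns the list of files in the order used.
function concatInOrder(dir, libraryFiles, outputFile) {
  const ordered = libraryFiles.concat(APP_FILES);
  const parts = ordered.map(function (name) {
    return fs.readFileSync(path.join(dir, name), "utf8");
  });
  // Separate files with ";" so that a file lacking a trailing semicolon
  // cannot run into the start of the next one.
  fs.writeFileSync(path.join(dir, outputFile), parts.join("\n;\n"));
  return ordered;
}

// Example call (the paths and library filenames here are placeholders):
// concatInOrder("assets/js", ["jquery.js", "underscore.js"], "wex.combined.js");
```

The combined file can then be given to YUI Compressor as the input file, using its -o option to name the minified output.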

Server-Side Code (XQuery/XSLT)

eXist

All of the server-side code and data for the Exhibits publication system is stored in eXist itself. It is relatively modular, in that most of the individual XQuery files generate small JSON structures used to produce specific portions of a page (e.g. the "related exhibits" field, or information about an individual). Major changes to the behavior and appearance of the publication interface will typically require changes to the client-side code.

XML

The exhibit XML files are stored in the /exhibits/xml/ collection. The personography (persons.xml) and eventography (events.xml) are also stored in this collection.

All the XML content is also stored on disk, under Subversion control. Changes made to the XML will not automatically be reflected in the publication interface; for changes to become visible, the updated XML must be loaded into eXist. The documentation page for WWP Internal Markup Documentation (new) describes in detail the process of loading content into eXist using the Java Admin Client.

XSLT

A single XSLT file, fulltext.xsl, is used to generate the full text of an exhibit, as an HTML fragment. This is then delivered to the client and inserted into the page wrapper via AJAX. The XSL transformation is itself kicked off by an XQuery file, fulltext.xquery (see below). The fulltext.xsl file is stored in eXist, in the /exhibits/xsl/ collection.

(Note: I'm not an XSLT expert, and most of my knowledge comes from playing around and/or looking at bits of the old WWP P4 publication pipeline. In other words, I'm certain this could be cleaned up and written more efficiently, and intelligently.)

XQuery

The bulk of the work in eXist is performed by several XQuery files, all of which are stored in the database in the /exhibits/xquery/ collection. Specifically:

  • toc.xquery --- generates a JSON file that contains a full listing of all the exhibits in the collection, plus some minimal metadata (author, title, keywords, date of publication) that can be used for sorting or filtering
  • people.xquery --- generates a JSON representation of all the WWP personography entries for a given exhibit, used in the exhibits publication interface for displaying information about a person
  • related.xquery --- generates a JSON representation of other exhibits that name a given person
  • events.xquery --- generates a JSON representation of all the events referenced within a specific exhibit, used in the publication interface to generate the interactive timeline