Getting started with encoding

From Digital Scholarship Group
Revision as of 12:39, 25 May 2015 by Sconnell

Helpful Links

The Encoding Process

  • Document Analysis should be the first thing you do
  • Encoding Steps
    • Update your working directory in Subversion:
      • launch Oxygen
      • in the Tools menu, choose “SVN Client”
      • Control-click on the WWPtextbase/ directory and choose “Update” from the menu that pops up
      • It should either say “no incoming changes” or “Operation successful”.
    • Find your tadpole
      • navigate to your ~/Documents/WWPtextbase/ folder
      • in the under_construction/ directory, locate your tadpole
      • open it in Oxygen (if your system is set up to open XML files in Oxygen, you can just double-click the file)
    • Encode the basic document structure:
      • if your text doesn’t have any front or back matter, delete <front> and <back>
      • enter the <div>s and other major structural elements inside the <front>, <body>, and <back> elements as appropriate
    • Start transcribing
      • omit the title page and table of contents for now
      • enter paragraphs and container elements like <quote> or <lg> before typing in their contents
      • fix any validation errors as soon as you see them
      • save often
    • Commit your changes
      • at the end of your session, go back into the SVN client
      • navigate to your tadpole file
      • control-click on it and choose “Refresh” (⌘-R); you should see a little star appear next to your file
      • control-click on your file again and choose “Commit” (⌘-M); enter a commit message that includes at least your personal key, and ideally a useful description of what you changed, and then confirm the dialog box.
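
The “Encode the basic document structure” and “Start transcribing” steps above can be pictured with a minimal sketch like the one below. It is illustrative only: it assumes that the major structural elements are <div>s, and the nesting shown is generic TEI rather than a copy of an actual tadpole, so the elements and attributes your own file needs may differ.

```xml
<!-- Illustrative skeleton only; not the contents of a real tadpole file. -->
<text>
  <front>
    <!-- delete <front> entirely if the text has no front matter -->
    <div>
      <p></p>
    </div>
  </front>
  <body>
    <div>
      <!-- containers are entered first; their contents are typed in afterwards -->
      <p></p>
      <lg>
        <l></l>
        <l></l>
      </lg>
    </div>
  </body>
  <back>
    <!-- delete <back> entirely if the text has no back matter -->
    <div>
      <p></p>
    </div>
  </back>
</text>
```

Entering the empty containers first keeps every start-tag paired with its end-tag, so the file stays well-formed while you type in the contents.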

Encoding Tips

  • Thinking about Encoding
    • Know that encoding can get very detailed: you’re marking up not just the basic organization of your text, but also a lot of other phenomena; you can expect that there will be some things you didn’t know you needed to encode that you will have to add in once you learn about them
    • Think about what you’re doing as building a set of layers, rather than writing a stream; XML is really a set of enclosures, not a linear flow of details—and know that getting your brain wrapped around this way of thinking can be challenging!
    • Don’t be afraid to say “I have no idea what I’m doing”
    • Map out the book you’re working on before you get started; revisit document analysis as needed
    • Keep an eye out for the tendency to miss spaces around elements
    • Pay attention to the ways that punctuation should be used around elements; know that it might feel weird/different from non-encoded texts you’re used to working with
    • Try to get some genre variety as you’re starting to encode; don’t be stuck in prose forever
  • Best Practices
    • Read through the documentation once you feel like you have your feet under you; rather than treating the internal documentation as something you consult only when you have a problem, read it to see what’s there and to build on your knowledge
    • Look at other people’s texts as well, and think about why they made the decisions that they made
    • Validate continually (⌘-shift-V); it is OK to check in an invalid file but problematic to check in an ill-formed one, so check well-formedness (⌘-shift-W) before you commit [That said, if you really have to leave before you can fix an ill-formedness error, go ahead and check it in: we try to avoid checking in ill-formed files, but leaving a file not checked in at all is usually worse.]
    • Don’t forget to fill out your change logs for major milestones in encoding or proofing
    • A tidy left margin makes a file much easier to read; the convention is to indent further to the right as you go deeper into the hierarchy
  • Process Suggestions
    • Skip the title pages if it’s your first text. You can do those later, when you’re more comfortable with encoding.
    • It’s helpful to paste in a bunch of <lb>s at once and then arrow down for each new line
    • In the same way, you can fill in groups of <l>s with poetry
    • Make yourself a template for page breaks and then copy-paste that in as you need it; it’s often easiest to set yourself up with a bunch of page breaks at once, rather than interrupting your encoding. Your template might look something like this:
<mw type="sig"></mw>
<mw type="catch"></mw>
<pb/>
<milestone unit="sig" n=""/>
<mw type="pageNum"></mw>
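
The tips above about indentation and about pasting in groups of <lb>s and <l>s can be sketched as follows. This is a hypothetical fragment, not taken from an encoded text: the number of lines and the bare <div> are placeholders, and current WWP practice may call for attributes or elements that are not shown here.

```xml
<!-- Hypothetical fragment: each level of the hierarchy is indented one step further right. -->
<div>
  <p>
    <!-- a batch of line breaks pasted in at once; transcribe one printed line at each -->
    <lb/>
    <lb/>
    <lb/>
  </p>
  <lg>
    <!-- a batch of empty verse lines pasted in at once, ready to be filled in -->
    <l></l>
    <l></l>
    <l></l>
    <l></l>
  </lg>
</div>
```

A fragment like this is well-formed because every element is closed and properly nested. Whether it is also valid depends on the schema, which is why validation and well-formedness are separate checks.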

What to do if you don't know what to do

  • Look for error messages. For tips on how to read error messages, see: this guide
  • Check the internal documentation
  • Search in files to find example documents that have similar situations (knowing that it’s best to look at recently-encoded files, since some practices have changed). Start with the list of texts that were added since the WWP came to Northeastern
  • Put your problem in a comment so you’re not stuck if you can’t figure out what to do. To surround with a comment, select the text you want and hit: ⌘-shift-, (command-shift-comma)
  • Bring your question to a meeting
  • Ask your mentor
  • Check the DSGTAG-L listserv (http://listserv.neu.edu/cgi-bin/wa?INDEX) to see if your question has been discussed in the past
  • Know that you can email people with questions; you don’t have to wait to grab someone in person
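
As a rough illustration of the “put your problem in a comment” suggestion above, a commented-out trouble spot might look something like this. The transcribed text and the wording of the note are invented for the example.

```xml
<p>The first part of the paragraph is transcribed as usual,
  <!-- Not sure how to encode this printed marginal note; bringing it to the next meeting.
       Summer, 1652.
  -->
  and the paragraph continues after the problem passage.</p>
```

Everything between the comment delimiters is ignored by the parser, so the file stays well-formed while the question is open. The one restriction is that a comment may not contain two hyphens in a row.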