WWO Publishing

From Digital Scholarship Group

Testing

Before publishing it is easiest to move all files you want to publish into one directory (typically distribution/), but more honest to leave the newest ones in on_deck until you have actually published. The process is the same either way; leaving files in on_deck just means you have to perform some of the steps twice. Either way, remember to update the <date>s in the <publicationStmt>s, as they are retained in WWO (even if not easily searchable).

And remember to remove any files you don’t want published from the directories you are playing with. (Only files of the glob format “*.*.xml” will be processed.)
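A quick way to see which files would be picked up is to list the directory against that glob before running anything. The following is only a sketch: the function name list_publishable is made up for illustration, and it assumes that “*.*.xml” means the ordinary shell glob (at least one “.” before the final “.xml”).

```shell
#!/usr/bin/env bash
# Mark each .xml file in a directory as "process" (matches *.*.xml)
# or "skip" (any other .xml name). Non-.xml files are not listed.
# Assumption: the publishing Makefile uses the plain shell glob *.*.xml.
list_publishable() {
  local dir=${1:-.} f name
  for f in "$dir"/*.xml; do
    [ -e "$f" ] || continue          # directory holds no .xml files
    name=${f##*/}                    # strip the directory part
    case $name in
      *.*.xml) printf 'process  %s\n' "$name" ;;
      *)       printf 'skip     %s\n' "$name" ;;
    esac
  done
}
```

Calling list_publishable /path/to/source/dir/probably/distribution/ prints one line per XML file; anything marked “skip” would be ignored, and anything marked “process” that you do not want published should be moved out first.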

  1. $ cd /path/to/source/dir/probably/distribution/
  2. $ make wf     # Test well-formedness. You don’t really have to do this, as it is performed by step 1 of “Process Data”, below; but finding errors earlier is always better. All files must pass, except that in some cases duplicate IDs might be tolerable. Tolerating them is still a very bad idea, so fix them. (A processing instruction with a target that starts with “xml” (case insensitive) is probably OK, too, even if the well-formedness checker objects.)
  3. Issue either make validate OR make both — they do the same thing, just in different order. (“validate” tests every file against RELAX NG, then tests every file against Schematron; “both” tests each file first against RELAX NG and then against Schematron.)

There may be lots of validity errors, of course. Figuring out which ones are important and which should be ignored is left as an exercise to the reader.

Process Data

The data we generate (e.g., in distribution/) is the canonical data we want to store. (Hence the name of the schema: “wwp-store”.) It is not quite ready to be published by XTF in that state. For example, XIncludes have to be processed and the <charDecl> has to be deleted (due to a Java bug, but we don’t use it in XTF, anyway). Pretty much one call to make does all this, but it can be a pain to set up the directories properly for it.

  1. $ cd /path/to/source/dir/probably/distribution/     # you might already be there, of course
  2. $ make exist     # This may take a while. Currently ~15 mins on albus. N.B. that you will likely have to specify some paths to needed directories on the commandline; issue just make for further information. The output goes to /tmp/INDIR_eXist_TIMESTAMP (where INDIR is the name of the current directory) by default, but this is easily changed by setting OUTDIR= on the commandline; e.g. make OUTDIR=/var/lib/tomcat6/webapps/WWO/data/wwp exist. Note that if you are in a directory named distribution/, this command will also go build emerson.almanack.xml.

make WWO creates XIncluded versions of the XML files in your current directory and saves them to a temporary directory. If your current directory is distribution/, the files representing our published version of Mary Moody Emerson’s Almanacks — i.e., the *.*.xml files in [WEBSITE]/research/projects/manuscripts/emerson/distribution/ — will be copied from their directory (which can be specified on the commandline) and combined into one XIncluded document for publication. WARNING: The files in [WEBSITE]/research/projects/manuscripts/emerson/under_construction/ are not included.

Each of these newly generated files is then pre-processed and placed in XTF’s data directory.

Useful variables
Variable name | Default value | Description
TMPDIR | /tmp/${INDIR}_tmp_${NOW} | The path to the directory where temporary files will be placed during processing.
SAXON | /usr/local/bin/saxon9he.jar | The path to the Saxon HE JAR file. Saxon is used to prepare WWP files for indexing into XTF.
XTF | /var/lib/tomcat8/webapps/WWO | The path to your working copy of the WWO XTF site in the Subversion repository. For the (not much used anymore) wwo target, output will be saved to XTF/data/wwp/.
WEB | /home/syd/Documents/WWPweb | The path to your working copy of the WWP’s website in the Subversion repository, which will be used to find and process Mary Moody Emerson’s Almanacks. Only necessary when preparing WWO from distribution/.
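Since several of these usually have to be overridden at once, it can help to wrap the call so that a typo in a variable name is caught before make starts. This is a sketch only: the wrapper name run_make and the DRY_RUN switch are invented here, and it assumes the Makefile accepts these variables (plus OUTDIR, mentioned above) as ordinary commandline overrides.

```shell
#!/usr/bin/env bash
# Run "make VAR=value ... TARGET", accepting only the variable names
# documented on this page. With DRY_RUN set, print the command instead
# of running it. run_make and DRY_RUN are hypothetical names.
run_make() {
  local target=$1; shift
  local -a args=()
  local pair
  for pair in "$@"; do
    case $pair in
      TMPDIR=*|SAXON=*|XTF=*|WEB=*|OUTDIR=*) args+=("$pair") ;;
      *) echo "unknown variable in: $pair" >&2; return 1 ;;
    esac
  done
  if [ -n "${DRY_RUN:-}" ]; then
    echo make "${args[@]}" "$target"
  else
    make "${args[@]}" "$target"
  fi
}
```

For example, DRY_RUN=1 run_make exist OUTDIR=/tmp/out WEB=/home/syd/Documents/WWPweb shows the make command that would be issued without touching any files.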

Generate XTF Index

  1. Take the output of make WWO and put it in the data/wwp/ directory of the XTF installation where you want to build the index. (Depending on what machine you are on and what directories you specified, above, this may already be done.) Replace old data. Note that our entire XTF directory (except for the index itself) is under Subversion control, so an easy way to do this is to check the newly created files in wherever you created them and check them out where you want to index them.
  2. $ cd /var/lib/tomcat6/webapps/WWO/        # on indexing machine
  3. $ tar czf index_PREV_DATE.tgz index         # create a backup, as the index is not under Subversion control
  4. $ cd /var/lib/tomcat6/webapps/WWO/bin/    # could have been just cd bin/
  5. $ ./textIndexer -incremental -index default     # This runs reasonably quickly: ~2 minutes on albus, probably quite a bit longer on wwp-test.
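The backup of the index directory can be sketched as a small function that refuses to overwrite an earlier tarball. The function name backup_index and the date-stamp default for the label are assumptions made for this example; use whatever naming you normally give index_PREV_DATE.tgz.

```shell
#!/usr/bin/env bash
# Create index_<label>.tgz from the index/ directory inside an XTF root
# and print the tarball's path. backup_index is a hypothetical helper;
# the default label (today's date) is an assumption.
backup_index() {
  local xtf_root=$1
  local label=${2:-$(date +%Y-%m-%d)}
  local tarball="$xtf_root/index_${label}.tgz"
  if [ ! -d "$xtf_root/index" ]; then
    echo "no index/ directory in $xtf_root" >&2
    return 1
  fi
  if [ -e "$tarball" ]; then
    echo "refusing to overwrite $tarball" >&2
    return 1
  fi
  # -C keeps the paths inside the tarball relative (index/...).
  tar -czf "$tarball" -C "$xtf_root" index && printf '%s\n' "$tarball"
}
```

On the indexing machine this would be called as backup_index /var/lib/tomcat6/webapps/WWO before running textIndexer.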

If you’re not already on the target machine where you want the index to be, transfer it. (Probably best to make a tarball of it and then transfer that single file, but any mechanism that moves it intact should work fine. I typically move it with scp, first from my desktop to paramedic:/tmp/, and then from there to wwp-test:/var/lib/tomcat6/webapps/WWO/, and then untar it there. It is fine to leave the tarball lying around as a backup, as long as wwp-test has lots of disc space available; it is also fine to delete it.)

Addendum/note-to-self: I can no longer get scp to allow me to transfer a file from wwp-test to wwp, ostensibly because centrify now really screws us over and requires temporary checked-out passwords. I can transfer easily from paramedic to either machine, as it has SSH keys set up. Could probably transfer it by just copying to my centrified home directory, too.

Install Index

After getting the index in the right place (either because you generated it where you want it to be or moved it from the machine on which you generated it to the place where you want it to be), you need to set the permissions properly. Note that “tomcat” in the commandline below is the name of the tomcat user, which seems to be “tomcat” on our Red Hat and CentOS systems, but “tomcat7” or “tomcat8” on my Ubuntu and Mac OS X systems.

  1. $ sudo chgrp -R tomcat ../index/  &&  sudo chmod -R g+w ../index/
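To confirm the permissions took, something like the following lists anything under the index that is still not group-writable (and, optionally, not in the expected group). It is a sketch; check_index_perms is a made-up name, and empty output is taken to mean that everything is fine.

```shell
#!/usr/bin/env bash
# Print every file or directory under DIR that lacks group write
# permission. If GROUP is given, also print entries that belong to a
# different group. check_index_perms is a hypothetical helper.
check_index_perms() {
  local dir=$1 group=${2:-}
  if [ -n "$group" ]; then
    find "$dir" \( ! -perm -020 -o ! -group "$group" \) -print
  else
    find "$dir" ! -perm -020 -print
  fi
}
```

For example, check_index_perms ../index/ tomcat should print nothing after the chgrp and chmod above have succeeded.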

Generate Text Lists

Note: This can be done on any machine on which you have a working copy of https://liblab.neu.edu/svn/DSG/wwp/website/trunk/wwo/texts, but it is somewhat easier to just do it on wwp-test.

  1. Log in to wwp-test
  2. $ cd /var/www/html/WWP/wwo/texts
  3. $ make     # Gives you instructions
  4. $ make HOST=www.wwp-test.northeastern.edu all     # Unless you want to do something other than the usual. You can expect hundreds of (inappropriate) warnings that xml-model is an invalid name for a PI target.
  5. Do NOT check in the modified lists. This is the hardest part of this process. The lists are different on wwp and wwp-test; thus we don’t want to have the wwp-test ones checked in, just to avoid the problem of someone inadvertently checking them out on wwp.

When you actually want to generate these for wwp you can either

  • Use the Perl command in the output of make to change what you have to be suitable for use on wwp, and check it in. (Then put it back if you still want wwp-test to work. :-) OR
  • Re-generate the lists for wwp, check those in. (Then re-generate for wwp-test if you still want wwp-test to work. :-)

Update MARC Records

See documentation in textbase/metadata/README.md.