WWO Philoloading

From Digital Scholarship Group

Loading texts into Philologic (golf/papa)

Overview

Philologic works by "ingesting" texts and building several indices (of words, of element objects, etc.) that facilitate efficient lookup and retrieval. It also offers the option (which we use) of storing some metadata in a MySQL database. The process for getting the texts loaded into Philologic is:

  • cd into the XML texts directory
  • issue the load command
  • enter your MySQL user password when prompted (this should happen twice, after the "Success" message near the end of the load sequence)
  • add the Philologic MySQL password to the appropriate file

Detailed instructions (commands appear on their own lines)

You must be in the directory that contains the texts you want to load:

cd /var/www/htdocs/WWO/xml/texts/

Issue the load command. The image name will usually be "wwo", but might sometimes be different for testing purposes. Note that we include the --delete option to get rid of the previous wwo image.

time philoload wwo --image=/var/www/htdocs/philologic/images/wwo/ --mkbibliography=/var/lib/philologic/utils/mkbiblio-twig --delete --loadsql --sqluser=pncaton *.xml


Note: The two password prompts near the end of the load sequence ask for the password of the MySQL user issuing the command, NOT the password that "philologic" uses when it queries the database. In this case that user is pncaton, whose password has to do with the 5 patches we applied to MySQL during his tenure. N.B. We really should create an abstract WWO user for doing this, rather than tying it to a particular individual.
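Because the load command takes *.xml from the current directory, it can be worth confirming that the directory really contains XML files before starting a long load. The following is a minimal sketch; count_xml is a hypothetical helper name, not part of Philologic.

```shell
#!/bin/sh
# Sketch only: count_xml is a hypothetical helper, not a Philologic command.
# It prints the number of *.xml files directly inside the given directory.
count_xml() {
    dir=$1
    n=0
    for f in "$dir"/*.xml; do
        # If nothing matches, the glob stays literal, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Possible use before loading (path taken from the instructions above):
#   [ "$(count_xml /var/www/htdocs/WWO/xml/texts)" -gt 0 ] || echo "no XML files found" >&2
```

If the count is zero you are probably in the wrong directory, and the load command would have nothing to ingest.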

Once the password has been entered twice, all the relevant data is in the MySQL database and the load process _per se_ is finished. HOWEVER, because of a bug the MySQL password for the "philologic" user is *not* being automatically entered in /var/www/htdocs/philologic/images/wwo/lib/philo-db.cfg, so you have to do it by hand.

Open the file /opt/local/etc/philologic/philologic.cfg, find and copy the password, then insert it as the value of $PASSWD in /var/www/htdocs/philologic/images/wwo/lib/philo-db.cfg.
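Since this manual step is easy to forget, a quick check that $PASSWD has a value can help. The sketch below ASSUMES the password line in philo-db.cfg looks like $PASSWD = "secret"; (Perl-style, double-quoted). That format is a guess, so adjust the pattern to match the real file. passwd_is_set is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only. Assumption (not confirmed by the source): the config file sets
# the password on a line of the form
#   $PASSWD = "secret";
# passwd_is_set succeeds if such a line exists with a non-empty value.
passwd_is_set() {
    grep -Eq '^[[:space:]]*\$PASSWD[[:space:]]*=[[:space:]]*"[^"]+"' "$1"
}

# Possible use after a load:
#   passwd_is_set /var/www/htdocs/philologic/images/wwo/lib/philo-db.cfg \
#       || echo "PASSWD is empty: copy it in by hand" >&2
```

The check only reports whether a value is present; it does not compare it with the password stored in philologic.cfg.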

Note that you MUST do this as soon as the load sequence has finished: it's easy to forget, and if it isn't done the bibliographic criteria search won't work.

Debugging

PROBLEM: load process stops at bibliography phase, message mentions XML::Twig
CAUSE: XML::Twig module not in Perl library
SOLUTION: ask Peter DiCamillo to install XML::Twig module
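Whether XML::Twig is visible to Perl can be checked before starting a load. This is a minimal sketch using Perl's -M switch; perl_module_installed is a hypothetical helper name, and it tests whichever perl is first on your PATH, which may not be the one philoload uses.

```shell
#!/bin/sh
# Sketch only: perl_module_installed is a hypothetical helper.
# It succeeds if the perl found on PATH can load the named module.
perl_module_installed() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Possible use:
#   perl_module_installed XML::Twig || echo "XML::Twig is missing" >&2
```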

PROBLEM:

Can't call method "text" on an undefined value at /usr/lib/perl5/site_perl/5.8.0/XML/Twig.pm line 5512.

make: *** [bibliography] Error 255
make: Leaving directory `/var/lib/philologic'

###################### FAILURE ####################
Load failed, alas. Look at /var/www/htdocs/philologic/images/wwo//LOADER.LOG for clues

then run philoload again.

CAUSE: philoload was looking for an XPath that wasn't there.
SOLUTION: Try looking at the LOADER.LOG file first. In my case the file didn't exist, so I changed /var/lib/philologic/utils/mkbiblio-twig.plin to print debugging output to STDERR. That told me which file and which path were the problem.
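As a first step after a failed load, the log check described above can be scripted so that a missing or empty log is reported explicitly. This is a minimal sketch; show_loader_log is a hypothetical helper name, and the 20-line tail is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: show_loader_log is a hypothetical helper.
# Prints the last 20 lines of the log if it exists and is non-empty;
# otherwise reports the problem on STDERR and returns 1.
show_loader_log() {
    log=$1
    if [ -s "$log" ]; then
        tail -n 20 "$log"
    else
        echo "no usable log at $log" >&2
        return 1
    fi
}

# Possible use (path taken from the failure message above):
#   show_loader_log /var/www/htdocs/philologic/images/wwo/LOADER.LOG
```

If the helper returns 1, the log will not help and you will need another source of diagnostics, such as the STDERR debugging output mentioned above.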