Conversion task chart
Latest revision as of 15:46, 16 May 2014
TTD for P5 Migration
conversion of instances
Last column, "scope", indicates:
- req = simple, straightforward, required automatable conversion
- req+ = required, possibly automatable conversion that may not be so simple (e.g., what to do w/ attributes of an element that is being deleted?). The number of plus signs is a (very) rough indication of difficulty
- maybe = not required by P4 → P5 conversion, but other encoding projects are likely to be interested
- WWP = a WWP-specific change, generally not useful to others
task | automatable | pre-hand | hand | post-hand | scope |
---|---|---|---|---|---|
id= to xml:id= | TEI_id2uri.xslt | req | |||
id= of <language> to ident=[3] | TEI_id2uri.xslt | req | |||
method= of <normalization>: "tags"→"markup" | TEI_generic.xslt | req | |||
TEI.2 to TEI | root_ns.xslt | req | |||
namespace | root_ns.xslt | req | |||
update <editionStmt> | WWP_specific.xslt | WWP | |||
ensure <floatingText> in proper environ | maybe? | ||||
<change> | TEI_generic.xslt | req+ | |||
IDREF to URI bare name | TEI_id2uri.xslt | req | |||
remove <imprint>s in <bibl> | TEI_generic.xslt | req+ | |||
<namespace> in <tagsDecl> | TEI_generic.xslt | req | |||
<ent> to <name type="ent"> | TEI_generic.xslt | req | |||
<desc> not desc= | no occurrences[6] | req | |||
anchored=yes/no to true/false | TEI_generic.xslt | req | |||
add ref= (or key= temporarily) to all names and <rs> | no | hand | WWP | ||
embedded <text> to <floatingText> | TEI_generic.xslt | req | |||
eliminate part= on <quote> and <q> except in poetry | WWP_specific.xslt | WWP | |||
convert part= to next/prev on <quote> and <q> in poetry | WWP_departed.xslt | fix-up | WWP | ||
move from child of <lg> to children of <l> | WWP_metQuot.xslt | fix-up | WWP | ||
convert <sic> encoding to <choice> | TEI_generic.xslt | req | |||
convert <orig> encoding to <vuji> | WWP_specific.xslt | WWP | |||
convert encoding to <choice>, with <am> and <ex> for letter-level | [8] | req+ | |||
duplicate <lb> and other milestone elements within <choice> | [7] | post | maybe | ||
eliminate play-specific portion of who= prefix | no | hand | WWP | ||
eliminate <docTitle> | WWP_specific.xslt | WWP | |||
change &ornament; encoding to "pre(deco)" and "post(deco)" or similar | WWP_specific.xslt [5] | WWP | |||
change half-titles to <head> where permitted | yes[2] | WWP | |||
eliminate use of <ptr> in TOCs that lack page numbers | WWP_specific.xslt | WWP | |||
move target= attribute in TOCs from <ref> to enclosing <item> | WWP_specific.xslt | WWP | |||
change <mcr> within name elements to <hi> | WWP_specific.xslt | WWP | |||
<handList> → <handNotes> | TEI_generic.xslt | req | |||
<hand> → <handNote> | TEI_generic.xslt | req++ | |||
<ps> → <postscript> | WWP_specific.xslt | WWP | |||
<xref>, <xptr> → <ref>, <ptr> | not worth[1] | syd[1] | req+++ | ||
<figure> | yes? | req | |||
date attributes | partially | hand | req++ | ||
lang= to xml:lang= | TEI_id2uri.xslt | done | req++ | ||
move 's inside name elements | partially | hand | WWP | ||
move punctuation outside name elements? | partially | WWP | |||
require <docImprint>, add where necessary | partially | hand | WWP | ||
add cit= attribute to <quote> | yes | WWP | |||
add name encoding for titles of nobility | no | hand | WWP | ||
add <placeName> encoding inside <persName> | no | hand | WWP | ||
add <rs type="properAdjective"> | no | hand | WWP | ||
add metaRef= to metaphorical names | no | yes | WWP | ||
review all name encoding for category correctness | no | hand | WWP | ||
convert long narrative quotations to <floatingText> | no | yes | maybe | ||
add type= to <floatingText> | no | yes | maybe | ||
add <docRole> with type= encoding for all roles | no | hand | WWP | ||
eliminate <publisher>, <printer> if present, convert to <docRole> | [4] | WWP | |||
add type="referenceList" as a possible kind of <list> | no | hand | WWP | ||
delete our subtitles in headers | WWP_specific.xslt | WWP | |||
alter how <extent> is handled | WWP_specific.xslt | maybe | |||
<hi type="dic"> → <hi rend="type(#DIC)"> | WWP_specific.xslt | WWP | |||
use rendition= not rend= iff PUA char | WWP_PUA_chars.xslt | [9] | WWP | ||
to= → spanTo= on <addSpan> & <delSpan> | TEI_generic.xslt | req | |||
type= → type=, subtype= of <lg> | WWP_specific.xslt | maybe | |||
fix errata list notes & refs in cowley.dramas subfiles | no | hand |
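The first and ninth rows of the chart (id= to xml:id=, and IDREF to URI bare name) can be illustrated with a small sketch. This is not TEI_id2uri.xslt itself, only a Python approximation of the same idea. The list of IDREF-valued attributes (`IDREF_ATTRS`) and the sample document are assumptions made for illustration; the real set of attributes depends on the DTD.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical list of IDREF/IDREFS-valued attributes; the real list
# would have to come from the P4 DTD in use.
IDREF_ATTRS = ("target", "who", "corresp")


def ids_to_uris(root):
    """Rename id= to xml:id= and turn IDREF values into '#name' bare-name URIs."""
    for el in root.iter():
        if "id" in el.attrib:
            el.set(XML_ID, el.attrib.pop("id"))
        for name in IDREF_ATTRS:
            if name in el.attrib:
                # An IDREFS value may hold several space-separated names.
                tokens = el.get(name).split()
                el.set(name, " ".join("#" + tok for tok in tokens))
    return root


# Invented sample input, not taken from the textbase.
sample = ET.fromstring('<div id="d1"><ptr target="n1 n2"/><note id="n1"/></div>')
ids_to_uris(sample)
result = ET.tostring(sample, encoding="unicode")
```

After the call, `sample` carries `xml:id` where it had `id`, and each name in `target` is prefixed with `#`.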
Questionable
remove highlighting of 's where it exists? |
Remaining to be decided
values for type= on <text> and <floatingText> | |||||
use <text> for poems?? | |||||
encoding of <tables>? | |||||
whitespace rendition? |
Larger and prerequisite tasks
Harvest placenames and orgnames and create place/orgography | |||||
Harvest bibls and create bibliography | |||||
Harvest persnames and supplement personography |
post-conversion
- do stuff in chart above
- generate DTD (/)
- compile DTD (/)
- ensure Emacs/psgml invokes proper DTD on TB files (/)
- currently done by using DOCTYPE declaration
- investigate using some other method, so we can drop DOCTYPE
- find place for custom documentation (i.e., HTML from ODD) (?) (/opt/local/share/doc/wwpstore/, but this is not web-accessible --- still need to symlink or copy it to web area)
- find place for customization (i.e., ODD file) (/) (/opt/local/share/xml/wwpstore/odd/)
- finish writing custom documentation (i.e., prose of ODD)
- update C-c C-v to validate files properly (/)
- add Schematron validation (to C-c C-v?) (/)
- update Emacs registers, if needed (/)
- update wwp-smart-return-context-alist and any needed parts of wwp-smart-return-default-functions (/) --- seems OK, but not thoroughly tested
- update C-c C-L (/)
- update wwp-ignore-markup-regexp, if needed (/) (not needed)
- add '_' to list of NAME characters in our SGML declaration, because P5 DTD uses it in parameter entity names (/)
- update run-xslt-on-TB (low priority, as it currently works)
- update create_WWP_tb_and_corpus.bash
- overhaul supraValidate.perl
- decide on proper place to store schemas & move them there
- create a TB version of gxp.bash?
- consider adding well-formedness check to save-buffer
- rewrite transcriptionData2wwp-store.xslt and if needed update tadpole.perl
- rewrite validate_these_files.bash --- maybe use a Makefile, instead?
- strip P4 stuff out of internal documentation
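The well-formedness check considered for save-buffer in the list above would live in Emacs, but the check itself is simple enough to sketch in Python. The function name `well_formedness_error` is hypothetical, and the sketch assumes the file contains no entity references that are declared only in an external DTD (the standard-library parser reports those as errors).

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return None if text parses as well-formed XML, else a short message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


ok = well_formedness_error("<TEI><text/></TEI>")
bad = well_formedness_error("<TEI><text></TEI>")
```

A save hook could call such a check and warn, rather than refuse to save, when a message comes back.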
Notes
[1] Only 2 files use <xref>, and only 1 of 'em has extended pointers. I will probably just convert these by hand using query-replace-regexp, rather than trying to automate.
[2] There are very few of these. See e-mail "half Titles --> headings" from Syd to JF & JM 2008-09-13 17:04. As there has been no further discussion, we will be deferring any such change until after P5 conversion.
[3] Pretty silly thing to do, as we just nuked all <langUsage> and <language> in TEI_generic.xslt, anyway.
[4] There are no <printer> elements in the textbase at all, and all of the <publisher> elements are within the <teiHeader> or in a <bibl>, and thus should not be changed. So there is nothing to do. See e-mail exchange "on <publisher> --> <docRole>" of 2008-10-26.
[5] At least, WWP_specific.xslt handles all those that are in <mw> by spitting out the <mw> with a type=border-ornamental. As for those not in <mw>, I hope to hear back from John soon.
[6] other than the 865 on <gap> and 38 on <unknown> that we've decided will remain desc=.
[7] This is, in theory, automatable. But it would be quite hard, and there are only 16 occurrences, so we're just going to do them by hand.
[8] This is essentially the same problem pre- and post- P5, non-essential, and quite hard to do. Thus I'm deferring this for now.
[9] We may want to go through the resulting generated <rendition> elements looking for redundancies and resolving them. This could probably be automated, but it would be faster to do by hand, as there are only some 287 occurrences in 7 files.
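Should the redundancy check from note [9] ever be automated, finding the candidates is the easy half. The sketch below groups <rendition> elements by their whitespace-normalized content and reports groups with more than one member. The function names and the sample rendition values are invented for illustration, and resolving the duplicates (repointing references, deleting extras) is left out.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def redundant_renditions(root):
    """Map each repeated rendition text to the ids of the elements that use it."""
    groups = defaultdict(list)
    for el in root.iter():
        if isinstance(el.tag, str) and local_name(el.tag) == "rendition":
            key = " ".join((el.text or "").split())  # normalize whitespace
            groups[key].append(el.get(XML_ID) or el.get("id"))
    return {style: ids for style, ids in groups.items() if len(ids) > 1}


# Invented sample; the content strings are placeholders, not real WWP values.
sample = ET.fromstring(
    "<tagsDecl>"
    '<rendition xml:id="r1">slant(italic)</rendition>'
    '<rendition xml:id="r2">slant(italic) </rendition>'
    '<rendition xml:id="r3">case(allcaps)</rendition>'
    "</tagsDecl>"
)
duplicates = redundant_renditions(sample)
```

Here `duplicates` lists `r1` and `r2` under the shared text, which a person can then merge by hand.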
WWP_departed.xslt is basically the same as huntingGathering_to_pointing_quotations.xslt, except that it looks for <q> or <quote> only when a descendant of an <lg>.
The root_ns.xslt program is basically the same as the Dot-two.xslt on the TEI wiki, except that it spits out a document in the WWP textbase storage namespace (instead of the main TEI namespace).
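What root_ns.xslt does (rename TEI.2 to TEI and put every element into a namespace) can be approximated in a few lines of Python. This is a sketch of the idea, not a port of the stylesheet. The namespace URI below is a made-up placeholder, since the actual WWP storage namespace is not given here, and `to_namespaced_tei` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Placeholder only: NOT the real WWP textbase storage namespace.
TARGET_NS = "http://example.org/ns/hypothetical-wwp-storage"


def to_namespaced_tei(root, ns=TARGET_NS):
    """Rename TEI.2 to TEI and move all un-namespaced elements into ns."""
    for el in root.iter():
        if isinstance(el.tag, str) and not el.tag.startswith("{"):
            local = "TEI" if el.tag == "TEI.2" else el.tag
            el.tag = f"{{{ns}}}{local}"
    return root


# Invented minimal P4-style input.
doc = ET.fromstring("<TEI.2><teiHeader/><text><body/></text></TEI.2>")
to_namespaced_tei(doc)
child_tags = [child.tag for child in doc]
```

Attributes are left alone, which matches the usual convention that unprefixed attributes stay in no namespace.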