
Ingesting Text Objects

California Historical Society edited this page Feb 16, 2018 · 2 revisions

Basic procedure used for Broadsides collection

  • Create MODS "shell" and ingest as Manuscript object
  • In each object, go to Manage > Manuscript and Add Page
  • Select image file for page, and ingest
  • On the page, go to Manage > Edit OCR and correct the text. The OCR is indexed in Solr, so it is worth making it accurate. (It is unclear how to handle incorrect hOCR, other than deleting it.)
  • Copy the edited OCR and paste it into a .txt file. Go to the parent Manuscript, Manage > Datastreams > Add Datastream, and add the file as the OCR datastream
  • Take OCRed text and create a simple TEI XML document using this template
  • Go to the Manuscript object, Manage > Datastreams > Add Datastream, and add the TEI document as the TEI datastream
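
The MODS "shell" and the simple TEI document in the steps above could be generated with a small script rather than written by hand. The following is a minimal sketch only: it does not reproduce the template linked above, and the function names (`mods_shell`, `tei_document`), the title-only MODS record, and the placeholder header text are all assumptions made for illustration. It needs Python 3.8 or later for the `default_namespace` argument.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def mods_shell(title):
    """Return a minimal MODS record (title only) as an XML string.

    Hypothetical helper: a real shell would probably carry more fields.
    """
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    return ET.tostring(mods, encoding="unicode", default_namespace=MODS_NS)


def tei_document(title, ocr_text):
    """Wrap corrected OCR text in a minimal TEI document.

    Hypothetical helper: the header text below is placeholder wording,
    not taken from the wiki's own template.
    """
    def el(parent, tag, text=None):
        # Create a child element in the TEI namespace.
        node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
        node.text = text
        return node

    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    file_desc = el(el(tei, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    el(el(file_desc, "publicationStmt"), "p", "Unpublished transcription.")
    el(el(file_desc, "sourceDesc"), "p", "Transcribed from corrected OCR.")
    body = el(el(tei, "text"), "body")
    # One <p> per blank-line-separated block; inner line breaks are collapsed.
    for block in ocr_text.split("\n\n"):
        if block.strip():
            el(body, "p", " ".join(block.split()))
    return ET.tostring(tei, encoding="unicode", default_namespace=TEI_NS)


if __name__ == "__main__":
    # Example with made-up text; in practice, read the saved OCR .txt file.
    print(mods_shell("Sample broadside"))
    print(tei_document("Sample broadside", "First line\nsecond line\n\nNext paragraph"))
```

The resulting strings can be saved as .xml files and uploaded through the Islandora interface as described in the steps above; the script does not talk to Islandora or Fedora itself.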

To do

Work out a procedure for larger collections of compound text objects using Islandora Compound Batch
