WORKSHOP: Transkribus - computerised text-transcriptions (with a little human help)
The workshop will be led by Annemieke Romein.
Annemieke Romein obtained her PhD at Erasmus University in 2016 with a comparative study of the political terminology of fatherland, patria and patriot in Hessen-Kassel, Gulik and Bretagne. In 2017 she received an NWO Rubicon grant, with which she worked in Ghent from September 2017 to February 2020 on a project in political-institutional and legal history: a comparison between the regions of Flanders and Holland between 1576 and 1702. She was also the project leader of the Digital Humanities project "Entangled Histories" at the National Library of the Netherlands in The Hague, where she was Researcher-in-Residence from May to October 2019. From 2020 she will be working at Huygens ING, where she will continue her research into early modern provincial regulations with her NWO Veni project 'A Game of Thrones?'. This is a comparison between Holland, Guelders and Bern in the period 1576-1702.
During this lecture/workshop you will learn the basics of working with the READ-project’s tool “Transkribus” (desktop version). Transkribus is a tool with a very user-friendly interface, which will guide you through making a transcription of any text on paper. It cannot make a transcription unless you are able to train it, so some basic palaeographic knowledge is still required.
After some training, it will be able to decipher the handwriting of the author you trained it on and to transcribe many more pages for you. Even if the result isn’t perfect (yet), you can use tools such as “word spotting” to find potential candidates for terms within the transcribed text, allowing you to search for the parts relevant to your research!
What you will know after this lecture/workshop:
Basic workshop:
- The basic principles behind HTR(+) vs OCR.
- How to upload documents to Transkribus (photos, PDFs, etc.).
- The basic functionalities of Transkribus.
- How to manually create a Layout Analysis, and how to have Transkribus do this for you.
- How to transcribe texts within Transkribus (desktop based).
- How to tag words within Transkribus (and why you would want to do that).
- How to search (Keyword Spotting).
- How to export files from Transkribus.
- How to train a model.
Advanced workshop (follow-up on the basic):
- OCR
- Text2Image (for when you already have transcripts available)
- Tables and other Layout Analysis (training)
- Creation of generic models
- Judging the quality of your model
- Labelling (text and regions)
- Dictionaries
Requirements:
Please install:
- https://transkribus.eu/Transkribus/ (desktop version).
- Do create an account. Please use your work e-mail address, as you will need to be added to a shared folder in order to get access to the training material.