Automatic Text Recognition (ATR) - Where and How to Get Images

In this tutorial, you will learn about typical methods for obtaining high-quality scanned images, which set the stage for successful text recognition. It outlines how to obtain digital copies from archives, digitisation needs, and copyright considerations. It also covers the use of online resources such as Google Books and the Internet Archive, along with the copyright and public domain issues they raise. Lastly, it highlights the Heritage Data Reuse Charter, which addresses collaboration between cultural heritage institutions and researchers.

Learning Outcomes

After completing this resource, learners will be able to:

Recognise the sources for acquiring suitable images for text recognition.
Assess the quality of images in terms of their suitability for ATR.
Implement strategies for collecting and organising images for processing.
Apply basic techniques for scanning and digitising textual materials.

You can read the blog post (available in English, French, and German) and watch our video (with subtitles in English, French, and German), which is embedded in the post.

Cite as

Anna Busch, David Lassner and Aneta Plzáková (2024). Automatic Text Recognition (ATR) - Where and How to Get Images. Version 1.0.0. Edited by Anne Baillot and Mareike König. Deutsches Historisches Institut Paris [Training module]. https://harmoniseatr.hypotheses.org/332

Reuse conditions

Resources hosted on DARIAH-Campus are subject to the DARIAH-Campus Training Materials Reuse Charter.

Full metadata

Title: Automatic Text Recognition (ATR) - Where and How to Get Images
Authors: Anna Busch, David Lassner, Aneta Plzáková
Domain: Social Sciences and Humanities
Language: English
Published to DARIAH-Campus: 12/06/2024
Originally published: 16/05/2024
URL: https://harmoniseatr.hypotheses.org/332
Content type: Training module
License: CC BY 4.0
Sources: DARIAH
Topics: Editing tools, Machine Learning, Automatic Text Recognition
Version: 1.0.0
PID: https://hdl.handle.net/21.11159/019595c4-2c37-7079-8d74-3463a9b31b3b