Over the last ten years, the field of computer vision, which seeks to gain a high-level understanding of images using computational techniques, has seen rapid innovation. For example, computer vision models can locate and identify people, animals, and thousands of objects included in images with high accuracy. This technological advancement promises to do for images what the combination of OCR and NLP techniques has done for texts. Put simply, computer vision opens up a part of the digital archive for large-scale analysis that has remained mostly unexplored: the millions of images in digitised books, newspapers, periodicals, and historical documents. Consequently, historians will now be able to explore the ‘visual side of the digital turn in historical research’.
This two-part lesson provides examples of how computer vision techniques can be applied to analyse large historical visual corpora in new ways and how to train custom computer vision models. As well as identifying the contents of images and classifying them according to category — two tasks which focus on visual features — computer vision techniques can also be used to chart the stylistic (dis)similarities between images.
A particular focus of this lesson will be on how the fuzziness of concepts can translate (or fail to translate) into machine learning models. Using machine learning for research tasks will involve mapping messy and complex categories and concepts onto a set of labels that can be used to train machine learning models. This process can cause challenges, some of which we’ll touch on during this lesson.
After completing this lesson, you will be able to:
- Interpret what different metrics reveal about your model’s performance, and identify where it is performing poorly
- Use data augmentation to reduce the amount of training data needed to train a machine learning model
- Understand the challenges of mapping fuzzy or ambiguous concepts onto fixed categories
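To give a flavour of the data augmentation mentioned above, the sketch below generates extra training examples from a single image by applying label-preserving transformations (here, horizontal and vertical flips). The tiny nested-list ‘image’ is a stand-in for real pixel data; in an actual workflow you would use an image library’s augmentation tools rather than this hand-rolled version.

```python
# A minimal sketch of data augmentation: derive new training
# examples from an existing image by applying transformations
# that preserve the image's label. The "image" here is a toy
# nested list standing in for real pixel data.

def hflip(image):
    """Mirror each row left-to-right (horizontal flip)."""
    return [row[::-1] for row in image]

def vflip(image):
    """Reverse the order of the rows (vertical flip)."""
    return image[::-1]

def augment(image):
    """Return the original image plus its flipped variants."""
    return [image, hflip(image), vflip(image), hflip(vflip(image))]

if __name__ == "__main__":
    img = [[1, 2],
           [3, 4]]
    for variant in augment(img):
        print(variant)
```

One original image here yields four training examples; in practice, libraries also offer rotations, crops, and colour shifts, and you would check that each transformation still makes sense for your material (a vertically flipped newspaper page, for instance, may not).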
Check out this lesson on Programming Historian's website.