
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2)

Over the last ten years, the field of computer vision, which seeks to gain a high-level understanding of images using computational techniques, has seen rapid innovation. For example, computer vision models can now locate and identify people, animals and thousands of objects in images with high accuracy. This technological advancement promises to do for images what the combination of OCR and NLP techniques has done for texts. Put simply, computer vision opens up a part of the digital archive for large-scale analysis that has remained mostly unexplored: the millions of images in digitised books, newspapers, periodicals, and historical documents. Consequently, historians will now be able to explore the ‘visual side of the digital turn in historical research’.

This two-part lesson provides examples of how computer vision techniques can be applied to analyse large historical visual corpora in new ways and how to train custom computer vision models. As well as identifying the contents of images and classifying them according to category — two tasks which focus on visual features — computer vision techniques can also be used to chart the stylistic (dis)similarities between images.

A particular focus of this lesson will be on how the fuzziness of concepts can translate (or fail to translate) into machine learning models. Using machine learning for research tasks will involve mapping messy and complex categories and concepts onto a set of labels that can be used to train machine learning models. This process can cause challenges, some of which we’ll touch on during this lesson.

Learning outcomes

After completing this lesson, you will be able to:

  • Interpret what different metrics tell you about your model’s performance, and identify where it is performing poorly
  • Use data augmentation as a tool for reducing the amount of training data needed to train a machine learning model
  • Understand the difficulty of mapping complex or fuzzy concepts onto a fixed set of categories
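
As a rough illustration of the first outcome, the sketch below computes precision, recall and F1 for each label from a list of true labels and a list of predicted labels. It is a minimal plain-Python example and not the lesson's own code: the function name `per_label_metrics` and the example labels are hypothetical, chosen only to show how per-label scores can reveal where a classifier is performing poorly.

```python
from collections import Counter


def per_label_metrics(true_labels, predicted_labels):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_pos[truth] += 1
        else:
            false_pos[guess] += 1   # the model claimed this label wrongly
            false_neg[truth] += 1   # the model missed the real label

    metrics = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        predicted = true_pos[label] + false_pos[label]
        actual = true_pos[label] + false_neg[label]
        precision = true_pos[label] / predicted if predicted else 0.0
        recall = true_pos[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical labels for six images (illustrative only, not real results).
true_labels = ["human", "human", "animal", "landscape", "animal", "human"]
predicted_labels = ["human", "animal", "animal", "landscape", "human", "human"]

for label, (precision, recall, f1) in per_label_metrics(
    true_labels, predicted_labels
).items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Overall accuracy on this toy data would be 4 out of 6, but the per-label scores show that the errors are confined to confusion between "human" and "animal", which a single accuracy figure would hide.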
Interested in learning more?

Check out this lesson on Programming Historian's website


Cite as

Daniel van Strien, Kaspar Beelen, Melvin Wevers, Thomas Smits, Katherine McDonough, Michael Black and Catherine DeRose (2022). Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2). Version 1.0.0. Edited by Nabeel Siddiqui and Alex Wermer-Colan. ProgHist Ltd. [Training module]. https://doi.org/10.46430/phen0102

Reuse conditions

Resources hosted on DARIAH-Campus are subject to the DARIAH-Campus Training Materials Reuse Charter

Full metadata

Title:
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2)
Authors:
Daniel van Strien, Kaspar Beelen, Melvin Wevers, Thomas Smits, Katherine McDonough
Domain:
Social Sciences and Humanities
Language:
en
Published:
2/6/2024
Content type:
Training module
Licence:
CC BY 4.0
Sources:
Programming Historian
Topics:
Python, Machine Learning
Version:
1.0.0