Machine Learning

Machine learning deals with the realm of Artificial Intelligence (AI) where software can be trained to predict outcomes based on data.

Facial Recognition in Historical Photographs with Artificial Intelligence in Python
EN
In this lesson, you’ll learn computer vision and machine learning principles for object recognition, and how to apply these principles using Python to recognize and classify smiling faces in historical photographs.
Authors
Charles Goldberg
Zach Haala
Read more →
Automatic Text Recognition (ATR) - Where and How to Get Images
EN
This tutorial explores where and how to find, create, and collect images of textual material, a crucial initial step in any process using Automatic Text Recognition (ATR).
Authors
Anna Busch
David Lassner
Aneta Plzáková
Read more →
Automatic Text Recognition (ATR) - Layout Analysis
EN
Discover the subtleties of region and line segmentation and learn about the purpose of layout analysis for Automatic Text Recognition!
Authors
Alix Chagué
Hugo Scheithauer
Read more →
Automatic Text Recognition (ATR) - End Formats and Reusability
EN
Increase the visibility of your ATR output while fostering Open Science.
Authors
Floriane Chiffoleau
Sarah Ondraszek
Read more →
Automatic Text Recognition (ATR) - Text Recognition and Post-ATR Correction
EN
Dive into the fine-tuning of Automatic Text Recognition outputs!
Authors
Floriane Chiffoleau
Sarah Ondraszek
Read more →
Understanding and Creating Word Embeddings
EN
Word embeddings allow you to analyze the usage of different terms in a corpus of texts by capturing information about their contextual usage. Through a primarily theoretical lens, this lesson will teach you how to prepare a corpus and train a word embedding model. You will explore how word vectors work, how to interpret them, and how to answer humanities research questions using them.
Authors
Avery Blankenship
Sarah Connell
Quinn Dombrowski
Read more →
Automatic Text Recognition (ATR) - Getting Started
EN
Kick off your journey into Automatic Text Recognition (ATR) with our introductory tutorial video. This is the first video of a tutorial series dedicated to extracting full text from scanned images.
Authors
Ariane Pinche
Pauline Spychala
Read more →
Clustering and Visualising Documents Using Word Embeddings
EN
This lesson uses word embeddings and clustering algorithms in Python to identify groups of similar documents in a corpus of approximately 9,000 academic abstracts. It will teach you the basics of dimensionality reduction for extracting structure from a large corpus and how to evaluate your results.
Authors
Jonathan Reades
Jennie Williams
Read more →
Creating Deep Convolutional Neural Networks for Image Classification
EN
This lesson provides a beginner-friendly introduction to convolutional neural networks (CNNs) for image classification. The tutorial provides a conceptual understanding of how neural networks work by using Google’s Teachable Machine to train a model on paintings from the ArtUK database. This lesson also demonstrates how to use Javascript to embed the model in a live website.
Authors
Nabeel Siddiqui
Read more →
Interrogating a National Narrative with GPT-2
EN
In this lesson, you will learn how to apply a Generative Pre-trained Transformer language model to a large-scale corpus so that you can locate broad themes and trends within written text.
Authors
Chantal Brousseau
Read more →
The CLS INFRA Survey of Methods in Computational Literary Studies
EN
This resource from the CLS INFRA project offers an introduction to several research areas and issues that are prominent withinComputational Literary Studies (CLS), including authorship attribution, literary history, literary genre, gender in literature, and canonicity/prestige, as well as to several key methodological concerns that are of importance when performing research in CLS.
Authors
Christof Schöch
Julia Dudar
Evegniia Fileva
Read more →
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)
EN
This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.
Authors
Daniel van Strien
Kaspar Beelen
Melvin Wevers
Read more →

Machine Learning

Resources