Understanding and Creating Word Embeddings

Word embeddings allow you to analyze how terms are used in a corpus of texts by capturing information about the contexts in which they appear. This lesson is designed to get you started with word embedding models. Taking a primarily theoretical approach, it teaches you how to prepare a corpus and train a word embedding model. You will explore how word vectors work, how to interpret them, and how to use them to answer humanities research questions.

This lesson involves running some Python code. Basic familiarity with Python is helpful, but no particular technical expertise is required.

Reviewed by:

  • Anne Heyer
  • Ruben Ros

Learning outcomes

After completing this lesson, you will be able to:

  • Explain what word embedding models and word vectors are, and what kinds of questions they can help answer
  • Create and interrogate word vectors using Python
  • Put together the corpus you want to analyze using word vectors
  • Understand the limitations of word vectors as a methodology for answering common questions

Interested in learning more?

Check out this lesson on Programming Historian's website

Cite as

Avery Blankenship, Sarah Connell and Quinn Dombrowski (2024). Understanding and Creating Word Embeddings. Version 1.0.0. Edited by Yann Ryan. ProgHist Ltd. [Training module]. https://doi.org/10.46430/phen0116

Reuse conditions

Resources hosted on DARIAH-Campus are subject to the DARIAH-Campus Training Materials Reuse Charter.

Full metadata

Title:
Understanding and Creating Word Embeddings
Authors:
Avery Blankenship, Sarah Connell, Quinn Dombrowski
Domain:
Social Sciences and Humanities
Language:
en
Published to DARIAH-Campus:
1/27/2025
Originally published:
1/31/2024
Content type:
Training module
Licence:
CC BY 4.0
Sources:
Programming Historian
Topics:
Python, Machine Learning, Corpus Analysis
Version:
1.0.0