Data Management is a set of practices and techniques used by researchers to ensure that their data is organised, structured and easily reusable for future research.
This lesson covers tokenization, part-of-speech tagging, and lemmatization, as well as automatic language detection, for non-English and multilingual text. You’ll learn how to use the Python packages NLTK, spaCy, and Stanza to analyze a multilingual Russian and French text.
In this lesson, you will learn how to download YouTube video comments and use the R programming language to analyze the dataset with Wordfish, an algorithm designed to identify opposing ideological perspectives within a corpus.
Humanities and social science data differs fundamentally in type from much of the data available in the sciences. This resource will help you understand your data, and therefore how to handle it. It looks at humanities data and its reliability, as well as the different types of data you may encounter.
A digital gazetteer records information associated with specific places. This lesson teaches you how to create a gazetteer from a historical text, using the Linked Places Delimited (LP-TSV) format.
“What gets into your dataset and what doesn’t?” For database projects in the humanities and social sciences, having a concrete idea of your project scope can be very important. This resource covers scoping methods for database projects, to help you narrow down and accurately size the database you are working with in your research.
What does "curating data stories" mean from a technical and academic perspective? This podcast from the curiositas5.0 project features a discussion between Jane Haller and Joana Meier about experimenting with digital exhibitions.
The data you generate in humanities and social science projects may well need longer term storage beyond the scope of your own research project. Medium to long term data storage is vital for allowing other scholars to examine and test your data and models, and ensuring open access to your data is an increasingly prominent issue. This resource will guide you through a thoughtful discussion of Data Management and Storage.
This article introduces the main concepts in Git and the basic Git commands that can be used from the command line. Understanding these commands will help you use Git in a code editor, in the Git desktop application, and through other options such as GitHub online.
Getting access to the data on GitLab is different on all three operating systems. This post shows how to use the code editor VS Code with its graphical user interface for working collaboratively in Git with Windows, Mac and Linux.
This short course will help learners understand how to work with Git in a collaborative setting such as teamwork or group projects, and how to make use of platforms like GitHub or GitLab to complete that work.
This resource offers an introduction to copyright laws within the UK context when dealing with multidimensional media from repositories, archives and collections from that country.
In this resource, students will learn what a database is and how it is used in humanities research, work through examples of humanities databases in use by researchers today, and learn when a researcher needs a database and how to distinguish between different database technologies.