Regression Analysis with Scikit-Learn (part 1 - Linear)

This lesson is the first of two that focus on an indispensable set of data analysis methods: linear and logistic regression. Linear regression represents how one or more quantitative measures relate to, or predict, some other quantitative measure. A computational historian, for example, might use linear regression analysis to do the following:

  • Assess how access to rail transportation affected population density and urbanization in the American Midwest between 1850 and 1860
  • Interrogate the ostensible link between periods of drought and the stability of nomadic societies

Logistic and linear regression are perhaps the most widely used methods in quantitative analysis, including (but not limited to) computational history. They remain popular in part because:

  • They are extremely versatile, as the above examples suggest
  • Their performance can be evaluated with easy-to-understand metrics
  • The underlying mechanics of model predictions are accessible to human interpretation (in contrast to many ‘black box’ models)

Learning outcomes

After completing this lesson, you will be able to:

  • Run linear regression algorithms in Python using the Scikit-learn library
  • Validate models and assess their performance
  • Interpret the results given by linear regression models
  • Identify common pitfalls to avoid when conducting regression analysis
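
As a rough illustration of the first three outcomes, the following minimal sketch fits, validates, and interprets a linear regression with Scikit-learn. It is not taken from the lesson itself: the data are synthetic, and the slope of 3.0, intercept of 2.0, noise level, and 25% test split are all assumptions chosen for demonstration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration, not the lesson's dataset):
# one quantitative predictor and an outcome that depends on it linearly, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=200)

# Hold out a test set so performance is measured on data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the model on the training data only.
model = LinearRegression()
model.fit(X_train, y_train)

# Validate: R^2 on the held-out data (1.0 would be a perfect fit).
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)

# Interpret: the coefficient is the estimated change in y per one-unit change in X.
slope = model.coef_[0]
intercept = model.intercept_

print(f"R^2 on test data: {r2:.3f}")
print(f"Estimated slope: {slope:.3f}, intercept: {intercept:.3f}")
```

Because the data are generated from a known line, the estimated slope and intercept should land close to the assumed values, which makes the sketch a convenient sanity check before moving on to real historical data.
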
Interested in learning more?

Check out this lesson on Programming Historian's website

Cite as

Matthew J Lavin, Thomas Jurczyk and Rennie C Mapp (2022). Regression Analysis with Scikit-Learn (part 1 - Linear). Version 1.0.0. Edited by James Baker. ProgHist Ltd. [Training module]. https://doi.org/10.46430/phen0099

Reuse conditions

Resources hosted on DARIAH-Campus are subject to the DARIAH-Campus Training Materials Reuse Charter.

Full metadata

Title:
Regression Analysis with Scikit-Learn (part 1 - Linear)
Authors:
Matthew J Lavin
Domain:
Social Sciences and Humanities
Language:
en
Published:
12/21/2023
Content type:
Training module
Licence:
CC BY 4.0
Sources:
Programming Historian
Topics:
DH, Open education, Open access, Data visualisation
Version:
1.0.0