Calculus Essentials

Derivatives & Optimization

The Linear Model

For this demonstration, download the grades.csv dataset.

d <- read.csv('grades.csv')

head(d)
  midterm final overall gradeA
1   79.25 47.00    69.2      0
2   96.25 87.75    94.3      1
3   58.25 37.75    62.0      0
4   54.50 62.00    72.4      0
5   83.00 39.75    72.4      0
6   41.75 49.50    59.5      0

The Linear Model

plot(d$midterm, d$final, 
     xlab = 'Midterm Grade', 
     ylab = 'Final Grade')

The Linear Model

m <- lm(final ~ midterm, data = d) # predict final grade from midterm grade

abline(a = m$coefficients['(Intercept)'], b = m$coefficients['midterm'])
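
As a small aside (not on the original slide), you can print the estimated values directly; the exact numbers depend on grades.csv:

coef(m) # estimated intercept and slope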

The Linear Model

\[ y_i = \alpha + \beta x_i + \varepsilon_i \]

The Linear Model

Partitioning the outcome into two parts—the part we can explain, and the part we’re ignoring:

\[ \underbrace{y_i}_\text{outcome} = \underbrace{\alpha + \beta x_i}_\text{explained} + \underbrace{\varepsilon_i}_\text{unexplained} \]

The Linear Model

Partitioning the outcome into two parts—the part we can explain, and the part we’re ignoring:

\[ \underbrace{y_i}_\text{outcome} = \overbrace{\alpha}^\text{intercept parameter} + \underbrace{\beta}_\text{slope parameter} \overbrace{x_i}^\text{explanatory variable} + \underbrace{\varepsilon_i}_\text{prediction error} \]

But where do the \(\alpha\) and \(\beta\) values come from? How do we estimate the “line of best fit”?

An Optimization Problem

We want to find values for \(\alpha\) and \(\beta\) that minimize the sum of squared errors.

sse <- function(a,b){
  y <- d$final # outcome
  x <- d$midterm # explanatory variable
  
  predicted_y <- a + b*x
  
  error <- y - predicted_y
  
  return( sum(error^2) )
}

An Optimization Problem

plot(d$midterm, d$final,
     xlab = 'Midterm Grade', ylab = 'Final Grade')

abline(a = 10, b = 0.5) # too shallow
sse(a = 10, b = 0.5)
[1] 54632.59

An Optimization Problem

plot(d$midterm, d$final,
     xlab = 'Midterm Grade', ylab = 'Final Grade')

abline(a = 0, b = 1.2) # too steep!
sse(a = 0, b = 1.2)
[1] 61043.95
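
For comparison (a side check, not on the original slides), the lm() fit from earlier minimizes this quantity by construction, so plugging its coefficients into sse() returns a smaller value than either guess:

sse(a = m$coefficients['(Intercept)'], b = m$coefficients['midterm']) # lower than both values above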

An Optimization Problem

We could keep hunting blindly for values of \(\alpha\) and \(\beta\) that minimize the sum of squared errors, or we could take a more systematic approach…

\[ \text{SSE} = \sum_{i=1}^n(y_i - \alpha - \beta x_i)^2 \]

An Optimization Problem

\(\text{SSE} = \sum_{i=1}^n(y_i - \alpha - \beta x_i)^2\)

Imagine plotting SSE as a surface over candidate values of \(\alpha\) and \(\beta\), then dropping a ball onto it. The ball will roll until it reaches a perfectly flat point: the function’s minimum.
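
One way to make the search systematic (a minimal sketch, not part of the original slides; it assumes d and sse() from earlier are loaded) is to hand sse() to R’s general-purpose optimizer and let it hunt for that flat point numerically:

# Let optim() search for the (a, b) pair that minimizes sse().
fit <- optim(par = c(a = 0, b = 1),              # starting guess
             fn = function(p) sse(p[1], p[2]))   # wrap sse() so optim() can call it
fit$par          # numerically estimated intercept and slope
m$coefficients   # should closely match the lm() estimates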

Review: Slopes

What is the slope of this function? \(f(x) = 3x + 2\)

The slope of a linear function (a straight line) is measured by how much \(y\) increases when you increase \(x\) by \(1\). In this case, \(3\).

Review: Slopes

Find the slope of each function:

  • \(y = 2x + 4\)

  • \(f(x) = \frac{1}{2}x - 2\)

  • life expectancy (years) = 18.09359 + 5.737335 \(\times\) log(GDP per capita)

Slope of a line \(= \frac{\text{rise}}{\text{run}} = \frac{\Delta Y}{\Delta X} = \frac{f(x+h) - f(x)}{h}\)

Nonlinear Functions

Nonlinear functions are confusing and scary…

Newton & Leibniz

The Key Insight

Any curve becomes a straight line if you “zoom in” far enough.

The Key Insight

Putting all that into math…

\[ f'(x) = \lim_{h \to 0}\frac{f(x+h)-f(x)}{h} \]

\[ f'(x) = \underbrace{\lim_{h \to 0}}_\text{shrink h really small}\frac{\overbrace{f(x+h)-f(x)}^\text{the change in y}}{\underbrace{h}_\text{the change in x}} \]

This is called the derivative of a function. Using the derivative, you can find the slope at any point.
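
To see the limit in action (a quick sketch, not from the original slides), shrink \(h\) numerically and watch the difference quotient settle toward the true slope:

# Difference quotient for f(x) = x^2 at x = 3; the exact slope there is 6.
f <- function(x) x^2
h <- c(1, 0.1, 0.001, 1e-6)
(f(3 + h) - f(3)) / h # 7, 6.1, 6.001, ~6.000001: converging to 6 as h shrinks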

Derivative Example

Let \(f(x) = 2x + 3\). What is \(f'(x)\)?

\[ f'(x) = \lim_{h \to 0}\frac{f(x+h)-f(x)}{h} \]

\[ = \lim_{h \to 0}\frac{2(x+h)+3-(2x+3)}{h} \]

\[ = \lim_{h \to 0}\frac{2x+2h+3-(2x+3)}{h} \]

Derivative Example

Let \(f(x) = 2x + 3\). What is \(f'(x)\)?

\[ = \lim_{h \to 0}\frac{2x+2h+3-(2x+3)}{h} \]

\[ = \lim_{h \to 0}\frac{2h}{h} \]

\[ = 2 \]

Now A Nonlinear Example

Let \(f(x) = 3x^2 + 2x + 3\). What is \(f'(x)\)?

\[ = \lim_{h \to 0}\frac{3(x+h)^2 + 2(x+h) + 3 - (3x^2 + 2x + 3)}{h} \]

\[ = \lim_{h \to 0}\frac{3x^2 + 3h^2 + 6xh + 2x+ 2h + 3 - (3x^2 + 2x + 3)}{h} \]

\[ = \lim_{h \to 0}\frac{3h^2 + 6xh + 2h}{h} \]

Now A Nonlinear Example

Let \(f(x) = 3x^2 + 2x + 3\). What is \(f'(x)\)?

\[ = \lim_{h \to 0}\frac{3h^2 + 6xh + 2h}{h} \]

\[ = \lim_{h \to 0}3h + 6x + 2 \]

\[ = 6x + 2 \]

Solution

The function \(f'(x)\) outputs the slope of \(f(x)\) at every point.
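
Because \(f'(x) = 6x + 2\) gives the slope everywhere, you can draw the tangent line at any point you like (a sketch using base R graphics, not from the original slides):

f  <- function(x) 3*x^2 + 2*x + 3   # the curve from the example above
fp <- function(x) 6*x + 2           # its derivative
curve(f, from = -3, to = 3)
x0 <- 1                             # pick any point
abline(a = f(x0) - fp(x0)*x0, b = fp(x0), lty = 2) # tangent: slope fp(x0) through (x0, f(x0))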

Derivative Shortcuts

Good news! You don’t have to go through that process every time. Mathematicians have done it for you, and have discovered a whole bunch of useful shortcuts.

Shortcut 1: The Power Rule

If \(f(x) = ax^k\), then \(f'(x) = kax^{k-1}\)

Example: If \(f(x) = 5x^4\), then \(f'(x) = 20x^3\).

Practice Problem: Let \(f(x) = 2x^3\). What is \(f'(x)\)?

\[f'(x) = 6x^2\]

Shortcut 2: The Sum Rule

The derivative of a sum is equal to the sum of derivatives.

If \(f(x) = g(x) + h(x)\), then \(f'(x) = g'(x) + h'(x)\)

Example: If \(f(x) = x^3 + x^2\), then \(f'(x) = 3x^2 + 2x\)

Practice Problem: If \(f(x) = 2x^3 + x^2\), what is \(f'(x)\)?

\[f'(x) = 6x^2 + 2x\]

Shortcut 3: The Constant Rule

The derivative of a constant is zero.

If \(f(x) = c\), then \(f'(x) = 0\)

Example: If \(f(x) = 5\), then \(f'(x) = 0\).

Practice Problem: If \(f(x) = 4x^2 + 3x + 5\), what is \(f'(x)\)?

\[ f'(x) = 8x + 3 \]

Shortcut 4: The Product Rule

The derivative of a product is a bit trickier…

If \(f(x) = g(x) \cdot h(x)\), then \(f'(x) = g'(x) \cdot h(x) + g(x) \cdot h'(x)\)

Example: If \(f(x) = (2x)(x + 2)\), then \(f'(x) = 2(x+2) + (2x)(1) = 4x + 4\)

Practice Problem: \(f(x) = (3x^2 + 6x)(x+2)\), what is \(f'(x)\)?

\[f'(x) = (3x^2 + 6x)(1) + (6x + 6)(x+2)\]

\[f'(x) = 3x^2 + 6x + 6x^2 + 6x + 12x + 12\]

\[f'(x) = 9x^2 + 24x + 12\]

Shortcut 5: The Chain Rule

If \(f(x) = g(h(x))\), then \(f'(x) = g'(h(x)) \cdot h'(x)\)

“The derivative of the outside times the derivative of the inside.”

Example: If \(f(x) = (2x^2 - x + 1)^3\), then \(f'(x) = 3(2x^2 - x + 1)^2 (4x - 1)\)

Practice Problem: \(f(x) = \sqrt{x + 3} = (x+3)^{\frac{1}{2}}\), what is \(f'(x)\)?

\(f'(x) = \frac{1}{2}(x+3)^{-\frac{1}{2}}(1) = \frac{1}{2\sqrt{x+3}}\)

More Practice

  • Let \(f(x) = 2x^3 + 4x + 79\). What is \(f'(x)\)?
  • Let \(f(x) = 3(x^2 + x + 42)\). What is \(f'(x)\)?
  • Let \(f(x) = (x^2 + 1)(x+3)\). What is \(f'(x)\)?
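
If you want to check your answers (a side note, not from the original slides), base R can differentiate simple expressions symbolically with D():

D(expression(2*x^3 + 4*x + 79), "x")  # equivalent to 6x^2 + 4
D(expression(3*(x^2 + x + 42)), "x")  # equivalent to 6x + 3
D(expression((x^2 + 1)*(x + 3)), "x") # equivalent to 3x^2 + 6x + 1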

Now We Can Do Optimization!

Let \(f(x) = 2x^2 + 8x - 32\). At what value of \(x\) is the function minimized?

Key Insight: A function reaches a minimum where it switches from decreasing to increasing, which is exactly the point where its slope equals zero.

Optimization in Three Steps

1. Take the derivative of the function.

2. Set it equal to zero.

3. Solve for \(x\).

Optimization in Three Steps

1. Take the derivative of the function.

\[ f(x) = 2x^2 + 8x - 32 \]

\[ f'(x) = 4x + 8 \]

2. Set it equal to zero

\[ 4x + 8 = 0 \]

3. Solve for \(x\).

\[ x = -2 \]
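
As a quick numeric cross-check (a sketch, not on the original slides), R’s one-dimensional optimizer lands on the same point:

optimize(function(x) 2*x^2 + 8*x - 32, interval = c(-10, 10))$minimum # approximately -2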

Now You Try It!

Suppose that happiness as a function of jellybeans consumed is \(h(j) = -\frac{1}{3}j^3 + 81j + 2\). How many jellybeans should you eat? (Assume you can only eat a positive number of jellybeans).

Now You Try It!

Suppose that happiness as a function of jellybeans consumed is \(h(j) = -\frac{1}{3}j^3 + 81j + 2\). How many jellybeans should you eat? (Assume you can only eat a positive number of jellybeans.)

Setting the derivative to zero: \(h'(j) = 81 - j^2 = 0\), so \(j = 9\) (taking the positive root).
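
You can confirm this numerically as well (a sketch, not on the original slides):

h <- function(j) -j^3/3 + 81*j + 2
optimize(h, interval = c(0, 20), maximum = TRUE)$maximum # approximately 9 jellybeans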

Wait.

How do you know if it’s a maximum or a minimum?

\(h(j) = -\frac{1}{3}j^3 + 81j + 2\) and \(h'(j) = 81 - j^2\)

Wait.

It’s a maximum when the slope is decreasing, and a minimum when the slope is increasing. How do you figure out whether the slope is increasing or decreasing?

That’s right. You find the slope of the slope (the derivative of the derivative, aka the second derivative).

The Second Derivative Test

\(h(j) = -\frac{1}{3}j^3 + 81j + 2\) and \(h'(j) = 81 - j^2\)

What is \(h''(j)\)? Is it positive or negative when you eat \(9\) jellybeans?

The Second Derivative Test

\(h(j) = -\frac{1}{3}j^3 + 81j + 2\) and \(h'(j) = 81 - j^2\)

What is \(h''(j)\)? Is it positive or negative when you eat \(9\) jellybeans? \[ h''(j) = -2j \]

At \(j = 9\), \(h''(9) = -18 < 0\): the slope is decreasing there, so \(j = 9\) is a maximum.

Partial Derivatives

What if you have a multivariable function?

\[ f(x,y) = 2x^2y + xy - 4x + y -6 \]

Same procedure! To get the derivative of a function with respect to \(x\) or \(y\), treat the other variable as a constant.

\[ \frac{\partial f}{\partial x} = 4yx + y - 4 \]

\[ \frac{\partial f}{\partial y} = 2x^2 + x + 1 \]
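
As a side check (not on the original slides), base R’s D() computes these partial derivatives too, treating every other variable as a constant:

f_xy <- expression(2*x^2*y + x*y - 4*x + y - 6)
D(f_xy, "x") # equivalent to 4xy + y - 4
D(f_xy, "y") # equivalent to 2x^2 + x + 1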

Now You Try!

Suppose happiness as a function of jellybeans and Dr. Peppers consumed is

\[h(j,d) = 8j -\frac{1}{2}j^2 + 2d - 3d^2 + jd + 100\]

How many jellybeans should you eat? How many Dr. Peppers should you drink?

Now You Try!

\[ h(j,d) = 8j -\frac{1}{2}j^2 + 2d - 3d^2 + jd + 100 \]

\[ \frac{\partial h}{\partial j} = 8 - j + d = 0 \]

\[ \frac{\partial h}{\partial d} = 2 - 6d + j = 0 \]

Solving the first equation for \(j\):

\[ j = 8 + d \]

Solving the second for \(j\):

\[ j = 6d - 2 \]

Setting the two expressions equal, \(8 + d = 6d - 2\), gives

\[ d^* = 2 \]

\[ j^* = 10 \]
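
A numeric check (a sketch, not from the original slides) agrees with the algebra:

h <- function(p) {  # p = c(jellybeans, dr peppers)
  j <- p[1]; dp <- p[2]
  8*j - j^2/2 + 2*dp - 3*dp^2 + j*dp + 100
}
optim(par = c(0, 0), fn = function(p) -h(p))$par # approximately (10, 2); optim() minimizes, so negate h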

Next Week…

We finally have the tools we need to find the values of \(\alpha\) and \(\beta\) that minimize this function:

\(\text{SSE} = \sum_{i=1}^n(y_i - \alpha - \beta x_i)^2\)