Linear Regression and Correlation

Find lines of best fit and interpret correlation in real-world contexts.

Scatter Plots Review

A scatter plot shows the relationship between two quantitative variables.

  • Direction: positive, negative, or none
  • Form: linear or nonlinear
  • Strength: strong, moderate, or weak

Correlation Coefficient ($r$)

Measures the strength and direction of a linear relationship:

  • $r = 1$: Perfect positive linear relationship
  • $r = -1$: Perfect negative linear relationship
  • $r = 0$: No linear relationship
  • $|r|$ close to 1: Strong linear relationship
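As a minimal sketch of how $r$ can be computed by hand, the function below uses the usual sample (Pearson) formula: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the hours/scores data are made up for illustration and are not part of the lesson.

```python
import math

def pearson_r(xs, ys):
    """Return the sample correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (invented for this sketch): hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 74]
r = pearson_r(hours, scores)
print(round(r, 3))  # close to 1, so a strong positive linear relationship
```

For this invented data set $r$ comes out close to 1, which matches the "strong positive" reading in the list above.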

Line of Best Fit (Regression Line)

The line that best represents the trend in the data: $\hat{y} = mx + b$
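The lesson does not say how the line is chosen. A common choice, assumed here, is ordinary least squares, which picks the $m$ and $b$ that minimize the sum of squared vertical distances from the points to the line. The sketch below implements that formula directly; `fit_line` and the sample data are hypothetical names and values used only for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y_hat = m*x + b.

    Assumes ordinary least squares, which the lesson does not state explicitly.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return m, b

# Hypothetical data (invented for this sketch): hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 74]
m, b = fit_line(hours, scores)
print(m, b)  # 5.5 47.5
```

With this invented data the fitted line is $\hat{y} = 5.5x + 47.5$.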

Making Predictions

Use the regression equation: if $\hat{y} = 2.5x + 10$ and $x = 8$, then $\hat{y} = 2.5(8) + 10 = 30$.

Residuals

$\text{Residual} = \text{Actual} - \text{Predicted} = y - \hat{y}$

  • Positive residual: actual is above the line
  • Negative residual: actual is below the line
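The prediction and residual rules can be sketched together using the lesson's example equation $\hat{y} = 2.5x + 10$. The function names and the "actual" values 33 and 26.5 are assumptions chosen only to show one point above the line and one below it.

```python
def predict(x, m=2.5, b=10):
    """Predicted value y_hat = m*x + b (defaults follow the lesson's example)."""
    return m * x + b

def residual(x, y_actual, m=2.5, b=10):
    """Residual = actual - predicted."""
    return y_actual - predict(x, m, b)

y_hat = predict(8)          # 2.5 * 8 + 10 = 30.0
above = residual(8, 33)     # 33 - 30.0 = 3.0, so the point is above the line
below = residual(8, 26.5)   # 26.5 - 30.0 = -3.5, so the point is below the line
print(y_hat, above, below)  # 30.0 3.0 -3.5
```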

Interpreting the Slope

"For every 1-unit increase in $x$, the predicted $y$ increases (if $m > 0$) or decreases (if $m < 0$) by $|m|$ units."

Interpreting the Y-Intercept

"When $x = 0$, the predicted $y$ is $b$." (This may not make practical sense, for example when $x = 0$ lies outside the range of the data.)

Cautions

  1. Correlation ≠ Causation
  2. Don't extrapolate beyond the data range
  3. Outliers can strongly affect the regression line
  4. $r$ only measures linear relationships
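Cautions 3 and 4 can be illustrated with two small invented data sets. This is a sketch under stated assumptions: the helpers below use the standard Pearson and least-squares formulas, and every name and number is hypothetical.

```python
import math

def _deviation_sums(xs, ys):
    """Return (sxy, sxx, syy): sums of deviation products and squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def pearson_r(xs, ys):
    sxy, sxx, syy = _deviation_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_slope(xs, ys):
    sxy, sxx, _ = _deviation_sums(xs, ys)
    return sxy / sxx

# Caution 4: y = x^2 is a perfect (nonlinear) relationship, yet r is 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_quadratic = pearson_r(xs, ys)

# Caution 3: one outlier drags the slope of an otherwise exact line y = 2x.
clean_x, clean_y = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
slope_clean = least_squares_slope(clean_x, clean_y)
slope_outlier = least_squares_slope(clean_x + [6], clean_y + [0])

print(r_quadratic, slope_clean, round(slope_outlier, 3))
```

Here the quadratic data give $r = 0$ even though $y$ is completely determined by $x$, and the single added point $(6, 0)$ pulls the fitted slope from 2 down to roughly 0.29.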

Example interpretation: "There is a strong positive linear relationship ($r = 0.92$) between hours studied and test score. For each additional hour of studying, the predicted test score increases by about 5 points."
