Linear Regression and Correlation - Complete Interactive Lesson
Part 1: Scatterplots & Describing a Relationship
Part 1 of 5: Scatterplots & Describing a Relationship
Topics in This Part
| Section |
|---|
| Bivariate Data & Scatterplots |
| Direction, Form, and Strength |
| Explanatory vs. Response Variables |
Key Concept: When two numerical variables are measured on the same individuals (for example, hours studied and test score), we plot the pairs as points. The resulting picture, a scatterplot, shows whether the variables move together and how tightly.
Bivariate Data & Scatterplots
Bivariate means two variables. Each individual contributes an ordered pair $(x, y)$.
| Hours studied ($x$) | Test score ($y$) |
|---|---|
| 1 | 50 |
| 2 | 65 |
| 3 | 70 |
| 4 | 80 |
| 5 | 90 |
Plotting these five points gives a scatterplot. Here the points climb from lower-left to upper-right: more hours studied tends to mean a higher score.
We always put the explanatory variable (the one we think does the explaining) on the horizontal $x$-axis and the response variable (the one we think responds) on the vertical $y$-axis.
In "hours studied vs. test score," hours is explanatory ($x$) and score is response ($y$). Ask yourself: which one would I change to affect the other? That one is $x$.
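As a rough sketch, the table above can be held in Python as a list of ordered pairs. The variable names (`hours`, `scores`, `pairs`) are assumptions chosen for illustration, and the text "plot" is only a crude stand-in for a real scatterplot.

```python
# Hours studied (x, explanatory) and test score (y, response) from the table.
hours = [1, 2, 3, 4, 5]
scores = [50, 65, 70, 80, 90]

# Each individual contributes one ordered pair (x, y).
pairs = sorted(zip(hours, scores))

# Walk left to right along x and check whether y goes up at every step.
rises = [later[1] > earlier[1] for earlier, later in zip(pairs, pairs[1:])]
always_rising = all(rises)

for x, y in pairs:
    # Crude text scatterplot: one '*' for every 5 points of score.
    print(f"x={x}  y={y:3d}  " + "*" * (y // 5))

print("y rises at every step:", always_rising)
```

Real data rarely rise at every single step, so the `always_rising` check only works as a quick look at a table this small. For larger data sets, the overall pattern in the plot is what matters.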
Direction, Form, and Strength
Describe every scatterplot with three words:
- Direction: positive (points rise left to right) or negative (points fall left to right).
- Form: linear (points follow a straight-line pattern) or nonlinear (a curve).
- Strength: how tightly the points hug the pattern (strong, moderate, or weak).
| Picture | Direction | Form | Strength |
|---|---|---|---|
| Points rise tightly along a line | Positive | Linear | Strong |
| Points fall in a loose cloud | Negative | Linear | Weak |
| Points trace a U-shape | (curved) | Nonlinear | N/A |
Always look at the scatterplot first. Everything else in this lesson (the correlation coefficient, the line of best fit) only makes sense when the form is linear. A strong curved pattern can produce a correlation coefficient near zero, so the number alone makes a strong relationship look weak.
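To see why the plot has to come first, here is a minimal sketch using hypothetical U-shaped data where the response is exactly the square of the explanatory value. The helper `correlation` is an illustrative name, and it uses the standard Pearson formula for the correlation coefficient that Part 2 introduces.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (not from the lesson): a perfect U-shape.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # [4, 1, 0, 1, 4]

r_curved = correlation(xs, ys)
print("correlation for the U-shape:", r_curved)
```

The points follow the curve perfectly, yet the falling left half and the rising right half cancel out, so the coefficient comes out at zero. The number measures only straight-line association.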
Describe the Scatterplot
A scatterplot of outdoor temperature vs. ice cream sales shows points climbing steadily from lower-left to upper-right, clustered tightly around a straight line.
Negative Relationships
A negative relationship is just as common: as one variable goes up, the other goes down, so the points fall from upper-left to lower-right.
| Explanatory ($x$) | Response ($y$) | Likely direction |
|---|---|---|
| Hours of TV watched | Test grade | Negative |
| Outdoor temperature | Heating bill | Negative |
| Age of a used car | Resale price | Negative |
| Hours exercised | Resting heart rate | Negative |
A negative relationship can still be strong: strength is about how tightly the points hug the line, not which way the line tilts. A tightly falling cloud of points is strong and negative.
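The sketch below illustrates this with made-up numbers for the used-car row of the table: car age in years against resale price in thousands. The data and the helper name `correlation` are assumptions for illustration only. Reversing the price list mirrors the pattern into a rising one, which flips the sign of the correlation coefficient but leaves its size unchanged.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: age of a used car (years) and resale price (thousands).
age = [1, 2, 3, 4, 5]
price = [20.0, 18.0, 15.5, 14.0, 12.0]

r_falling = correlation(age, price)

# Mirror image of the same pattern: identical tightness, opposite tilt.
r_rising = correlation(age, list(reversed(price)))

print("falling pattern:", round(r_falling, 3))
print("rising pattern: ", round(r_rising, 3))
```

Both values have the same magnitude, close to 1, so both patterns are strong. Only the sign, and therefore the direction, differs.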
Why This Matters
Once we know a relationship is linear, two powerful tools open up:
- The correlation coefficient: a single number measuring direction and strength (Part 2).
- The line of best fit: an equation that lets us predict (Parts 3-4).
You can now read a scatterplot and describe it in three words. Next we turn "strong/moderate/weak" into an exact number.
Part 2: The Correlation Coefficient r
Part 2 of 5: The Correlation Coefficient $r$
The Idea: The correlation coefficient $r$ squeezes the direction and strength of a linear relationship into one number between $-1$ and $+1$.
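As one way to make this concrete, the sketch below computes $r$ for the hours-studied table from Part 1 using the standard Pearson formula. The variable names (`sxy`, `sxx`, `syy`) are illustrative assumptions, not notation taken from the lesson.

```python
from math import sqrt

# Hours studied (x) and test score (y) from the Part 1 table.
hours = [1, 2, 3, 4, 5]
scores = [50, 65, 70, 80, 90]

n = len(hours)
mean_x = sum(hours) / n    # 3.0
mean_y = sum(scores) / n   # 71.0

# Sums of products and squares of the deviations from each mean.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)

# The sign of r comes from sxy; dividing by the spreads keeps it within [-1, 1].
r = sxy / sqrt(sxx * syy)

print("r =", round(r, 3))
```

For this table the result is roughly 0.99: positive because scores rise with hours, and close to 1 because the five points lie almost on a straight line.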
What $r$ Tells You
Part 3: The Line of Best Fit
Part 3 of 5: The Line of Best Fit
The Goal: Find the single straight line that comes closest to all the points. We call it the least-squares regression line, and it lets us predict $y$ from $x$.
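A minimal sketch of that idea, using the hours-studied table from Part 1 and the standard least-squares formulas for slope and intercept. The function name `sum_squared_errors` and the comparison line are assumptions added for illustration.

```python
# Hours studied (x) and test score (y) from the Part 1 table.
hours = [1, 2, 3, 4, 5]
scores = [50, 65, 70, 80, 90]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Least-squares slope: how x and y vary together, divided by the spread of x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
slope = sxy / sxx

# The least-squares line always passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

def sum_squared_errors(b0, b1):
    """Total squared vertical distance from the points to the line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(hours, scores))

best = sum_squared_errors(intercept, slope)
steeper = sum_squared_errors(intercept, slope + 1)  # an arbitrary other line

print(f"predicted score = {intercept} + {slope} * hours")
print("squared error, fitted line: ", best)
print("squared error, steeper line:", steeper)
```

"Comes closest" here means the smallest total squared vertical distance. Any other line, such as the steeper one tried above, gives a larger total.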
Part 4: Predictions, Residuals & Cautions
Part 4 of 5: Predictions, Residuals & Cautions
The Payoff: With the line in hand, plug in any $x$ to predict $y$. But predictions come with rules, and a famous warning about reading causation into $r$.
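The sketch below shows prediction and residuals on the hours-studied table from Part 1. It takes a residual to be the observed value minus the predicted value, which is the usual convention. The range check reflects a common caution against predicting far outside the observed data; it is an assumption here, since this overview does not list the lesson's own rules. All function names are illustrative.

```python
# Hours studied (x) and test score (y) from the Part 1 table.
hours = [1, 2, 3, 4, 5]
scores = [50, 65, 70, 80, 90]

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = fit_line(hours, scores)

def predict(x):
    """Predicted score for x hours, using the fitted line."""
    return intercept + slope * x

def in_observed_range(x):
    """True when x lies inside the range of hours actually observed."""
    return min(hours) <= x <= max(hours)

# Residual = observed y minus predicted y, one per data point.
residuals = [y - predict(x) for x, y in zip(hours, scores)]

print("prediction at 3 hours:", predict(3))
print("residuals:", residuals)
print("is 10 hours inside the data range?", in_observed_range(10))
```

A positive residual means the student scored above the line's prediction, and a negative one means below. For a least-squares line the residuals sum to zero, which makes a handy check on the arithmetic.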
Part 5: Mixed Practice & Mastery Check
Part 5 of 5: Mixed Practice & Mastery Check
You can now (1) describe a scatterplot, (2) interpret $r$ and $r^2$, (3) build the line of best fit, and (4) predict, find residuals, and avoid the causation trap. Let's put it all together.
Quick Reference
| Goal | Key move |
|---|---|
| Describe a scatterplot | direction (pos/neg) · form (linear?) · strength |
| Measure strength + direction | $r$, with $-1 \le r \le 1$ |