Scatter Plots and Correlation
Create scatterplots and calculate the correlation coefficient r to describe linear relationships.
Scatterplots
A scatterplot displays the relationship between two quantitative variables. Each point represents one individual.
- Explanatory variable ($x$): plotted on the horizontal axis
- Response variable ($y$): plotted on the vertical axis
Describing Scatterplots (DOFS)
- Direction: Positive, negative, or no association
- Outliers: Unusual points
- Form: Linear, curved, clusters
- Strength: Weak, moderate, strong
Correlation Coefficient ($r$)
The correlation coefficient $r$ measures the strength and direction of the linear relationship between two quantitative variables.
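The lesson does not state a formula for $r$. For reference, the standard definition averages the products of standardized scores; the symbols $n$ (number of individuals), $\bar{x}$, $\bar{y}$ (sample means), and $s_x$, $s_y$ (sample standard deviations) are notation assumed here, not taken from the text above.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Because each factor is a standardized score, the units of $x$ and $y$ cancel, and swapping the roles of $x$ and $y$ leaves the product unchanged.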
Properties of $r$
- $r > 0$: positive association
- $r < 0$: negative association
- $|r|$ close to 1: strong linear relationship
- $r$ close to 0: weak or no linear relationship
- $r$ has no units (dimensionless)
- $r$ is not affected by changes in units (adding a constant to a variable, or multiplying it by a positive constant)
- $r$ is the same regardless of which variable is $x$ or $y$
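A few of these properties can be checked numerically. The sketch below is a minimal illustration, not a reference implementation: `pearson_r` is a hypothetical helper that uses the usual sums-of-products form of the correlation, and the temperature and sales figures are made-up data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: daily temperature (Celsius) and units sold
temps_c = [10, 15, 20, 25, 30]
sales = [12, 18, 21, 30, 33]

r_original = pearson_r(temps_c, sales)

# Change of units: Celsius to Fahrenheit (multiply by 9/5, add 32)
temps_f = [c * 9 / 5 + 32 for c in temps_c]
r_fahrenheit = pearson_r(temps_f, sales)

# Swap the roles of x and y
r_swapped = pearson_r(sales, temps_c)

print(r_original, r_fahrenheit, r_swapped)
```

All three printed values should agree up to floating-point rounding, which is what the unit-invariance and symmetry properties predict.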
Interpreting $|r|$

| $\lvert r \rvert$ | Strength |
|-------|----------|
| 0.8 – 1.0 | Strong |
| 0.5 – 0.8 | Moderate |
| 0.0 – 0.5 | Weak |
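The cutoffs in the table can be written as a small lookup. This is only a sketch: `describe_strength` is a hypothetical name, and because the table's ranges share their endpoints, the code assumes a boundary value (0.5 or 0.8) falls in the stronger category. These cutoffs are rules of thumb and should not replace looking at the scatterplot.

```python
def describe_strength(r):
    """Label the strength of a linear relationship from r (hypothetical helper)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength depends on |r|; the sign gives direction only
    if size >= 0.8:   # assumption: boundary values count as the stronger label
        return "strong"
    if size >= 0.5:
        return "moderate"
    return "weak"

for value in (-0.92, 0.65, 0.10):
    direction = "positive" if value > 0 else "negative"
    print(value, describe_strength(value), direction)
```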
Cautions About Correlation
- Correlation ≠ Causation: Association does not imply cause-and-effect
- $r$ only measures linear relationships (a strongly curved pattern may have $r \approx 0$)
- $r$ is sensitive to outliers
- $r$ should only be used for quantitative variables
- Always look at the scatterplot; don't rely on $r$ alone
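To see why a curved pattern can hide from the correlation, consider points that lie exactly on a parabola. The sketch below uses made-up data and a hypothetical `pearson_r` helper; the relationship is perfectly predictable, yet the linear correlation comes out as zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y is determined exactly by x, but the pattern is a curve
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)  # the positive and negative deviations cancel, so r is 0
```

A value of zero here says only that there is no linear trend, not that the variables are unrelated.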
Influential Points
An influential point is one whose removal substantially changes the regression line or the correlation.
- Points with extreme $x$-values are often influential
- Outliers may or may not be influential
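One way to check whether a point is influential is to compute the correlation with and without it. The sketch below uses made-up data and a hypothetical `pearson_r` helper: five points with almost no linear trend, plus one point with an extreme x-value.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: a loose cluster with little linear trend
base_x = [1, 2, 3, 4, 5]
base_y = [3, 1, 4, 2, 3]

# One extra point far to the right of the cluster (extreme x-value)
extreme_point = (20, 15)

r_without = pearson_r(base_x, base_y)
r_with = pearson_r(base_x + [extreme_point[0]], base_y + [extreme_point[1]])

print(f"r without the extreme point: {r_without:.2f}")
print(f"r with the extreme point:    {r_with:.2f}")
```

With these numbers the single added point moves the correlation from weak to strong, so a scatterplot is needed to see that one individual is driving the result.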
AP Tip: Always plot the data before calculating $r$. The correlation coefficient can be misleading if you have not seen the actual pattern (recall Anscombe's quartet).
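Anscombe's quartet makes the point concrete. The sketch below compares the first two of its four data sets using a hypothetical `pearson_r` helper. The values are entered here from the published quartet as recalled, so check them against a reference copy before relying on them. Set I is a roughly linear scatter and set II follows a smooth curve, yet the two correlations are nearly identical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's quartet, sets I and II (they share the same x-values)
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r1 = pearson_r(x, y1)  # set I: roughly linear scatter
r2 = pearson_r(x, y2)  # set II: curved pattern

print(f"set I:  r = {r1:.3f}")
print(f"set II: r = {r2:.3f}")
```

Only a plot of each set reveals that a straight line is a reasonable summary for the first and a poor one for the second.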