Scatter Plots and Correlation
Visualizing and measuring linear relationships
Scatterplots
Scatterplot: Graph showing relationship between two quantitative variables
- x-axis: Explanatory variable (independent)
- y-axis: Response variable (dependent)
- Each point represents one individual
Purpose: Visualize relationship, identify patterns, detect outliers
Describing Scatterplots: DCFS
Direction: Positive, negative, or no association
Positive: As x increases, y tends to increase
Negative: As x increases, y tends to decrease
No association: No clear pattern
Cluster: Whether points fall into distinct groups or are spread evenly
Form: Linear or nonlinear
Linear: Points follow straight-line pattern
Nonlinear: Curved pattern (quadratic, exponential, etc.)
Strength: How closely points follow pattern
Strong: Points close to pattern
Moderate: Some scatter but clear pattern
Weak: Lots of scatter, vague pattern
Outliers: Points far from overall pattern
Correlation Coefficient (r)
Measures: Strength and direction of linear relationship
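Formula: $r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$, where $\bar{x}$, $\bar{y}$ are the sample means and $s_x$, $s_y$ are the sample standard deviations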
Properties:
- Range: -1 ≤ r ≤ 1
- r = 1: Perfect positive linear relationship
- r = -1: Perfect negative linear relationship
- r = 0: No linear relationship
- r > 0: Positive association
- r < 0: Negative association
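As a rough illustration of how r is computed, the sketch below averages the products of z-scores, dividing by n - 1. The function name `correlation` and the sample data are made up for this example; they are not part of the original notes.

```python
import math

def correlation(xs, ys):
    """Pearson r as the sum of z-score products divided by (n - 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has no spread")
    # One z-score product per individual (point)
    z_products = [
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

r_up = correlation([1, 2, 3, 4], [10, 20, 30, 40])    # points on a rising line
r_down = correlation([1, 2, 3, 4], [40, 30, 20, 10])  # points on a falling line
print(round(r_up, 6), round(r_down, 6))               # 1.0 -1.0
```

Points that lie exactly on a line give r = 1 or r = -1, matching the two extreme cases in the properties list.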
Interpreting |r|
|r| = 1: Perfect linear relationship
0.8 ≤ |r| < 1: Strong linear relationship
0.5 ≤ |r| < 0.8: Moderate linear relationship
0 < |r| < 0.5: Weak linear relationship
|r| = 0: No linear relationship
Note: These are rough guidelines; context matters!
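The guideline cutoffs can be written as a small helper. This is only a sketch: the name `describe_strength` is hypothetical, and placing the boundary values 0.5 and 0.8 in the higher category is an assumption, since the cutoffs themselves are rough.

```python
def describe_strength(r):
    """Label |r| using the rough cutoffs 0.5 and 0.8 (assumed boundaries)."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)
    if size == 1:
        return "perfect"
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size > 0:
        return "weak"
    return "none"

labels = [describe_strength(r) for r in (0.93, -0.65, 0.2, 0.0, -1.0)]
print(labels)  # ['strong', 'moderate', 'weak', 'none', 'perfect']
```

The sign of r is ignored here on purpose: strength depends only on |r|, while direction comes from the sign.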
Important Properties of r
1. Unitless: No units (standardized)
2. Not affected by units: Converting the units of x or y (e.g., inches to centimeters) doesn't change r
3. Not affected by which variable is x or y: Switching variables doesn't change r
4. Affected by outliers: Single outlier can dramatically change r
5. Measures linear relationship only: Can be 0 even if strong nonlinear relationship exists!
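Properties 2, 3, and 5 can be checked numerically. The sketch below uses made-up height and weight values and a hypothetical helper `pearson_r`; it is meant as a quick sanity check, not a proof.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of deviations (equivalent to the z-score form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

heights_in = [60, 62, 65, 68, 71]        # made-up heights in inches
weights_lb = [110, 120, 140, 155, 170]   # made-up weights in pounds
r_original = pearson_r(heights_in, weights_lb)

# Property 2: change units (inches -> cm, pounds -> kg); r stays the same
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]
r_converted = pearson_r(heights_cm, weights_kg)

# Property 3: swap which variable is x and which is y; r stays the same
r_swapped = pearson_r(weights_lb, heights_in)

# Property 5: y = x^2 on symmetric x-values is a perfect curve, yet r = 0
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys)

print(abs(r_original - r_converted) < 1e-9)  # True
print(abs(r_original - r_swapped) < 1e-9)    # True
print(r_curved)                              # 0.0
```

The last case is why a scatterplot is needed alongside r: the points follow y = x² exactly, but the linear measure reports no relationship.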
Example: Calculating r
Data: (1, 2), (2, 4), (3, 5), (4, 7), (5, 8)
$\bar{x} = 3$, $s_x \approx 1.58$
$\bar{y} = 5.2$, $s_y \approx 2.39$
$r \approx 0.993$ (strong positive)
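The arithmetic for this data set can be traced step by step. The sketch below uses the sums-of-deviations shortcut, r = Sxy / sqrt(Sxx · Syy), which gives the same result as the z-score form because the (n - 1) factors cancel. The variable names are chosen for this example only.

```python
import math

points = [(1, 2), (2, 4), (3, 5), (4, 7), (5, 8)]
xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)

mean_x = sum(xs) / n   # 3.0
mean_y = sum(ys) / n   # 5.2

# Sums of squared deviations and of cross-products
sxx = sum((x - mean_x) ** 2 for x in xs)                        # about 10
syy = sum((y - mean_y) ** 2 for y in ys)                        # about 22.8
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # about 15

# Sample standard deviations, for comparison with hand calculation
s_x = math.sqrt(sxx / (n - 1))   # about 1.58
s_y = math.sqrt(syy / (n - 1))   # about 2.39

r = sxy / math.sqrt(sxx * syy)
print(round(r, 3))   # 0.993
```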
In practice: Use calculator!
Calculator Method
TI-83/84:
- Enter x-values in L1, y-values in L2
- STAT → CALC → 8:LinReg(a+bx)
- r appears in the output if diagnostics are on (to turn on: 2nd 0 for CATALOG → DiagnosticOn → ENTER)
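Without a calculator, the same quantities that LinReg(a+bx) reports can be approximated in a few lines. This is a minimal sketch assuming the least-squares line y = a + bx; the function name `lin_reg` is hypothetical, and the data are the five points from the worked example.

```python
import math

def lin_reg(xs, ys):
    """Return (a, b, r) for the least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    b = sxy / sxx            # slope
    a = my - b * mx          # intercept
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

a, b, r = lin_reg([1, 2, 3, 4, 5], [2, 4, 5, 7, 8])
print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.3f}, r^2 = {r * r:.3f}")
# a = 0.700, b = 1.500, r = 0.993, r^2 = 0.987
```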
Correlation vs Causation
CRITICAL: Correlation does NOT imply causation!
r = 0.9 means:
- Strong linear relationship exists
- x and y tend to vary together
r = 0.9 does NOT by itself show that:
- x causes y
- y causes x
Possible explanations for correlation:
- x causes y
- y causes x
- Confounding variable causes both
- Coincidence
Example: Spurious Correlation
Ice cream sales and drowning deaths: r ≈ 0.9
NOT because:
- Ice cream causes drowning
- Drowning causes ice cream sales
ACTUALLY:
- Confounding variable: Summer/temperature
- Both increase in summer
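A small made-up data set shows how a confounding variable can produce a strong correlation. In this sketch, temperature drives both ice cream sales and drownings, and each variable also gets its own unrelated "noise" term. All numbers and names are invented for illustration; they are not real data.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

temps = [10, 15, 20, 25, 30, 35]       # confounder: temperature (made up)
ice_noise = [2, -3, 1, -1, 3, -2]      # part of sales unrelated to temperature
drown_noise = [1, -1, 0, 1, -1, 0]     # part of drownings unrelated to temperature

# Both variables depend on temperature, but not on each other
ice_sales = [3 * t + e for t, e in zip(temps, ice_noise)]
drownings = [t // 5 + e for t, e in zip(temps, drown_noise)]

r_sales_drownings = pearson_r(ice_sales, drownings)  # about 0.89
r_noise = pearson_r(ice_noise, drown_noise)          # about 0.09

print(round(r_sales_drownings, 2), round(r_noise, 2))
```

The two variables are strongly correlated even though neither appears in the other's formula. Once the temperature-driven part is set aside, the leftover parts are nearly uncorrelated.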
Outliers and Influential Points
Outlier: Point far from overall pattern
Effect on r:
- Can increase or decrease r
- Can change sign of r
- Single outlier can dominate
Always: Identify outliers, consider their impact
Influential point: If removed, would substantially change r or regression line
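The effect of a single influential point can be seen by computing r with and without it. The sketch below adds one invented outlier, (20, 0), to five points that follow a rising line closely; `pearson_r` is a hypothetical helper written for this example.

```python
import math

def pearson_r(points):
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

base = [(1, 2), (2, 4), (3, 5), (4, 7), (5, 8)]
with_outlier = base + [(20, 0)]    # one invented point far from the pattern

r_without = pearson_r(base)          # about 0.99
r_with = pearson_r(with_outlier)     # about -0.55: the sign flips

print(round(r_without, 2), round(r_with, 2))
```

Because removing this one point changes r from strongly positive to negative, it counts as influential under the definition above.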
When Correlation Is Inappropriate
Don't use r if:
- Relationship is nonlinear (r only measures linear!)
- Severe outliers present (distort r)
- Either variable is categorical (r requires two quantitative variables; use a different analysis)
Always plot data first! Don't rely on r alone.
Describing Associations
Template: "There is a [strength] [direction] [form] association between [x] and [y]."
Example: "There is a strong positive linear association between study hours and test scores."
Add: "With no outliers" or "With one outlier at..."
Association vs Relationship
Association: Variables tend to vary together (correlation measures the linear case)
Relationship: Generic term (could be causal or not)
Causation: x directly causes changes in y
Always distinguish!
Quick Reference
DCFS: Direction, Cluster, Form, Strength (+ outliers)
Correlation r:
- Range: -1 to 1
- Measures linear relationship only
- Unitless
- Affected by outliers
Key: Correlation ≠ Causation
Remember: Always make a scatterplot first! r alone can be misleading. Two variables with a nonlinear relationship might have r ≈ 0 yet still be strongly related!