Scatterplots and Line of Best Fit

Interpret scatterplots, correlation, and trend lines

Scatterplots and Line of Best Fit (SAT Math)

What is a Scatterplot?

Graph showing relationship between two variables

Each point represents:

x-coordinate: one variable
y-coordinate: another variable

Example: Height vs. Weight

Each point = one person
x = height
y = weight

Types of Correlation

Positive Correlation

As x increases, y increases

Pattern: Points slope upward (↗)

Examples:

Study time vs. test scores
Temperature vs. ice cream sales
Height vs. shoe size

Negative Correlation

As x increases, y decreases

Pattern: Points slope downward (↘)

Examples:

Speed vs. travel time
Price vs. quantity demanded
Outdoor temperature vs. heating costs

No Correlation

No clear pattern

Pattern: Points scattered randomly

Examples:

Shoe size vs. test scores
Height vs. favorite color

Strength of Correlation

Strong Correlation

Points cluster tightly around a line

Clear pattern
Easy to predict

Weak Correlation

Points loosely follow pattern

General trend but lots of variation
Harder to predict

Perfect Correlation

All points exactly on a line

Rare in real data
r = 1 (positive) or r = -1 (negative)

Line of Best Fit (Trend Line)

What Is It?

Line that best represents the data trend

Also called:

Regression line
Trend line
Best-fit line

Equation Form

Usually written as: $y = mx + b$

Where:

$m$ = slope (rate of change)
$b$ = y-intercept (starting value)

Or: $\hat{y} = ax + b$, where $\hat{y}$ is the predicted value of y and $a$ plays the role of the slope $m$
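
The lesson does not say how the line is found. One common method is ordinary least squares, sketched below under stated assumptions: the function name `fit_line` and the hours/scores data are made up for illustration, not taken from any SAT question.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of co-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data: hours studied and test scores (illustration only).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
m, b = fit_line(hours, scores)
print(f"y = {m:.1f}x + {b:.1f}")  # y = 4.1x + 47.7
```

On the SAT the equation is normally given, so this is background rather than a required skill.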

Interpreting Slope

Slope ( $m$ ) Meaning

Positive slope ( $m > 0$ ):

Positive correlation
For every 1 unit increase in x, y increases by $m$

Example: $y = 2x + 10$

For every 1 hour of study, score increases by 2 points

Negative slope ( $m < 0$ ):

Negative correlation
For every 1 unit increase in x, y decreases by $|m|$

Example: $y = -3x + 100$

For every 1 mph faster, travel time decreases by 3 minutes
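
To see the "per 1 unit" reading in numbers, here is a small sketch using the two example equations above. The helper name `predict` and the chosen x-values are illustrative assumptions.

```python
def predict(x, m, b):
    """Predicted y on the line y = m*x + b."""
    return m * x + b

# y = 2x + 10: moving from x = 3 to x = 4 raises y by the slope, 2.
print(predict(4, 2, 10) - predict(3, 2, 10))       # 2

# y = -3x + 100: moving from x = 10 to x = 11 lowers y by 3.
print(predict(11, -3, 100) - predict(10, -3, 100))  # -3
```

The difference equals the slope no matter which starting x is used, which is why slope is read as a constant rate of change.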

Interpreting Y-Intercept

Y-Intercept ( $b$ ) Meaning

Value of y when x = 0

Example: $y = 5x + 20$

When study time = 0, predicted score = 20

Watch out: Sometimes x = 0 doesn't make sense!

If x = year (like 2020), y-intercept is for year 0 (not useful!)

Making Predictions

Interpolation

Predicting within the data range

Generally reliable

Example: Data from x = 10 to x = 50

Predicting at x = 30 → interpolation ✓

Extrapolation

Predicting outside the data range

Less reliable - pattern may not continue!

Example: Data from x = 10 to x = 50

Predicting at x = 100 → extrapolation ⚠️
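
The check behind the two examples above can be sketched as a simple range test. The function name `prediction_kind` and the specific x-values in the list are assumptions; only the range 10 to 50 comes from the examples.

```python
def prediction_kind(x, observed_xs):
    """Classify a prediction at x relative to the observed x-values."""
    if min(observed_xs) <= x <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

# Hypothetical x-values spanning the example range 10 to 50.
observed_x = [10, 20, 30, 40, 50]
print(prediction_kind(30, observed_x))   # interpolation
print(prediction_kind(100, observed_x))  # extrapolation
```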

Outliers

What is an Outlier?

Point far from the general pattern

Effects:

Can significantly affect line of best fit
May indicate error or special case

On SAT:

Questions may ask about outliers
"Which point doesn't fit the pattern?"

Correlation vs. Causation

CRITICAL DISTINCTION!

Correlation: Variables are related
Causation: One variable CAUSES change in other

Correlation ≠ Causation!

Example:

Ice cream sales and drowning deaths are correlated
But ice cream doesn't CAUSE drowning!
Both are caused by third factor (hot weather!)

SAT Trap: Don't assume correlation means causation!

Correlation Coefficient ( $r$ )

What is $r$ ?

Number measuring strength and direction of correlation

Range: $-1 \leq r \leq 1$

$r = 1$ : Perfect positive correlation
$r = 0.8$ : Strong positive correlation
$r = 0.5$ : Moderate positive correlation
$r = 0$ : No correlation
$r = -0.5$ : Moderate negative correlation
$r = -0.8$ : Strong negative correlation
$r = -1$ : Perfect negative correlation

Interpreting $r$

Sign (+ or -): Direction

Positive = positive correlation
Negative = negative correlation

Magnitude (how close to 1): Strength

Close to 1 or -1 = strong
Close to 0 = weak
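
For readers curious where $r$ comes from, the following is a minimal sketch of the standard Pearson formula. The function name `pearson_r` and all three data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
perfect_up = [3, 5, 7, 9, 11]   # points exactly on y = 2x + 1
perfect_down = [9, 7, 5, 3, 1]  # points exactly on y = -2x + 11
noisy_up = [2, 1, 4, 3, 5]      # upward trend with some scatter

print(pearson_r(xs, perfect_up))    # 1.0
print(pearson_r(xs, perfect_down))  # -1.0
print(pearson_r(xs, noisy_up))      # 0.8
```

The sign matches the direction of the trend, and the magnitude drops below 1 as soon as the points stop lying exactly on a line.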

Residuals

What is a Residual?

Difference between actual value and predicted value

Formula: Residual = Actual - Predicted

Positive residual: Point above line (actual > predicted)
Negative residual: Point below line (actual < predicted)
Zero residual: Point exactly on line
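
The formula can be sketched in a few lines. This example reuses the line $y = 5x + 20$ from the y-intercept section; the three data points and the function name `residuals` are hypothetical.

```python
def residuals(points, m, b):
    """Residual = actual y minus predicted y, for each (x, y) point."""
    return [y - (m * x + b) for x, y in points]

# Hypothetical points compared against the line y = 5x + 20.
points = [(2, 33), (4, 40), (6, 46)]
print(residuals(points, m=5, b=20))  # [3, 0, -4]
```

The first point sits above the line, the second exactly on it, and the third below it.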

Residual Plots

Graph of residuals

Random pattern: Good fit
Clear pattern: Poor fit (need different model)

SAT Question Types

Type 1: Interpret Slope

"What does the slope represent?"

Answer: Rate of change, change in y per unit change in x

Type 2: Use Equation to Predict

"According to the line, what is y when x = 10?"

Plug in: $y = m(10) + b$

Type 3: Identify Correlation

"Which best describes the relationship?"

Look at: Direction and strength of pattern

Type 4: Find Outlier

"Which point is farthest from the trend?"

Look for: Point that doesn't fit pattern
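
One simple way to make "farthest from the trend" concrete is to compare each point's vertical distance from the line. This is a sketch under assumptions, not a formal outlier test: the points are made up, the function name `farthest_from_line` is hypothetical, and the line $y = 2x + 10$ is taken as given.

```python
def farthest_from_line(points, m, b):
    """Return the point with the largest vertical distance from y = m*x + b."""
    return max(points, key=lambda p: abs(p[1] - (m * p[0] + b)))

# Hypothetical points; four lie on or near y = 2x + 10, one does not.
points = [(1, 12), (2, 15), (3, 16), (4, 30), (5, 20)]
print(farthest_from_line(points, m=2, b=10))  # (4, 30)
```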

Type 5: Correlation vs. Causation

"Does x cause y?"

Remember: Correlation doesn't prove causation!

SAT Strategies

Read the Axes!

Always check what variables are being plotted

Look at the Pattern

Upward slope = positive, downward = negative

Use the Equation

Plug in values - don't try to eyeball!

Check Units

Slope units = (y units) per (x unit)

Remember Real-World Context

Does the answer make sense?

Common SAT Patterns

Temperature and Sales

Often positive correlation

Hot temperature → more cold drinks sold

Time and Distance

Positive correlation for travel

More time → more distance covered

Price and Demand

Negative correlation

Higher price → lower demand

Practice and Performance

Positive correlation

More practice → better performance

SAT Tips

Positive correlation: Both increase together (upward slope ↗)
Negative correlation: One increases, other decreases (downward slope ↘)
No correlation: Random scatter, no pattern
Strong correlation: Points cluster tightly around line
Weak correlation: Points loosely follow pattern
Slope ( $m$ ): Rate of change (rise/run)
Y-intercept ( $b$ ): Value when x = 0
Outlier: Point far from pattern
Interpolation: Predicting within data range (reliable)
Extrapolation: Predicting outside data range (less reliable)
Correlation ≠ Causation: Related doesn't mean one causes other!
Use the equation: Plug in values to predict
Read axes carefully: Know what x and y represent
Context matters: Does answer make real-world sense?
$r$ close to 1 or -1: Strong correlation
$r$ close to 0: Weak or no correlation

📚 Practice Problems

Problem 1 (easy)

❓ Question:

A scatterplot shows the relationship between hours studied (x-axis) and test scores (y-axis). The points show an upward trend from left to right. This indicates:

A) Negative correlation
B) Positive correlation
C) No correlation
D) Causation

Solution:

Pattern: Upward trend (↗)

Meaning: As x increases, y increases

This is positive correlation!

Check choices:

A) Negative → downward slope ✗
B) Positive → upward slope ✓
C) No correlation → random scatter ✗
D) Causation → correlation doesn't prove causation ✗

Answer: B

Why not D? A scatterplot shows correlation, but it cannot prove that studying CAUSES higher scores. Studying likely does help, but the graph alone doesn't establish it.

SAT Tip: Upward slope = positive correlation; Downward slope = negative correlation!

Problem 2 (medium)

❓ Question:

A line of best fit has equation $y = 3x + 15$ , where $x$ represents hours worked and $y$ represents earnings in dollars. What does the slope represent?

A) Total earnings
B) Earnings when hours = 0
C) Dollars earned per hour
D) Total hours worked

Solution:

Equation: $y = 3x + 15$

Slope = 3

Slope meaning: Change in y per unit change in x

In context:

x = hours worked
y = earnings (dollars)
Slope = change in dollars per hour

Slope = 3 means predicted earnings rise by $3 for each additional hour worked

Check choices:

A) Total earnings → that's $y$ , not slope ✗
B) Earnings when hours = 0 → that's y-intercept (15) ✗
C) Dollars per hour → YES! ✓
D) Total hours → that's $x$ ✗

Answer: C

Note: Y-intercept of 15 might represent a base payment or starting amount.

SAT Tip: Slope = rate of change = (y units) per (x unit)!

Problem 3 (hard)

❓ Question:

A scatterplot shows the relationship between age of a car (years) and its value (thousands of dollars). The line of best fit is $y = -2x + 30$ . According to the model, what is the predicted value of a 12-year-old car?

A) $6,000
B) $8,000
C) $54,000
D) $66,000

Solution:

Given equation: $y = -2x + 30$

Variables:

x = age (years)
y = value (thousands of dollars)

Find: Value when x = 12

Plug in x = 12:

$y = -2(12) + 30$
$y = -24 + 30$
$y = 6$

But y is in THOUSANDS of dollars!

$y = 6$ thousand = $6,000

Answer: A) $6,000

Check reasonableness:

Negative slope (-2) makes sense: car loses value as it ages ✓
Starting value (y-intercept) = 30 thousand = $30,000 (new car) ✓
Loses $2,000 per year ✓
After 12 years: 30 - 24 = 6 thousand ✓
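
The same calculation, including the unit conversion, can be sketched in code. The function name `predicted_value_dollars` is a hypothetical label; the equation and units come from the problem statement.

```python
def predicted_value_dollars(age_years):
    """Line of best fit from Problem 3: y = -2x + 30, y in thousands of dollars."""
    y_thousands = -2 * age_years + 30
    return y_thousands * 1000  # convert thousands of dollars to dollars

print(predicted_value_dollars(12))  # 6000
print(predicted_value_dollars(0))   # 30000
```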

SAT Tip: Watch the UNITS! "Thousands of dollars" means multiply by 1,000!
