Use power, logarithmic, and exponential transformations to achieve linearity.
Problem: Scatterplot or residual plot shows curved (non-linear) pattern.
Solution: Transform one or both variables to "straighten" the relationship.
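Common growth patterns: exponential growth, y = ae^(bx), which plotting log(y) vs x straightens; and power growth, y = ax^b, which plotting log(y) vs log(x) straightens.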
A scatterplot of x vs y shows a curved exponential pattern. The residual plot for ŷ = a + bx is curved. Try plotting log(y) vs x. What pattern should you see if this transformation works?
Step 1: Understand the original problem
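Step 2: Why try log(y) vs x?
Exponential relationship: y = ae^(bx)
Take the natural log of both sides: ln(y) = ln(a) + bx
This is LINEAR in x, with slope b and intercept ln(a). (With log base 10 the plot is still linear, but the slope becomes b·log₁₀(e) ≈ 0.434b.)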
Step 3: What to look for after transformation
If log transformation is appropriate:
✓ Scatterplot of log(y) vs x should be LINEAR
✓ Residual plot should show RANDOM scatter
✓ No curved pattern in residuals
Step 4: How to check
When: Scatterplot shows exponential shape; residuals curve upward.
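Transform: Let y′ = ln(y) or y′ = log₁₀(y)
Now: y′ vs. x is approximately linear
Regression: Fit ŷ′ = a + bx
Back-transformation (for predictions): ŷ = e^(a + bx) if using natural log, or ŷ = 10^(a + bx) if using log base 10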
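When: Both variables show exponential/power growth; need a power model of the form y = ax^b
Transform: Let x′ = ln(x) and y′ = ln(y)
Now: y′ vs. x′ is linear
Back-transformation: ŷ = e^(b₀) × x^(b₁), where b₀ and b₁ are the intercept and slope of the line fitted to y′ vs. x′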
Data: Bacteria population vs. time (hours)
| Hours | Count |
|---|---|
| 0 | 100 |
| 1 | 150 |
| 2 | 225 |
| 3 | 340 |
Scatterplot shows rapid growth (exponential).
Transform: y′ = ln(Count)
| Hours | ln(Count) |
|---|---|
| 0 | 4.61 |
| 1 | 5.01 |
| 2 | 5.42 |
| 3 | 5.83 |
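Linear regression on ln(Count) vs. Hours: predicted ln(Count) ≈ 4.604 + 0.4077 × Hours
Prediction: At hour 4: predicted ln(Count) ≈ 4.604 + 0.4077(4) ≈ 6.235
Back-transform: Count ≈ e^6.235 ≈ 510 bacteria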
❌ Transforming without checking the scatterplot first
❌ Forgetting to back-transform predictions
❌ Using log base 10 and natural log inconsistently
❌ Taking the log of zero or a negative number (all values must be positive!)
Show the original scatterplot. State "The relationship appears exponential, so I used ln(y)." Show the transformed scatterplot. Report r² for the transformed data. Always back-transform final predictions.
Step 5: Interpretation After transformation:
Answer: After log transformation, the plot of log(y) vs x should show a LINEAR pattern, and residuals should be randomly scattered with no curve.
Data shows a power relationship: y = ax^b. What transformation will linearize this relationship?
Step 1: Identify the relationship Power model: y = ax^b (Example: area = πr², where b = 2)
Step 2: Apply log transformation to BOTH variables
Take log of both sides:
log(y) = log(a × x^b)
log(y) = log(a) + log(x^b)
log(y) = log(a) + b·log(x)
Step 3: Recognize linear form
Let: Y = log(y), X = log(x), A = log(a)
Then: Y = A + bX
This is LINEAR!
Step 4: How to transform
Step 5: Interpret coefficients After regression:
To predict original y, with b₀ and b₁ the intercept and slope from the regression of log(y) on log(x):
ŷ = e^(b₀) × x^(b₁) [if using natural log]
ŷ = 10^(b₀) × x^(b₁) [if using log base 10]
Example: If Ŷ = 2 + 1.5X (using log base 10), then ŷ = 10² × x^1.5 = 100x^1.5
Answer: Take log of BOTH variables. Plot log(y) vs log(x), which linearizes power relationships.
After fitting y vs x, the residual plot fans out (variance increases). You try log(y) vs x and get a better residual plot. Why does this help?
Step 1: Identify the original problem Fan-shaped residuals mean:
Step 2: Why log(y) helps with variance When y is exponential or multiplicative:
Mathematical reason: If the standard deviation of y is proportional to its mean μ, then Var(y) ∝ μ².
Then: Var(log(y)) ≈ Var(y)/μ² ≈ constant (a first-order Taylor approximation, known as the delta method)
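A small simulation can make this concrete. The sketch below is only an illustration under stated assumptions: it draws hypothetical values whose standard deviation is 10% of the mean at three different means, then compares the spread of y with the spread of ln(y). The function name `simulate`, the seed, and the sample size are all arbitrary choices.

```python
import math
import random
from statistics import stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def simulate(mean, n=10_000, rel_sd=0.10):
    """Draw n values with standard deviation rel_sd * mean, so Var(y) is proportional to mean^2."""
    return [mean * (1 + rel_sd * rng.gauss(0, 1)) for _ in range(n)]

sd_raw, sd_log = {}, {}
for mean in (10, 100, 1000):
    ys = simulate(mean)
    sd_raw[mean] = stdev(ys)                         # grows with the mean (fan shape)
    sd_log[mean] = stdev([math.log(y) for y in ys])  # stays near rel_sd

for mean in sd_raw:
    print(f"mean {mean:>4}: sd(y) = {sd_raw[mean]:8.2f}, sd(ln y) = {sd_log[mean]:.3f}")
```

On the original scale the spread grows about a hundredfold across the three groups, while on the log scale it stays close to 0.10 in every group.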
Step 3: Additional benefit
Log transformation often:
✓ Linearizes exponential relationships
✓ Stabilizes variance (fixes fan shape)
✓ Makes distribution more symmetric
✓ Reduces impact of outliers
Step 4: When to use log transformation Use log(y) when you see:
Step 5: Check after transformation After using log(y):
Answer: Log transformation stabilizes variance. When variance increases with mean (fan shape), log(y) typically has constant variance, fixing the heteroscedasticity problem.
You fit log(y) = 2 + 0.5x using natural log. Predict y when x = 10.
Step 1: Understand the model
Fitted equation: log(y) = 2 + 0.5x
This uses NATURAL LOG (ln)
Step 2: Predict log(y) for x = 10
log(y) = 2 + 0.5(10)
log(y) = 2 + 5
log(y) = 7
Step 3: Back-transform to get y
Since we used natural log (ln): ln(y) = 7
To solve for y, use exponential: y = e^7
Step 4: Calculate
y = e^7 ≈ 1,096.63
Step 5: Interpretation "When x = 10, y is predicted to be approximately 1,097."
Important notes:
Alternative form:
Original model: y = e^(2 + 0.5x) = e² × e^(0.5x) ≈ 7.39 × e^(0.5x)
When x = 10: y = 7.39 × e^5 ≈ 1,097
Answer: y = e^7 ≈ 1,097
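The same back-transformation can be written as a short Python sketch. The function name `predict_y` and its default arguments are illustrative; they simply encode the fitted line ln(y) = 2 + 0.5x from this problem.

```python
import math

def predict_y(x, intercept=2.0, slope=0.5):
    """Predict y from a line fitted to ln(y): evaluate the line, then apply exp()."""
    log_y_hat = intercept + slope * x  # prediction on the natural-log scale
    return math.exp(log_y_hat)         # back-transform to the original scale

print(round(predict_y(10), 2))  # 1096.63
```

If the model had been fitted with log base 10, the last step would use `10 ** log_y_hat` instead of `math.exp`.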
A residual plot shows both curvature AND fan shape. What transformations might you try?
Step 1: Identify TWO problems
Need transformation that fixes BOTH!
Step 2: Try log(y) vs x Often works for:
Check result:
✓ Should be linear
✓ Should have constant variance
Step 3: If log(y) doesn't work completely Try other transformations:
Step 4: Systematic approach
Step 5: Decision guide Pattern → Try transformation:
Step 6: After transformation
Must verify:
✓ Scatterplot is linear
✓ Residuals randomly scattered
✓ Constant variance (no fan)
✓ Approximately normal residuals
Answer: Try log(y) vs x first, as it often fixes both curvature (exponential) and fan shape (non-constant variance). Check residual plot; if issues remain, try other transformations like √y or log-log.