Transformations for Linearity
Linearizing nonlinear relationships
Transformations to Achieve Linearity
Why Transform?
Problem: Many relationships are nonlinear
Solution: Transform one or both variables to make relationship linear
Benefits:
- Can use linear regression tools
- Easier interpretation
- Better predictions
When to Transform
Indicators that a transformation is needed:
- Scatterplot shows curve (not line)
- Residual plot shows pattern (not random)
- Low r² despite clear relationship
Don't transform if:
- Relationship already linear
- Residual plot looks good
Common Transformations
For y:
- log(y): Exponential growth/decay
- √y: Moderate curve
- 1/y: Inverse relationship
For x:
- log(x): Logarithmic curve
- x²: Quadratic relationship
- √x: Moderate curve
Both:
- log(y) vs log(x): Power relationship
Exponential Model
Original relationship: y = ab^x
Curved scatterplot, exponential growth/decay
Transform: Take log of y (base 10 in the examples here)
Becomes linear: log(y) = log(a) + x·log(b)
Regression: log(y) on x gives linear relationship
Example: Population growth, compound interest, radioactive decay
Example 1: Exponential Transformation
Bacteria population over time:
Original data shows exponential growth (curved)
Transform: Calculate log(population) for each time
New scatterplot: log(population) vs time is linear!
Regression: predicted log(population) = a + b(time)
Back-transform for predictions: predicted population = 10^(a + b(time))
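The steps in this example can be sketched in Python with NumPy. The data are made up for illustration (a population that starts at 100 and doubles every time unit), and base-10 logs are assumed; the variable names are not from the original notes.

```python
import numpy as np

# Hypothetical data: population starts at 100 and doubles each time unit.
time = np.arange(0, 8)                    # time units 0..7
population = 100 * 2.0 ** time

# Transform y only, then fit an ordinary least-squares line.
log_pop = np.log10(population)
slope, intercept = np.polyfit(time, log_pop, 1)

# Predict on the log scale, then back-transform with 10**(...).
t_new = 10
pred_log = intercept + slope * t_new
pred_pop = 10 ** pred_log

print(f"log10(population) = {intercept:.3f} + {slope:.3f} * time")
print(f"predicted population at t = {t_new}: {pred_pop:.0f}")
```

Because the made-up data double exactly, the fitted slope should come out close to log10(2) ≈ 0.301. Real data would scatter around the line.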
Power Model
Original relationship: y = ax^p
Curved relationship
Transform: Take log of both x and y
Becomes linear: log(y) = log(a) + p·log(x)
Regression: log(y) on log(x) gives linear relationship
Example: Area vs radius, metabolic rate vs body mass
Example 2: Power Transformation
Planet orbital period vs distance from sun:
Both variables on logarithmic scale → linear!
Regression: predicted log(period) = a + b·log(distance)
Slope b ≈ 1.5 (Kepler's third law: T² ∝ d³, so T ∝ d^1.5)
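As a rough check of the slope, here is a minimal sketch that fits the log-log model with NumPy. The distances (in AU) and periods (in years) are approximate, rounded textbook values supplied here as an assumption, not data from the original notes.

```python
import numpy as np

# Approximate values, Mercury through Neptune:
# mean distance from the sun (AU) and orbital period (Earth years).
distance = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537, 19.19, 30.07])
period = np.array([0.241, 0.615, 1.000, 1.881, 11.86, 29.46, 84.01, 164.8])

# Transform both variables, then fit a least-squares line.
slope, intercept = np.polyfit(np.log10(distance), np.log10(period), 1)

print(f"log10(period) = {intercept:.3f} + {slope:.3f} * log10(distance)")
```

With these units the intercept should land near 0 and the slope near 1.5, which corresponds to period ≈ distance^1.5.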
Square Root and Squaring
√y transformation:
- Moderate upward curve
- Spread increases as y increases
x² transformation:
- Quadratic relationship (parabola)
- Data cover only one side of the parabola (for example, x ≥ 0)
Example: Free-fall distance (d) vs time (t)
d = ½gt² suggests regressing d on t²
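A minimal sketch of the free-fall case, assuming g = 9.8 m/s² and idealized, noise-free distances at hypothetical times. Regressing d on t² should recover a slope of g/2.

```python
import numpy as np

g = 9.8                                   # m/s^2 (assumed value)
t = np.linspace(0.5, 3.0, 6)              # hypothetical times in seconds
d = 0.5 * g * t ** 2                      # idealized distances, no noise

# Transform x only: use t squared as the explanatory variable.
slope, intercept = np.polyfit(t ** 2, d, 1)

print(f"d = {intercept:.3f} + {slope:.3f} * t^2")
```

Here the slope estimates g/2 (about 4.9) and the intercept is essentially zero. Measured data would not fit this cleanly.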
Choosing the Right Transformation
Trial and error approach:
- Try transformation
- Make scatterplot of transformed data
- Check residual plot
- Check r²
- If not linear, try different transformation
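The trial-and-error loop above can be sketched as follows. The data are hypothetical (generated from the power law y = 2x^1.5), and `r_squared` is a helper defined here for illustration, not a library function.

```python
import numpy as np

def r_squared(x, y):
    """r^2 of the least-squares line of y on x (squared correlation)."""
    return np.corrcoef(x, y)[0, 1] ** 2

# Hypothetical curved data from a power law y = 2 * x**1.5.
x = np.arange(1.0, 11.0)
y = 2.0 * x ** 1.5

# Each candidate is a (transformed x, transformed y) pair.
candidates = {
    "y vs x": (x, y),
    "log(y) vs x": (x, np.log10(y)),
    "sqrt(y) vs x": (x, np.sqrt(y)),
    "log(y) vs log(x)": (np.log10(x), np.log10(y)),
}

results = {name: r_squared(tx, ty) for name, (tx, ty) in candidates.items()}
for name, r2 in results.items():
    print(f"{name:18s} r^2 = {r2:.4f}")
```

On these data the log-log candidate is essentially perfectly linear, but some of the wrong candidates still give a fairly high r². That is why the residual plot, not r² alone, should settle the choice.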
Guided approach:
- Exponential pattern → log(y)
- Power relationship → log-log
- Quadratic → x²
- Fan shape in residuals → log(y)
Interpreting Transformed Models
Log(y) on x:
Slope interpretation: "For each unit increase in x, y is multiplied by 10^b" (b = slope, base-10 log)
Example: Slope = 0.301 in log(population) vs time
"Each year, population multiplies by 10^0.301 ≈ 2"
(Population doubles each year)
Log(y) on log(x):
Slope interpretation: "A 1% increase in x is associated with approximately a b% increase in y"
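Both slope interpretations can be checked with a few lines of arithmetic. This sketch assumes base-10 logs for the log(y) on x model. The slope 0.301 is the one from the example above; the slope 1.5 is borrowed from the planetary example as an illustration.

```python
# log10(y) on x: each one-unit increase in x multiplies y by 10**b.
b_exp = 0.301
factor = 10 ** b_exp                      # close to 2, so y roughly doubles

# log(y) on log(x): a 1% increase in x multiplies y by 1.01**b.
b_pow = 1.5
pct_change = (1.01 ** b_pow - 1) * 100    # close to b_pow percent

print(f"multiplier per unit of x: {factor:.4f}")
print(f"percent change in y for a 1% change in x: {pct_change:.4f}")
```

The "b%" rule is an approximation that works well for small percentage changes in x. For large changes, compute the multiplier directly.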
Back-Transformation
After fitting model on transformed data:
Make predictions on transformed scale, then back-transform
Example: Model is predicted log(y) = a + bx
For x = 10: predicted log(y) = a + 10b, so predicted y = 10^(a + 10b)
Don't fit a line to the original data and transform its predictions afterward! Fit on the transformed scale, predict there, then back-transform.
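A minimal sketch of predict-then-back-transform, assuming base-10 logs. The coefficients a = 0.5 and b = 0.2 are hypothetical values chosen for illustration, and `predict_y` is a helper named here, not part of any library.

```python
def predict_y(a, b, x):
    """Predict y from a fitted model log10(y) = a + b*x."""
    log_y_hat = a + b * x      # prediction on the transformed (log) scale
    return 10 ** log_y_hat     # back-transform to the original scale

a, b = 0.5, 0.2                # hypothetical fitted coefficients
y_hat = predict_y(a, b, 10)    # log10(y) = 2.5 at x = 10

print(f"predicted y at x = 10: {y_hat:.1f}")
```

If natural logs were used to fit the model, the back-transform would be the exponential function (e raised to the prediction) instead of a power of 10.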
Checking the Transformation
Good transformation produces:
- Linear scatterplot
- Random residual plot
- Higher r²
- Roughly constant spread
Compare before/after:
- Original r² vs transformed r²
- Original residual plot vs transformed residual plot
Multiple Transformations
Sometimes try several:
Example: Comparing transformations for curved data
- log(y) vs x: r² = 0.85
- √y vs x: r² = 0.92
- y vs x²: r² = 0.78
Choose: √y vs x (highest r², simplest), provided its residual plot shows random scatter
Common Patterns and Transformations
| Pattern | Try |
|---------|-----|
| Exponential growth/decay | log(y) |
| Power relationship | log(y) and log(x) |
| Quadratic (parabola) | x² |
| Moderate upward curve | √y or √x |
| Spread increases with y | log(y) |
Residual Plot After Transformation
Must check! Transformation successful if:
- No pattern in residuals
- Random scatter around 0
- Constant spread
If still see pattern: Try different transformation
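The residual check can also be done numerically before plotting. This sketch uses hypothetical exponential data with small random noise; `residuals` is a helper defined here for illustration. A line fitted to the raw data leaves a curved pattern (positive, then negative, then positive), while the log-transformed fit leaves only small scatter.

```python
import numpy as np

def residuals(x, y):
    """Residuals (observed minus predicted) from the least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)

# Hypothetical exponential data with up to 5% multiplicative noise.
rng = np.random.default_rng(0)
x = np.arange(0.0, 10.0)
y = 5.0 * 1.5 ** x * rng.uniform(0.95, 1.05, size=x.size)

raw = residuals(x, y)                 # line fitted to the original data
logged = residuals(x, np.log10(y))    # line fitted after transforming y

print("signs of raw residuals:", np.sign(raw))
print("log-scale residuals:   ", np.round(logged, 3))
```

A run of same-sign residuals at the ends and the opposite sign in the middle is the numerical counterpart of a curved residual plot. Plotting the residuals against x remains the standard check.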
Linearizable vs Non-linearizable
Linearizable: Can be made linear with transformation
- Exponential: y = ab^x
- Power: y = ax^p
- Quadratic: y = a + cx² (regress y on x²; a bx term requires both x and x² as predictors)
Non-linearizable: Cannot be easily linearized
- Some periodic functions
- Complex curves
- May need nonlinear regression
Common Mistakes
❌ Not checking residual plot after transformation
❌ Back-transforming incorrectly
❌ Transforming when already linear
❌ Misinterpreting slope of transformed model
❌ Judging a transformation by r² alone (when y is transformed, r² describes a different response variable!)
Practical Considerations
Pros of transformation:
- Use simple linear methods
- Often theoretically motivated
- Can improve predictions
Cons of transformation:
- Harder to interpret
- Must back-transform for predictions
- Not all relationships linearizable
Alternative: Modern nonlinear regression (beyond AP Stats)
Example 3: Complete Transformation
Original: y vs x is curved (r² = 0.40, residuals show pattern)
Transform: Use log(y)
New: log(y) vs x is linear (r² = 0.95, random residuals)
Equation: predicted log(y) = a + bx
Interpretation: "Each unit increase in x multiplies y by 10^b"
For prediction at x = 10: predicted y = 10^(a + 10b)
Quick Reference
Exponential (y = ab^x): Use log(y) vs x
Power (y = ax^p): Use log(y) vs log(x)
Quadratic: Use y vs x²
Goal: Linear scatterplot, random residuals, high r²
Check: Always examine residual plot of transformed data
Interpret carefully: Slopes mean different things after transformation
Remember: Transform to fix nonlinearity, but always check if transformation worked! Linear models are powerful when applied to appropriately transformed data.