Measures of Spread
Range, IQR, standard deviation, and variance
Introduction
While measures of center tell us the "typical" value, measures of spread (also called measures of variability or dispersion) tell us how spread out or variable the data is. Two datasets can have the same mean but very different spreads!
Example:
- Class A scores: 70, 72, 73, 74, 75 (Mean = 72.8, very consistent)
- Class B scores: 50, 60, 73, 80, 100 (Mean = 72.6, highly variable)
Both classes have similar means, but Class B has much more spread!
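The comparison above can be checked with a short sketch using Python's standard `statistics` module. The spread is summarized here with the sample standard deviation, which this lesson defines further on; the variable names are illustrative only.

```python
from statistics import mean, stdev

class_a = [70, 72, 73, 74, 75]
class_b = [50, 60, 73, 80, 100]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    # stdev() is the sample standard deviation (it divides by n - 1)
    print(f"{name}: mean = {mean(scores):.1f}, sample SD = {stdev(scores):.2f}")
```

The two means differ by only 0.2 points, while the standard deviation of Class B is roughly ten times that of Class A.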
Range
Definition
Range: The difference between the maximum and minimum values
Formula: $\text{Range} = \text{Max} - \text{Min}$
Calculating Range
Example 1: Test scores: 68, 75, 82, 91, 88
- Max = 91
- Min = 68
- Range = 91 - 68 = 23 points
Example 2: Temperatures (°F): 45, 52, 58, 51, 62, 48
- Max = 62
- Min = 45
- Range = 62 - 45 = 17°F
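Both examples can be reproduced with a small helper. This is a minimal sketch; `value_range` is a name chosen here for illustration, not a standard library function.

```python
def value_range(values):
    """Return max - min for a non-empty collection of numbers."""
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)

test_scores = [68, 75, 82, 91, 88]
temperatures_f = [45, 52, 58, 51, 62, 48]

print(value_range(test_scores))     # 23
print(value_range(temperatures_f))  # 17
```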
Properties of Range
Advantages:
✓ Very easy to calculate and understand
✓ Gives sense of total spread
✓ Useful for quick assessment
Disadvantages:
❌ Only uses two values (ignores all others)
❌ Extremely sensitive to outliers
❌ Doesn't tell us about distribution between min and max
❌ Tends to increase with sample size (larger samples are more likely to include extreme values)
Example of outlier sensitivity:
Without outlier: 10, 12, 13, 14, 15
Range = 15 - 10 = 5
With outlier: 10, 12, 13, 14, 15, 50
Range = 50 - 10 = 40
One outlier dramatically changed the range!
When to Use Range
Appropriate for:
- Quick, rough sense of spread
- Knowing the extreme values matters
- Quality control (acceptable range of values)
Not appropriate when:
- Outliers present
- Need precise measure of variability
- Comparing datasets of different sizes
Interquartile Range (IQR)
Definition
IQR: The range of the middle 50% of data
Formula: $\text{IQR} = Q_3 - Q_1$
Where:
- Q1 = First quartile (25th percentile)
- Q3 = Third quartile (75th percentile)
Finding Quartiles and IQR
Step 1: Order data from smallest to largest
Step 2: Find median (Q2)
Step 3: Find median of lower half = Q1
Step 4: Find median of upper half = Q3
Step 5: Calculate IQR = Q3 - Q1
Example
Data: 12, 15, 17, 19, 20, 22, 25, 28, 30, 35, 40
Step 1: Already ordered
Step 2: Median (Q2) = 22 (middle value, n=11)
Step 3: Lower half: 12, 15, 17, 19, 20
Q1 = 17 (median of lower half)
Step 4: Upper half: 25, 28, 30, 35, 40
Q3 = 30 (median of upper half)
Step 5: IQR = 30 - 17 = 13
Interpretation: The middle 50% of data spans 13 units
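The five steps can be sketched in Python as follows. The helper name `quartiles` is hypothetical, and the code assumes the median-of-halves method used in this example, where the median itself is left out of both halves when the number of values is odd. Other quartile conventions (for example, ones that interpolate between data points) can give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)          # Step 1: order the data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]
    # For odd n, skip the middle value so it belongs to neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

data = [12, 15, 17, 19, 20, 22, 25, 28, 30, 35, 40]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 17 22 30 13
```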
Properties of IQR
Advantages:
✓ Resistant to outliers (uses middle 50% only)
✓ More stable than range
✓ Useful with skewed data
✓ Basis for outlier detection (1.5 × IQR rule)
Disadvantages:
❌ Ignores 50% of data (lowest 25%, highest 25%)
❌ Less mathematically useful than standard deviation
❌ Harder to calculate than range
Using IQR to Identify Outliers
1.5 × IQR Rule:
Lower fence: $Q_1 - 1.5 \times \text{IQR}$
Upper fence: $Q_3 + 1.5 \times \text{IQR}$
Outliers: Values below lower fence or above upper fence
Example (from previous):
- Q1 = 17, Q3 = 30, IQR = 13
- Lower fence = 17 - 1.5(13) = 17 - 19.5 = -2.5
- Upper fence = 30 + 1.5(13) = 30 + 19.5 = 49.5
- Any values < -2.5 or > 49.5 are outliers
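A minimal sketch of the fence calculation follows. The function names are illustrative, and the quartiles are passed in directly rather than computed. The extra value 60 is hypothetical and is tested against the same fences purely to show what a flagged value looks like.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3, k=1.5):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(q1, q3, k)
    return [x for x in values if x < lower or x > upper]

data = [12, 15, 17, 19, 20, 22, 25, 28, 30, 35, 40]
lower, upper = iqr_fences(17, 30)
print(lower, upper)                         # -2.5 49.5
print(find_outliers(data, 17, 30))          # []
print(find_outliers(data + [60], 17, 30))   # [60]
```

None of the original eleven values lies outside the fences, so this dataset has no outliers under the rule.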
When to Use IQR
Appropriate when:
✓ Distribution is skewed
✓ Outliers are present
✓ Want resistant measure
✓ Describing boxplots
Paired with: Median (both resistant measures)
Variance and Standard Deviation
Why We Need Them
Range and IQR don't use all data values. Variance and standard deviation measure average distance from the mean using ALL data points.
Variance ($s^2$)
Definition: Average squared deviation from the mean
Formula (sample variance): $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Steps to calculate:
- Find mean ($\bar{x}$)
- Find each deviation: $x_i - \bar{x}$
- Square each deviation: $(x_i - \bar{x})^2$
- Sum squared deviations: $\sum (x_i - \bar{x})^2$
- Divide by $n - 1$
Note: We divide by $n - 1$ (not $n$) for sample variance. This is called Bessel's correction and gives a better estimate of population variance.
Standard Deviation ($s$)
Definition: Square root of variance
Formula (sample standard deviation): $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
Why take square root?
- Variance is in squared units (points², dollars²)
- Standard deviation returns to original units (points, dollars)
- More interpretable!
Example Calculation
Data: 10, 12, 14, 16, 18
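Step 1: Find mean
$\bar{x} = \frac{10 + 12 + 14 + 16 + 18}{5} = \frac{70}{5} = 14$
Step 2: Find deviations and square them

| $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|-------|-----------------|---------------------|
| 10    | -4              | 16                  |
| 12    | -2              | 4                   |
| 14    | 0               | 0                   |
| 16    | 2               | 4                   |
| 18    | 4               | 16                  |

Step 3: Sum squared deviations
$\sum (x_i - \bar{x})^2 = 16 + 4 + 0 + 4 + 16 = 40$
Step 4: Calculate variance
$s^2 = \frac{40}{5 - 1} = 10$
Step 5: Calculate standard deviation
$s = \sqrt{10} \approx 3.16$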
Interpretation: On average, values deviate from the mean by about 3.16 units.
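The same calculation can be traced step by step in Python. This is a sketch of the sample formulas (dividing by n - 1); the variable names are illustrative, and the last line cross-checks the result against the standard library's `statistics.variance` and `statistics.stdev`.

```python
import math
from statistics import stdev, variance

data = [10, 12, 14, 16, 18]
n = len(data)

x_bar = sum(data) / n                         # Step 1: mean = 14.0
deviations = [x - x_bar for x in data]        # Step 2: -4, -2, 0, 2, 4
squared = [d ** 2 for d in deviations]        #         16, 4, 0, 4, 16
sum_sq = sum(squared)                         # Step 3: 40.0
s2 = sum_sq / (n - 1)                         # Step 4: sample variance = 10.0
s = math.sqrt(s2)                             # Step 5: sample SD, about 3.16

print(f"variance = {s2:.2f}, SD = {s:.2f}")
# Cross-check against the library implementations
print(math.isclose(s2, variance(data)), math.isclose(s, stdev(data)))
```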
Properties of Standard Deviation
Interpretation:
- Typical distance from mean
- Larger SD = more spread out
- Smaller SD = more clustered around mean
- SD = 0 only when all values are identical
Properties:
- Always ≥ 0
- Same units as original data
- Sensitive to outliers (because we square deviations)
- Used in many statistical procedures
Empirical Rule (for roughly normal distributions):
- About 68% of data within 1 SD of mean
- About 95% of data within 2 SD of mean
- About 99.7% of data within 3 SD of mean
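One way to get a feel for the empirical rule is to simulate roughly normal data and count how much of it falls within 1, 2, and 3 standard deviations of the mean. In the sketch below, the mean of 100, the SD of 15, the sample size, and the seed are arbitrary assumptions chosen only for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the simulation is repeatable
sample = [random.gauss(100, 15) for _ in range(10_000)]
m, s = mean(sample), stdev(sample)

def share_within(values, center, sd, k):
    """Fraction of values within k standard deviations of the center."""
    return sum(abs(x - center) <= k * sd for x in values) / len(values)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(sample, m, s, k):.1%}")
```

The printed shares should land close to 68%, 95%, and 99.7%, with small differences due to sampling. For strongly skewed data the shares can differ noticeably.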
When to Use Standard Deviation
Appropriate when:
✓ Distribution is roughly symmetric
✓ No extreme outliers
✓ Want to use all data
✓ Need for statistical inference
✓ Describing normal distributions
Paired with: Mean (both use all data, both sensitive to outliers)
Not appropriate when:
❌ Distribution is heavily skewed
❌ Outliers present
❌ Want resistant measure
Choosing the Right Measure
Decision Framework
Distribution Shape:
Symmetric, no outliers:
- Center: Mean
- Spread: Standard deviation
- "The mean is [value] with a standard deviation of [value]"
Skewed or outliers present:
- Center: Median
- Spread: IQR
- "The median is [value] with an IQR of [value]"
Comparison Table
| Measure            | Resistant? | Uses All Data?  | Units    |
|--------------------|------------|-----------------|----------|
| Range              | No         | No (only 2)     | Original |
| IQR                | Yes        | No (middle 50%) | Original |
| Variance           | No         | Yes             | Squared  |
| Standard Deviation | No         | Yes             | Original |
Effect of Transformations
Adding/Subtracting a Constant
Adding a constant $c$ to every value:
- Range: Unchanged
- IQR: Unchanged
- SD: Unchanged
Example: Add 50 bonus points to every test score
- Original SD = 5 points
- New SD = 5 points
- Spread didn't change; only the center shifted by 50 points!
Multiplying/Dividing by a Constant
Multiplying every value by a constant $k$:
- Range: Multiplied by $|k|$
- IQR: Multiplied by $|k|$
- SD: Multiplied by $|k|$
- Variance: Multiplied by $k^2$
Example: Convert heights from inches to centimeters (multiply by 2.54)
- Original SD = 3 inches
- New SD = 3 × 2.54 = 7.62 cm
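Both transformation rules in this section can be verified numerically. The heights below are hypothetical values made up for this sketch; the shift of 50 and the factor 2.54 are the constants used in the examples above.

```python
import math
from statistics import stdev, variance

heights_in = [60, 63, 66, 69, 72]            # hypothetical heights in inches
shifted = [x + 50 for x in heights_in]       # add a constant to every value
heights_cm = [x * 2.54 for x in heights_in]  # multiply every value by a constant

# Adding a constant leaves the spread unchanged
print(math.isclose(stdev(shifted), stdev(heights_in)))               # True

# Multiplying by k scales the SD by |k| and the variance by k squared
print(math.isclose(stdev(heights_cm), 2.54 * stdev(heights_in)))     # True
print(math.isclose(variance(heights_cm), 2.54 ** 2 * variance(heights_in)))  # True
```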
Coefficient of Variation
Definition
Coefficient of Variation (CV): Ratio of standard deviation to mean
Formula: $CV = \frac{s}{\bar{x}} \times 100\%$
Purpose
Compare variability across different units or scales
Example:
- Heights: Mean = 66 inches, SD = 3 inches
  CV = (3/66) × 100% = 4.5%
- Weights: Mean = 150 lbs, SD = 20 lbs
  CV = (20/150) × 100% = 13.3%
Weights are more variable relative to their mean than heights!
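The two CV values can be reproduced with a small helper. This is a sketch; `coefficient_of_variation` is a name chosen here for illustration, and it takes the summary statistics directly because the raw heights and weights are not given.

```python
def coefficient_of_variation(sd, mean):
    """Return the CV as a percentage: (SD / mean) x 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100

cv_heights = coefficient_of_variation(3, 66)
cv_weights = coefficient_of_variation(20, 150)

print(f"Heights CV = {cv_heights:.1f}%")  # Heights CV = 4.5%
print(f"Weights CV = {cv_weights:.1f}%")  # Weights CV = 13.3%
```

Because the CV divides by the mean, it is only meaningful for data measured on a scale with a true zero and a mean that is not close to zero.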
When to Use CV
✓ Comparing datasets with different units
✓ Comparing datasets with very different means
✓ Wanting relative (not absolute) measure of spread
Common Mistakes
❌ Using SD with skewed data
Use IQR instead!
❌ Forgetting units
Range, IQR, SD all have units!
❌ Confusing variance and SD
Variance is squared units, SD is original units
❌ Dividing by $n$ instead of $n - 1$
Sample SD uses $n - 1$ (degrees of freedom)
❌ Reporting spread without center
Always report both!
❌ Comparing SDs of very different datasets
Consider CV for fair comparison
Quick Reference
Range:
- Formula: $\text{Max} - \text{Min}$
- When: Quick assessment
- Property: Sensitive to outliers
IQR:
- Formula: $Q_3 - Q_1$
- When: Skewed data, outliers
- Property: Resistant
Standard Deviation:
- Formula: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
- When: Symmetric, no outliers
- Property: Uses all data
Choosing:
- Symmetric → Mean & SD
- Skewed → Median & IQR
Outlier Rule:
- Outliers: values below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$
Remember: Spread is just as important as center! Two datasets can have the same mean but completely different spreads. Always report both center AND spread when describing data!