Measures of Center
Mean, median, and mode
Introduction
Measures of center describe the "typical" or "middle" value in a dataset. They help us answer: "What is a representative value?" The three main measures (mean, median, and mode) each have different properties and appropriate uses.
The Mean
Definition
Mean ($\bar{x}$): The arithmetic average
Formula:
$\bar{x} = \frac{\sum x}{n}$
Where:
- $\sum x$ = sum of all values
- $n$ = number of observations
Calculating the Mean
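Example 1: Test scores: 85, 90, 78, 92, 88
$\bar{x} = \frac{85 + 90 + 78 + 92 + 88}{5} = \frac{433}{5} = 86.6$
Mean test score = 86.6 points
Example 2: Heights (in inches): 64, 67, 65, 70, 64
$\bar{x} = \frac{64 + 67 + 65 + 70 + 64}{5} = \frac{330}{5} = 66$
Mean height = 66 inches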
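The two calculations above can be checked with a few lines of Python. This is only a minimal sketch: the function name `mean` and the variable names are illustrative choices, not something the lesson prescribes.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


test_scores = [85, 90, 78, 92, 88]   # Example 1 (points)
heights_in = [64, 67, 65, 70, 64]    # Example 2 (inches)

print(f"Mean test score: {mean(test_scores):.1f} points")  # 433 / 5
print(f"Mean height: {mean(heights_in):.1f} inches")       # 330 / 5
```

Python's standard library also provides `statistics.mean`, which could be used instead of the hand-written function.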
Properties of the Mean
Uses all data:
- Every value contributes
- Changing any value changes the mean
- The deviations from the mean always sum to 0
Balance point:
- If data were on a number line with equal weights, mean is where it would balance
- Sum of distances below mean = sum of distances above mean
Sensitive to outliers:
- Extreme values pull mean toward them
- One very high/low value can change mean substantially
Example showing outlier effect:
Without outlier: 10, 12, 11, 13, 12
$\bar{x} = \frac{58}{5} = 11.6$
With outlier: 10, 12, 11, 13, 12, 100
$\bar{x} = \frac{158}{6} \approx 26.3$
The outlier (100) dramatically increased the mean from 11.6 to about 26.3!
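To see the outlier effect numerically, the sketch below recomputes both means with `statistics.mean` from the standard library and also checks the property that deviations from the mean sum to zero. The variable names are illustrative.

```python
from math import isclose
from statistics import mean

without_outlier = [10, 12, 11, 13, 12]
with_outlier = without_outlier + [100]

mean_before = mean(without_outlier)  # 58 / 5
mean_after = mean(with_outlier)      # 158 / 6

print(f"Mean without outlier: {mean_before:.1f}")
print(f"Mean with outlier:    {mean_after:.1f}")

# Deviations from the mean cancel out, so their sum is 0
# (up to floating-point rounding error).
deviations = [x - mean_after for x in with_outlier]
print("Deviations sum to zero:", isclose(sum(deviations), 0.0, abs_tol=1e-9))
```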
When to Use the Mean
Appropriate when:
✓ Distribution is roughly symmetric
✓ No extreme outliers
✓ Need to use all data values
✓ Want mathematical properties (use in further calculations)
Not appropriate when:
❌ Distribution is heavily skewed
❌ Outliers present
❌ Want resistant measure
❌ Data is ordinal (ranked) only
The Median
Definition
Median: The middle value when data is ordered
- 50th percentile
- Splits data in half
- Half values below, half above
Finding the Median
Step 1: Order data from smallest to largest
Step 2: Find middle position
If $n$ is odd: Median = middle value
Position = $\frac{n + 1}{2}$
If $n$ is even: Median = average of two middle values
Positions = $\frac{n}{2}$ and $\frac{n}{2} + 1$
Examples
Example 1 (odd n): Scores: 78, 85, 90, 82, 88
Step 1: Order: 78, 82, 85, 88, 90
Step 2: $n = 5$ (odd), position = $\frac{5 + 1}{2} = 3$
Median = 85 (the 3rd value)
Example 2 (even n): Scores: 78, 85, 90, 82, 88, 92
Step 1: Order: 78, 82, 85, 88, 90, 92
Step 2: $n = 6$ (even), positions = 3 and 4
Step 3: Values are 85 and 88
Median = $\frac{85 + 88}{2} = 86.5$
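The sort-then-pick-the-middle procedure can be sketched in Python as follows. The function name `median` is an illustrative choice. Note that Python lists are indexed from 0, so the 1-based positions in the steps above shift down by one.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # Step 1: order the data
    n = len(ordered)
    mid = n // 2                      # Step 2: locate the middle (0-based index)
    if n % 2 == 1:
        return ordered[mid]           # odd n: 1-based position (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: positions n/2 and n/2 + 1


odd_scores = [78, 85, 90, 82, 88]
even_scores = [78, 85, 90, 82, 88, 92]

print(median(odd_scores))   # 3rd value of the ordered list
print(median(even_scores))  # average of the 3rd and 4th values
```

The standard library's `statistics.median` follows the same rule and could be used in place of this function.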
Properties of the Median
Resistant to outliers:
- Position-based, not value-based
- Extreme values don't affect it much
- More stable measure for skewed data
Example:
Data: 10, 12, 11, 13, 12 → Ordered: 10, 11, 12, 12, 13 → Median = 12
With outlier: 10, 12, 11, 13, 12, 100 → Ordered: 10, 11, 12, 12, 13, 100 → Median = $\frac{12 + 12}{2} = 12$
The outlier didn't change the median!
50-50 split:
- Half the data ≤ median
- Half the data ≥ median
- Useful for understanding data distribution
Not affected by exact values:
- Only needs order and middle position
- Works well for ordinal data (rankings)
When to Use the Median
Appropriate when:
✓ Distribution is skewed
✓ Outliers are present
✓ Want resistant measure
✓ Data is ordinal (ordered categories)
✓ Interested in "typical" individual
Examples where median is better:
- Income (right-skewed, few very high earners)
- Home prices (right-skewed, few very expensive homes)
- Reaction times (right-skewed, occasional very slow responses)
The Mode
Definition
Mode: The most frequently occurring value
- Can have one mode (unimodal)
- Can have multiple modes (bimodal, multimodal)
- Can have no mode (all values occur once)
Finding the Mode
Count frequency of each value, identify most common
Example 1: Scores: 85, 90, 85, 92, 88, 85
- 85 appears 3 times
- 90, 92, 88 each appear once
- Mode = 85
Example 2: Scores: 85, 90, 85, 92, 90, 88
- 85 appears twice
- 90 appears twice
- Modes = 85 and 90 (bimodal)
Example 3: Scores: 85, 90, 92, 88, 82
- All values appear once
- No mode
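The counting procedure can be sketched with `collections.Counter`. The function name `modes` is hypothetical, and returning an empty list when every value appears once is an assumption made here to match this lesson's "no mode" convention.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if every value appears only once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Lesson's convention: all values occur once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([85, 90, 85, 92, 88, 85]))  # Example 1: one mode
print(modes([85, 90, 85, 92, 90, 88]))  # Example 2: two modes (bimodal)
print(modes([85, 90, 92, 88, 82]))      # Example 3: no mode
```

The standard library's `statistics.multimode` uses a different convention: when all values occur once, it returns every value rather than an empty list.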
When to Use the Mode
Appropriate when:
✓ Categorical data
✓ Want most common value
✓ Describing bimodal distributions
Examples:
- "The most common car color is white" (mode of categorical data)
- "The distribution is bimodal with peaks at 65 and 72" (describing shape)
Not very useful for:
❌ Continuous numerical data (values rarely repeat)
❌ Summarizing center of distribution
Comparing Mean and Median
Relationship to Distribution Shape
Symmetric distribution:
Mean ≈ Median
Both measures give similar values; either can be used
Right-skewed distribution:
Mean > Median
Mean pulled right by high values in tail
Median more representative of "typical" value
Left-skewed distribution:
Mean < Median
Mean pulled left by low values in tail
Median more representative of "typical" value
Visual Representation
Symmetric: Mean and median at same location (center of distribution)
Right-skewed: Mean to the right of median (toward tail)
Left-skewed: Mean to the left of median (toward tail)
Choosing Between Mean and Median
Use Mean when:
- Distribution is symmetric
- No outliers or extreme skewness
- Want to use all data
- Needed for further calculations (variance, hypothesis tests)
Use Median when:
- Distribution is skewed
- Outliers are present
- Want resistant measure
- Ordinal data
- Interested in "typical" individual rather than arithmetic average
Real-world example: Income
Town income data:
- Median income: 45,000 dollars
- Mean income: 75,000 dollars
Mean is much higher because a few very wealthy residents pull it up. The median of 45,000 dollars better represents the "typical" resident's income.
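The sketch below shows how a single very high income can produce exactly this gap. The ten incomes are invented for illustration (they are not data from the example town) and were chosen so that the median is 45,000 and the mean is 75,000.

```python
from statistics import mean, median

# Hypothetical incomes in dollars: nine moderate earners and one very high earner.
incomes = [30_000, 35_000, 38_000, 42_000, 44_000,
           46_000, 48_000, 52_000, 55_000, 360_000]

median_income = median(incomes)  # average of the 5th and 6th ordered values
mean_income = mean(incomes)      # pulled upward by the 360,000 value

print(f"Median income: {median_income:,.0f} dollars")
print(f"Mean income:   {mean_income:,.0f} dollars")

# Nine of the ten residents earn less than the mean.
below_mean = sum(1 for income in incomes if income < mean_income)
print(f"Residents earning below the mean: {below_mean} of {len(incomes)}")
```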
Weighted Mean
Definition
Weighted Mean: The mean to use when values have different importance or frequency
Formula:
$\bar{x}_w = \frac{\sum (w \cdot x)}{\sum w}$
Where:
- $w$ = weight for each value
- $x$ = data value
Example: Course Grade
Your course grade is calculated as:
- Tests: 60% of grade (weight = 0.60)
- Homework: 25% of grade (weight = 0.25)
- Final: 15% of grade (weight = 0.15)
Scores:
- Test average: 85
- Homework average: 92
- Final exam: 78
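Weighted mean:
$\bar{x}_w = \frac{(0.60)(85) + (0.25)(92) + (0.15)(78)}{0.60 + 0.25 + 0.15} = \frac{51 + 23 + 11.7}{1} = 85.7$
Course grade = 85.7%
Note: You cannot simply average 85, 92, and 78, because they carry different weights!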
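The course-grade calculation can be sketched in Python as below. The function name `weighted_mean` is an illustrative choice. Dividing by the sum of the weights means the weights do not have to add up to 1.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


scores = [85, 92, 78]          # tests, homework, final
weights = [0.60, 0.25, 0.15]   # share of the course grade

course_grade = weighted_mean(scores, weights)
simple_average = sum(scores) / len(scores)  # ignores the weights

print(f"Weighted course grade: {course_grade:.1f}%")
print(f"Unweighted average:    {simple_average:.1f}%")
```

The unweighted average comes out at 85.0, lower than the weighted 85.7, because it gives the low final exam score the same influence as the tests.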
Trimmed Mean
Definition
Trimmed Mean: Mean calculated after removing extreme values
Common choice: the 5% trimmed mean (remove the lowest 5% and the highest 5% of values)
Purpose
- More resistant than regular mean
- Still uses most of data
- Compromise between mean and median
Example
Data (ordered): 10, 12, 13, 14, 15, 16, 17, 18, 19, 100
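Regular mean: $\bar{x} = \frac{234}{10} = 23.4$ (affected by outlier 100)
10% trimmed mean: Remove lowest 10% (10) and highest 10% (100)
Remaining values: 12, 13, 14, 15, 16, 17, 18, 19
Trimmed mean = $\frac{124}{8} = 15.5$
Trimmed mean (15.5) is more representative than regular mean (23.4)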
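A trimmed mean can be sketched as "sort, drop the same number of values from each end, then average". The function name `trimmed_mean` is hypothetical. Rounding the number of dropped values down is an assumption made here, and other conventions exist.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end of the sorted data."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    if not values:
        raise ValueError("trimmed_mean requires at least one value")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # values to drop per end (rounded down; an assumption)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = [10, 12, 13, 14, 15, 16, 17, 18, 19, 100]

print(f"Regular mean:     {trimmed_mean(data, 0):.1f}")     # nothing trimmed
print(f"10% trimmed mean: {trimmed_mean(data, 0.10):.1f}")  # drops 10 and 100
```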
Common Mistakes
❌ Using mean with skewed data
Use median instead!
❌ Forgetting to order data for median
Always sort first!
❌ Reporting mode for continuous data
Usually not meaningful when values don't repeat
❌ Not specifying units
Always include units (inches, dollars, points, etc.)
❌ Confusing which measure to use
Consider shape and outliers
❌ Calculating mean of percentages
May need weighted mean if groups are different sizes
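The last mistake is easy to demonstrate with numbers. The sketch below uses invented pass rates for two hypothetical classes of different sizes. Averaging the two percentages gives 75%, while the overall pass rate across all 100 students is 66%.

```python
# Hypothetical data: (number of students, pass rate) for two classes.
classes = {
    "Class A": (20, 0.90),
    "Class B": (80, 0.60),
}

# Simple average of the percentages: treats both classes as equally large.
simple_average = sum(rate for _, rate in classes.values()) / len(classes)

# Weighted mean: each pass rate is weighted by its class size.
total_students = sum(size for size, _ in classes.values())
overall_rate = sum(size * rate for size, rate in classes.values()) / total_students

print(f"Simple average of pass rates: {simple_average:.0%}")
print(f"Overall (weighted) pass rate: {overall_rate:.0%}")
```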
Quick Reference
Mean:
- Formula: $\bar{x} = \frac{\sum x}{n}$
- When: Symmetric, no outliers
- Property: Uses all data, sensitive to extremes
- Symbol: $\bar{x}$ (sample), $\mu$ (population)
Median:
- Method: Middle value when ordered
- When: Skewed, outliers present
- Property: Resistant, 50-50 split
- Symbol: M or $\tilde{x}$
Mode:
- Method: Most frequent value
- When: Categorical data, describe shape
- Property: Can have multiple or none
Relationship to shape:
- Symmetric: Mean ≈ Median
- Right-skewed: Mean > Median
- Left-skewed: Mean < Median
Remember: The best measure of center depends on the distribution's shape and the presence of outliers. When in doubt, report both mean and median!