Measures of Center

Mean, median, and mode

Introduction

Measures of center describe the "typical" or "middle" value in a dataset. They help us answer: "What is a representative value?" The three main measures — mean, median, and mode — each have different properties and appropriate uses.

The Mean

Definition

Mean ($\bar{x}$): The arithmetic average

Formula: $\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$

Where:

  • $\sum x_i$ = sum of all values
  • $n$ = number of observations

Calculating the Mean

Example 1: Test scores: 85, 90, 78, 92, 88

$\bar{x} = \frac{85 + 90 + 78 + 92 + 88}{5} = \frac{433}{5} = 86.6$

Mean test score = 86.6 points

Example 2: Heights (in inches): 64, 67, 65, 70, 64

$\bar{x} = \frac{64 + 67 + 65 + 70 + 64}{5} = \frac{330}{5} = 66$

Mean height = 66 inches
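The two examples above can be checked with a few lines of Python. This is a minimal sketch: the helper name `arithmetic_mean` is chosen here for illustration, and the standard library's `statistics.mean` is shown only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(values):
    """Sum the values and divide by how many there are."""
    return sum(values) / len(values)


test_scores = [85, 90, 78, 92, 88]   # Example 1
heights_in = [64, 67, 65, 70, 64]    # Example 2

print(arithmetic_mean(test_scores))  # 86.6
print(arithmetic_mean(heights_in))   # 66.0
print(mean(test_scores))             # 86.6, same result from the standard library
```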

Properties of the Mean

Uses all data:

  • Every value contributes
  • Changing any value changes the mean
  • The deviations from the mean always sum to zero: $\sum (x_i - \bar{x}) = 0$

Balance point:

  • If each data point were an equal weight placed on a number line, the mean is the point where the line would balance
  • The total distance of the values below the mean equals the total distance of the values above it
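The zero-sum and balance-point properties follow directly from the definition of the mean. A short derivation, using only the symbols already defined ($x_i$, $n$, $\bar{x}$) and the fact that $\sum x_i = n\bar{x}$:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```

As a check with the heights from Example 2 (mean 66): the deviations are -2, +1, -1, +4, -2, which sum to 0. The distances below the mean total 2 + 1 + 2 = 5, and the distances above total 1 + 4 = 5.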

Sensitive to outliers:

  • Extreme values pull mean toward them
  • One very high/low value can change mean substantially

Example showing outlier effect:

Without outlier: 10, 12, 11, 13, 12
$\bar{x} = \frac{58}{5} = 11.6$

With outlier: 10, 12, 11, 13, 12, 100
$\bar{x} = \frac{158}{6} \approx 26.3$

A single outlier (100) more than doubled the mean, from 11.6 to about 26.3!

When to Use the Mean

Appropriate when:
✓ Distribution is roughly symmetric
✓ No extreme outliers
✓ Need to use all data values
✓ Want mathematical properties (use in further calculations)

Not appropriate when:
❌ Distribution is heavily skewed
❌ Outliers present
❌ Want resistant measure
❌ Data is ordinal (ranked) only

The Median

Definition

Median: The middle value when data is ordered

  • 50th percentile
  • Splits data in half
  • Half of the values lie below it, half above

Finding the Median

Step 1: Order data from smallest to largest

Step 2: Find middle position

If $n$ is odd: Median = middle value
Position = $\frac{n+1}{2}$

If $n$ is even: Median = average of two middle values
Positions = $\frac{n}{2}$ and $\frac{n}{2} + 1$

Examples

Example 1 (odd n): Scores: 78, 85, 90, 82, 88

Step 1: Order: 78, 82, 85, 88, 90
Step 2: $n = 5$ (odd), position = $\frac{5+1}{2} = 3$
Median = 85 (the 3rd value)

Example 2 (even n): Scores: 78, 85, 90, 82, 88, 92

Step 1: Order: 78, 82, 85, 88, 90, 92
Step 2: $n = 6$ (even), positions = $\frac{6}{2} = 3$ and $\frac{6}{2} + 1 = 4$
Step 3: Values are 85 and 88
Median = $\frac{85 + 88}{2} = 86.5$
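The two-step procedure (sort, then pick the middle position or average the two middle positions) can be sketched in Python as follows. The function name `median_of` is illustrative, and the sketch assumes a non-empty list of numbers. Python indexes from 0, so the 1-based positions in the text shift down by one.

```python
def median_of(values):
    """Return the median by sorting and locating the middle position(s)."""
    ordered = sorted(values)      # Step 1: order the data
    n = len(ordered)
    mid = n // 2                  # Step 2: find the middle
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2
        return ordered[mid]
    # Even n: 1-based positions n/2 and n/2 + 1 are indices mid - 1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([78, 85, 90, 82, 88]))      # 85
print(median_of([78, 85, 90, 82, 88, 92]))  # 86.5
```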

Properties of the Median

Resistant to outliers:

  • Position-based, not value-based
  • Extreme values don't affect it much
  • More stable measure for skewed data

Example: Data: 10, 12, 11, 13, 12 → ordered: 10, 11, 12, 12, 13 → Median = 12
With outlier: 10, 12, 11, 13, 12, 100 → ordered: 10, 11, 12, 12, 13, 100 → Median = $\frac{12 + 12}{2} = 12$

The outlier didn't change the median!
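To see the contrast with the mean side by side, the sketch below runs the same data through Python's standard `statistics` module. The variable names are chosen here for illustration.

```python
from statistics import mean, median

baseline = [10, 12, 11, 13, 12]
with_outlier = baseline + [100]

# The mean jumps when the outlier is added...
print(mean(baseline))                 # 11.6
print(round(mean(with_outlier), 1))   # 26.3

# ...while the median stays at 12.
print(median(baseline))               # 12
print(median(with_outlier))           # 12.0
```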

50-50 split:

  • At least half the data ≤ median
  • At least half the data ≥ median
  • Useful for understanding data distribution

Not affected by exact values:

  • Only needs order and middle position
  • Works well for ordinal data (rankings)

When to Use the Median

Appropriate when:
✓ Distribution is skewed
✓ Outliers are present
✓ Want resistant measure
✓ Data is ordinal (ordered categories)
✓ Interested in "typical" individual

Examples where median is better:

  • Income (right-skewed, few very high earners)
  • Home prices (right-skewed, few very expensive homes)
  • Reaction times (right-skewed, occasional very slow responses)

The Mode

Definition

Mode: The most frequently occurring value

  • Can have one mode (unimodal)
  • Can have multiple modes (bimodal, multimodal)
  • Can have no mode (all values occur once)

Finding the Mode

Count how often each value occurs, then identify the most frequent value(s)

Example 1: Scores: 85, 90, 85, 92, 88, 85

  • 85 appears 3 times
  • 90, 92, 88 each appear once
  • Mode = 85

Example 2: Scores: 85, 90, 85, 92, 90, 88

  • 85 appears twice
  • 90 appears twice
  • Modes = 85 and 90 (bimodal)

Example 3: Scores: 85, 90, 92, 88, 82

  • All values appear once
  • No mode
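The counting procedure behind these three examples can be sketched with `collections.Counter`. The function name `modes` is illustrative, and the sketch assumes a non-empty list. It follows the convention used in this section and reports no mode when every value occurs once.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] if every value occurs only once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # convention used in this section: no repeats means no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([85, 90, 85, 92, 88, 85]))  # [85]
print(modes([85, 90, 85, 92, 90, 88]))  # [85, 90]
print(modes([85, 90, 92, 88, 82]))      # []
```

Conventions differ here: Python's own `statistics.multimode` returns every value in the third case rather than an empty list.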

When to Use the Mode

Appropriate when:
✓ Categorical data
✓ Want most common value
✓ Describing bimodal distributions
Examples:

  • "The most common car color is white" (mode of categorical data)
  • "The distribution is bimodal with peaks at 65 and 72" (describing shape)

Not very useful for:
❌ Continuous numerical data (values rarely repeat)
❌ Summarizing center of distribution

Comparing Mean and Median

Relationship to Distribution Shape

Symmetric distribution: $\text{Mean} \approx \text{Median}$

Both measures give similar values, so either can be used

Right-skewed distribution: $\text{Mean} > \text{Median}$

Mean pulled right by high values in tail
Median more representative of "typical" value

Left-skewed distribution: $\text{Mean} < \text{Median}$

Mean pulled left by low values in tail
Median more representative of "typical" value

Visual Representation

Symmetric: Mean and median at same location (center of distribution)

Right-skewed: Mean to the right of median (toward tail)

Left-skewed: Mean to the left of median (toward tail)
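The three cases can be illustrated numerically. The datasets below are small made-up examples chosen only to show the pattern (they do not come from this lesson), and the left-skewed set is simply the mirror image of the right-skewed one.

```python
from statistics import mean, median

# Hypothetical illustrative datasets
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10, 20]    # long tail of high values
left_skewed = [21 - x for x in right_skewed]    # mirror image: long tail of low values

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(data):.2f}, median = {median(data)}")

# symmetric: mean = 3.00, median = 3
# right-skewed: mean = 5.33, median = 3
# left-skewed: mean = 15.67, median = 18
```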

Choosing Between Mean and Median

Use Mean when:

  • Distribution is symmetric
  • No outliers or extreme skewness
  • Want to use all data
  • Need it for further calculations (variance, hypothesis tests)

Use Median when:

  • Distribution is skewed
  • Outliers are present
  • Want resistant measure
  • Ordinal data
  • Interested in "typical" individual rather than arithmetic average

Real-world example: Income

Town income data:

  • Median income: 45,000 dollars
  • Mean income: 75,000 dollars

The mean is much higher because a few very wealthy residents pull it up. The median of 45,000 dollars better represents the "typical" resident's income.

Weighted Mean

Definition

Weighted Mean: A mean in which each value counts in proportion to its weight, used when values differ in importance or frequency

Formula: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$

Where:

  • $w_i$ = weight for each value
  • $x_i$ = data value

Example: Course Grade

Your course grade is calculated as:

  • Tests: 60% of grade (weight = 0.60)
  • Homework: 25% of grade (weight = 0.25)
  • Final: 15% of grade (weight = 0.15)

Scores:

  • Test average: 85
  • Homework average: 92
  • Final exam: 78

Weighted mean: $\bar{x}_w = \frac{0.60(85) + 0.25(92) + 0.15(78)}{0.60 + 0.25 + 0.15} = \frac{51 + 23 + 11.7}{1} = 85.7$

Because the weights sum to 1, the denominator has no effect here.

Course grade = 85.7%

Note: Simply averaging 85, 92, and 78 (which gives 85) would be wrong, because the components carry different weights!
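The course-grade calculation can be reproduced with a short function that implements the weighted-mean formula directly. The name `weighted_mean` is illustrative. Dividing by the sum of the weights means the function also works when the weights do not sum to 1.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    total = sum(w * x for w, x in zip(weights, values))
    return total / sum(weights)


scores = [85, 92, 78]            # tests, homework, final exam
weights = [0.60, 0.25, 0.15]

print(round(weighted_mean(scores, weights), 1))  # 85.7
print(sum(scores) / len(scores))                 # 85.0, the unweighted average
```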

Trimmed Mean

Definition

Trimmed Mean: Mean calculated after removing extreme values

A common choice is the 5% trimmed mean (remove the lowest 5% and the highest 5% of values)

Purpose

  • More resistant than regular mean
  • Still uses most of data
  • Compromise between mean and median

Example

Data (ordered): 10, 12, 13, 14, 15, 16, 17, 18, 19, 100

Regular mean: $\frac{234}{10} = 23.4$ (affected by outlier 100)

10% trimmed mean: With 10 values, 10% is one value, so remove the lowest value (10) and the highest value (100)
$\frac{12+13+14+15+16+17+18+19}{8} = \frac{124}{8} = 15.5$

The trimmed mean (15.5) is more representative than the regular mean (23.4)
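A minimal sketch of the trimming procedure is shown below. The function name `trimmed_mean` and its `percent` parameter are illustrative. The sketch assumes `percent` is below 50 and rounds the number of values to drop down to a whole number, which is one common convention but not the only one.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent` percent of the values from each end."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100        # values to drop per end (rounded down)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = [10, 12, 13, 14, 15, 16, 17, 18, 19, 100]

print(trimmed_mean(data, 0))    # 23.4, the regular mean
print(trimmed_mean(data, 10))   # 15.5, after dropping 10 and 100
```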

Common Mistakes

Using mean with skewed data
Use median instead!

Forgetting to order data for median
Always sort first!

Reporting mode for continuous data
Usually not meaningful when values don't repeat

Not specifying units
Always include units (inches, dollars, points, etc.)

Confusing which measure to use
Consider shape and outliers

Calculating mean of percentages
May need weighted mean if groups are different sizes
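To illustrate the last mistake, consider a made-up case (the numbers are hypothetical, not from this lesson): a class of 10 students with a 90% pass rate and a class of 90 students with a 60% pass rate. Averaging the two percentages ignores the difference in class size; weighting each rate by its group size gives the overall rate.

```python
# Hypothetical pass rates (percent) and class sizes
pass_rates = [90, 60]
class_sizes = [10, 90]

# Naive approach: treats both classes as equally large
simple_average = sum(pass_rates) / len(pass_rates)

# Weighted mean: each rate is weighted by the number of students it describes
overall_rate = sum(r * s for r, s in zip(pass_rates, class_sizes)) / sum(class_sizes)

print(simple_average)  # 75.0
print(overall_rate)    # 63.0
```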

Quick Reference

Mean:

  • Formula: $\bar{x} = \frac{\sum x_i}{n}$
  • When: Symmetric, no outliers
  • Property: Uses all data, sensitive to extremes
  • Symbol: $\bar{x}$ (sample), $\mu$ (population)

Median:

  • Method: Middle value when ordered
  • When: Skewed, outliers present
  • Property: Resistant, 50-50 split
  • Symbol: $M$ or $\tilde{x}$

Mode:

  • Method: Most frequent value
  • When: Categorical data, describe shape
  • Property: Can have multiple or none

Relationship to shape:

  • Symmetric: Mean ≈ Median
  • Right-skewed: Mean > Median
  • Left-skewed: Mean < Median

Remember: The best measure of center depends on the distribution's shape and the presence of outliers. When in doubt, report both mean and median!
