Describe the shape, center, spread, and outliers of a distribution using SOCS.
Always describe a distribution using S-O-C-S:
Symmetry:
Modality:
A distribution shows most values clustered near 50, with a long tail extending to the right toward 100. Describe the shape and identify where the mean is relative to the median.
This distribution is skewed to the right (positively skewed). In a right-skewed distribution the mean is pulled toward the tail (toward the higher values), so the mean is greater than the median: the extreme high values in the long tail raise the average more than they shift the middle position. This shape is common in real-world data such as incomes or home prices.
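As a quick check of this reasoning, here is a minimal sketch using a made-up data set (the values are hypothetical, chosen only to cluster near 50 with a tail reaching toward 100). It compares the mean and median using Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values near 50, a tail toward 100.
values = [45, 48, 50, 50, 51, 52, 53, 55, 70, 98]

data_mean = mean(values)      # pulled upward by 70 and 98
data_median = median(values)  # middle position, barely affected by the tail

print(f"mean = {data_mean}, median = {data_median}")
print("mean > median:", data_mean > data_median)
```

For this sample the mean is 57.2 and the median is 51.5, so the mean sits on the tail side of the median, as the explanation predicts.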
Peakedness:
Definition: observations unusually far from the rest
Impact:
Investigation: is it a genuine measurement, data entry error, or unusual case?
Mean (\(\bar{x}\)): arithmetic average
Median: middle value when ordered
Mode: most frequent value
Rule of thumb:
Range: max − min
Interquartile Range (IQR): \(Q3 - Q1\)
Variance (\(s^2\)): roughly the average squared deviation from the mean (for a sample, the sum of squared deviations divided by \(n - 1\))
Standard Deviation (\(s\)): square root of variance
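Written out for a sample of \(n\) observations \(x_1, \dots, x_n\) (sample notation is assumed here, to match the symbols \(s\) and \(\bar{x}\) used in this lesson), the last two definitions are:

```latex
\[
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
\]
```

The divisor \(n - 1\) rather than \(n\) is the usual convention for sample data.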
Data: Heights (inches) of 10 students: 62, 64, 65, 66, 67, 68, 69, 71, 72, 75
SOCS Description:
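The numbers behind a description of these heights can be checked with a short script. This is a sketch under two assumptions that the lesson does not state: quartiles are taken as the medians of the lower and upper halves of the ordered data (a common textbook convention), and outliers are flagged with the common 1.5 × IQR rule. The helper name `quartiles_by_halves` is made up for this example.

```python
from statistics import mean, median, stdev

# Heights (inches) of the 10 students from the lesson.
heights = [62, 64, 65, 66, 67, 68, 69, 71, 72, 75]

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: the middle value is left out of both halves
    when the number of observations is odd. Needs at least 2 values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return median(lower), median(upper)

# Center
center_mean = mean(heights)
center_median = median(heights)

# Spread
data_range = max(heights) - min(heights)
q1, q3 = quartiles_by_halves(heights)
iqr = q3 - q1
sd = stdev(heights)  # sample standard deviation (divides by n - 1)

# Outliers, using the assumed 1.5 * IQR fences
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < low_fence or h > high_fence]

print(f"mean = {center_mean}, median = {center_median}")
print(f"range = {data_range}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, s = {sd:.2f}")
print(f"fences = ({low_fence}, {high_fence}), outliers = {outliers}")
```

For this data the mean is 67.9 inches, the median is 67.5, the range is 13, and the IQR is 6 (Q1 = 65, Q3 = 71). The fences fall at 56 and 80, so no height is flagged. The mean and median being close is consistent with a roughly symmetric shape, although ten values are too few to judge shape with confidence.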
When comparing two distributions:
Shape: "Distribution A is symmetric while Distribution B is right-skewed."
Center: "The median for Group X is approximately _____ inches, compared to _____ inches for Group Y, so Group X tends to be taller."
Spread: "Group X has an IQR of _____, while Group Y has an IQR of _____, so Group Y is more variable."
Outliers: "Distribution A has one outlier at _____, while Distribution B has no outliers."
On free response, examiners want to see you use SOCS explicitly. Write:
Use statistics that suit the shape: median and IQR for skewed distributions, mean and standard deviation for symmetric ones.
Two datasets have the same median (both 70) but different shapes: Dataset A is symmetric, while Dataset B is skewed left. Without seeing the distributions, explain what this tells you about their means.
Dataset A (symmetric): Mean ≈ Median ≈ 70, because symmetry means the values balance equally on both sides.
Dataset B (skewed left): Mean < Median. The median is 70, but the mean is pulled toward the left tail by the extreme low values. The mean will be less than 70.
Why? In a left-skewed distribution, the tail extends toward lower values. These extreme low values pull the mean down more than they affect the median (which is just the middle position). This is common in age-at-death data or grade distributions where scores bunch up near a ceiling and a few low scores trail off below.
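To make the comparison concrete, here is a small sketch with two invented data sets (hypothetical values, not from the lesson), each constructed to have a median of 70: one symmetric and one with a tail toward low values.

```python
from statistics import mean, median

# Hypothetical data sets, both built so that the median is 70.
dataset_a = [60, 65, 70, 75, 80]  # symmetric around 70
dataset_b = [30, 55, 70, 72, 74]  # tail toward low values (left-skewed)

for name, data in [("A", dataset_a), ("B", dataset_b)]:
    print(f"Dataset {name}: median = {median(data)}, mean = {mean(data)}")
```

Dataset A's mean equals its median (70), while Dataset B's mean comes out at 60.2, below its median of 70, because the two low values drag the average down without moving the middle position.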
A histogram shows test scores for a large class with two distinct peaks: one at 70 and another at 85. Interpret this distribution and suggest what might explain it. How would you describe center and spread?
Distribution characteristics:
This is a bimodal distribution with two peaks (modes) at 70 and 85. The presence of two modes suggests two distinct groups within the class, not a single homogeneous population.
Possible explanations:
Center & Spread:
Lesson: Always look for multiple peaks; they suggest a mixture of populations.
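The sketch below illustrates why a single measure of center can mislead for a bimodal distribution. The scores are invented for illustration (clustered near 70 and 85), the bin width of 5 is an arbitrary choice, and the peak test (a bin taller than both neighbors) is a simple heuristic rather than a standard method.

```python
from collections import Counter
from statistics import mean

# Hypothetical test scores with one cluster near 70 and another near 85.
scores = [66, 68, 69, 70, 70, 70, 71, 72, 74,
          82, 84, 85, 85, 85, 86, 87, 89]

BIN_WIDTH = 5  # arbitrary choice for this sketch

# Count scores per bin, labeling each bin by its lower edge (65, 70, ...).
counts = Counter((s // BIN_WIDTH) * BIN_WIDTH for s in scores)
bins = list(range(min(counts), max(counts) + BIN_WIDTH, BIN_WIDTH))
bar_heights = [counts[b] for b in bins]  # Counter gives 0 for empty bins

# Simple heuristic: a peak is a bin taller than both of its neighbors.
padded = [0] + bar_heights + [0]
peaks = [b for i, b in enumerate(bins)
         if padded[i + 1] > padded[i] and padded[i + 1] > padded[i + 2]]

overall_mean = mean(scores)

# Text histogram
for b, h in zip(bins, bar_heights):
    print(f"{b:>3}-{b + BIN_WIDTH - 1:<3} {'#' * h}")
print("peak bins start at:", peaks)
print(f"overall mean = {overall_mean:.1f}")
```

With these values the peaks fall in the bins starting at 70 and 85, and the overall mean (about 77.2) lands in the 75-79 bin, which contains no scores at all. Describing center and spread separately for each cluster is usually more informative in such a case.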