Learn to identify categorical vs. quantitative data, and understand different sampling methods.
Categorical (Qualitative) Data
Quantitative (Numerical) Data
Population: entire group of individuals we want information about
A survey asks students: 'Do you prefer morning or afternoon classes?' and 'How many hours per week do you study?' Classify each variable.
The first variable (morning or afternoon) is categorical because it describes a category preference with no numerical value. The second variable (study hours) is quantitative (specifically continuous) because it represents a measurable quantity that can take any value within a range. In data collection, categorical variables describe qualities while quantitative variables measure quantities.
Sample: subset of the population we actually collect data from
Parameter: numerical summary of a population (unknown, fixed)
Statistic: numerical summary of a sample (known, varies sample to sample)
Simple Random Sample (SRS)
Stratified Random Sample
Cluster Sample
Systematic Sample
Convenience Sample (avoid for inference)
Sampling bias (selection bias): certain individuals more likely to be selected
Response bias: individuals respond untruthfully or inaccurately (for example, because of question wording or social pressure)
Non-response bias: some selected individuals don't respond
Undercoverage: some part of the population is left out of the sampling process and cannot be selected
Scenario: A school wants to estimate mean SAT score for all 1,200 seniors.
Method 1 (SRS): Generate 100 distinct random student IDs from 1 to 1200, then compute the mean score for those students.
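Method 2 (Stratified): Divide into 3 strata by gender (400 male, 500 female, 300 nonbinary). From each stratum, randomly select 33 or 34 students. Compute the mean within each stratum, then combine the three means weighted by stratum size (400/1200, 500/1200, 300/1200), because equal-size samples from unequal strata would otherwise over-represent the smaller strata.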
Method 3 (Systematic): Choose a random starting point between 1 and 12 (say, 5), then select every 12th student (1200/100 = 12): 5, 17, 29, 41, ... until 100 are selected.
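The three methods in this scenario can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the SAT scores are synthetic, the assignment of ID ranges to strata is hypothetical, and the function names (`srs_ids`, `systematic_ids`, `stratified_mean`) are made up for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N = 1200  # number of seniors in the scenario

# Synthetic data (assumption): student ID -> SAT score
scores = {sid: random.randint(800, 1600) for sid in range(1, N + 1)}

def srs_ids(n=100):
    """Method 1: simple random sample of n distinct IDs."""
    return random.sample(range(1, N + 1), n)

def systematic_ids(n=100):
    """Method 3: random start in the first interval, then every k-th ID."""
    k = N // n                    # interval: 1200 / 100 = 12
    start = random.randint(1, k)  # random start between 1 and k
    return [start + i * k for i in range(n)]

def stratified_mean(strata, per_stratum):
    """Method 2: SRS within each stratum, means weighted by stratum size."""
    total = sum(len(ids) for ids in strata.values())
    estimate = 0.0
    for name, ids in strata.items():
        chosen = random.sample(ids, per_stratum[name])
        stratum_mean = statistics.mean(scores[s] for s in chosen)
        estimate += stratum_mean * len(ids) / total
    return estimate

# Hypothetical mapping of ID ranges to the strata sizes 400, 500, 300
strata = {
    "male": list(range(1, 401)),
    "female": list(range(401, 901)),
    "nonbinary": list(range(901, 1201)),
}
per_stratum = {"male": 33, "female": 34, "nonbinary": 33}

srs_sample = srs_ids()
sys_sample = systematic_ids()

srs_mean = statistics.mean(scores[s] for s in srs_sample)
sys_mean = statistics.mean(scores[s] for s in sys_sample)
strat_mean = stratified_mean(strata, per_stratum)

print(f"SRS: {srs_mean:.1f}  systematic: {sys_mean:.1f}  stratified: {strat_mean:.1f}")
```

Each run with a different seed gives different estimates, which is the sample-to-sample variability a statistic is expected to show.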
On an FRQ prompt about study design, identify:
Common error: assuming \(\bar{x} = \mu\) just because you have a large sample. Sampling bias can produce bad estimates even with large n.
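A small simulation can make this error concrete. The sketch below uses a synthetic population (an assumption for illustration only) and compares a very large but biased sample against a small simple random sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Synthetic population of 10,000 values (assumption, illustration only)
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter

# Biased sample: large (n = 5,000), but selection depends on the value itself
biased_sample = sorted(population)[-5000:]

# Simple random sample: small (n = 100), but every individual is equally likely
srs = random.sample(population, 100)

biased_error = abs(statistics.mean(biased_sample) - mu)
srs_error = abs(statistics.mean(srs) - mu)

print(f"error with biased n=5000: {biased_error:.2f}")
print(f"error with SRS n=100:     {srs_error:.2f}")
```

Here the biased sample is fifty times larger than the SRS, yet its mean lands farther from the population mean, because a larger n reduces variability but does nothing to remove bias.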
A researcher wants to estimate the average GPA of all 10,000 students at a university. She randomly selects 250 students and calculates their mean GPA as 3.42. Identify the population, sample, parameter, and statistic.
Population: All 10,000 students at the university (the entire group of interest)
Sample: The 250 randomly selected students (the subset actually studied)
Parameter: The average (mean) GPA of all 10,000 students. This is unknown and is what the researcher wants to estimate. It is a fixed value describing the population.
Statistic: The mean GPA of 3.42 from the sample of 250 students. This is known from the data and is used to estimate the parameter.
Key distinction: A parameter describes a population (usually unknown); a statistic describes a sample (known from data) and is used to estimate the parameter.
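The distinction can also be illustrated with a short simulation. The GPAs below are synthetic (they are not the researcher's data), and in a real study the parameter could not be computed because the full population is not observed.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Synthetic GPAs for 10,000 students, clipped to the range 0.0 to 4.0 (assumption)
population = [min(4.0, max(0.0, random.gauss(3.0, 0.5))) for _ in range(10_000)]

# Parameter: one fixed number describing the whole population
parameter = statistics.mean(population)

# Statistic: computed from a sample of 250, so it changes from sample to sample
sample_means = [
    statistics.mean(random.sample(population, 250)) for _ in range(5)
]

print(f"parameter: {parameter:.3f}")
for i, m in enumerate(sample_means, start=1):
    print(f"statistic from sample {i}: {m:.3f}")
```

The parameter stays the same across runs of the loop, while each sample of 250 gives a slightly different statistic close to it.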
Explain why a census might be impractical for estimating the average lifespan of light bulbs manufactured by a company, and explain what sampling method you would use instead.
Why a census is impractical:
A census requires testing every light bulb the company produces. Measuring lifespan means running each bulb until it fails, so the manufacturer would have no product left to sell. A census is therefore both destructive and economically infeasible.
Better approach: Sampling
Use random sampling (specifically, a simple random sample or SRS) by:
This preserves most inventory, is cost-effective, and gives a reliable estimate when the sample size is adequate. Random selection guards against selection bias, so the sample tends to be representative of all bulbs produced.