Data Collection and Statistics
Understand data collection methods, sampling, observational studies vs experiments, and draw valid conclusions from statistical data.
Practice Problems
Problem 1 (easy)
Question:
A histogram shows the following frequencies for test scores:
- 60-69: 3 students
- 70-79: 8 students
- 80-89: 12 students
- 90-100: 7 students
How many students scored below 80?
Solution:
Step 1: Add the frequencies for intervals below 80: 3 + 8 = 11
Answer: 11 students scored below 80.
SAT Tip: On histograms, "below 80" includes the 60-69 and 70-79 intervals. Be careful about whether the boundary value is included.
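The same count can be checked with a short Python sketch. The dictionary layout and the helper name `count_below` are illustrative assumptions, not part of the original problem; the frequencies are the ones given in the question.

```python
# Frequencies from the histogram in the question, keyed by (low, high) score interval.
frequencies = {
    (60, 69): 3,
    (70, 79): 8,
    (80, 89): 12,
    (90, 100): 7,
}


def count_below(cutoff, freq):
    """Sum the frequencies of every interval that lies entirely below `cutoff`."""
    return sum(count for (low, high), count in freq.items() if high < cutoff)


below_80 = count_below(80, frequencies)
total = sum(frequencies.values())
print(below_80, "of", total, "students scored below 80")
```

Testing `high < cutoff` makes the boundary rule explicit: an interval is counted only when every score in it is below the cutoff.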
Problem 2 (medium)
Question:
The mean of 6 numbers is 15. When a 7th number is added, the mean becomes 17. What is the 7th number?
Solution:
Step 1: Find the original sum: 6 × 15 = 90
Step 2: Find the new sum: 7 × 17 = 119
Step 3: The 7th number = 119 - 90 = 29
Check: (90 + 29) / 7 = 119 / 7 = 17 ✓
Answer: The 7th number is 29.
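The three steps rely on the identity sum = mean × count. A minimal Python sketch of that reasoning follows; the function name `missing_value` is a hypothetical helper chosen for illustration.

```python
def missing_value(old_mean, old_count, new_mean):
    """Return the value that moves the mean from old_mean to new_mean
    when it is added to a data set of old_count numbers."""
    old_sum = old_mean * old_count        # Step 1: original sum
    new_sum = new_mean * (old_count + 1)  # Step 2: sum after adding one value
    return new_sum - old_sum              # Step 3: the added value


seventh = missing_value(old_mean=15, old_count=6, new_mean=17)
print(seventh)

# Check: the mean of all 7 numbers should be 17.
print((15 * 6 + seventh) / 7)
```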
Problem 3 (medium)
Question:
In a box plot, , , , , . What is the IQR, and which single value, if added, would most change the mean but least change the median?
Solution:
IQR: Q3 - Q1 = 30
Effect of adding an extreme value: Adding a very large value (e.g., 200) would:
- Mean: Increase significantly (the mean is sensitive to extremes)
- Median: Change very little (the median is resistant to outliers; it depends only on the middle of the ordered data)
Answer: IQR = 30. An extreme outlier (very large or very small) would most change the mean but least change the median.
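The contrast between mean and median can be seen numerically. The sketch below uses a small hypothetical data set (not the box plot's values, which are not listed here) and adds the extreme value 200 mentioned in the solution.

```python
from statistics import mean, median

# Hypothetical data set, chosen only for illustration.
data = [10, 20, 30, 40, 50]
with_outlier = data + [200]  # add one extreme value

mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```

With these assumed numbers the mean moves far more than the median, which is the behavior the answer describes.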
Problem 4 (hard)
Question:
A researcher wants to determine if a new teaching method improves test scores. She randomly assigns 50 students to use the new method and 50 to use the traditional method. The new method group has a mean score of 82, while the traditional group has a mean of 78. Can she conclude the new method CAUSES higher scores?
Solution:
Answer: YES, with appropriate caveats.
Why: This is a randomized controlled experiment, not just an observational study.
Key features that allow a causal conclusion:
- Random assignment to treatment groups, which balances confounding variables across the groups on average
- Control group (traditional method) for comparison
- Equal group sizes (50 each), which keeps the comparison clean, although equal sizes are not required for a causal conclusion
However, she should also consider:
- Statistical significance: is the 4-point difference large enough to not be due to chance? (She'd need a p-value or confidence interval.)
- Practical significance: is a 4-point difference meaningful in practice?
SAT Rule:
- Randomized experiment → CAN conclude causation
- Observational study → can only conclude ASSOCIATION
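Random assignment, the feature that separates an experiment from an observational study, can be sketched in a few lines of Python. The student IDs, the function name `randomly_assign`, and the seed are assumptions made for illustration; the problem does not say how the researcher actually carried out the assignment.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a seed makes the sketch reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


students = list(range(1, 101))  # hypothetical IDs for the 100 students
new_method, traditional = randomly_assign(students, seed=0)
print(len(new_method), len(traditional))
```

Because chance alone decides each student's group, characteristics such as prior ability tend to even out between the groups, which is what supports a causal reading of the score difference.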
Problem 5 (expert)
Question:
Set A has values {10, 12, 14, 16, 18} and Set B has values {12, 13, 14, 15, 16}. Without calculating, which set has the larger standard deviation? Explain.
Solution:
Set A has the larger standard deviation.
Reasoning: Both sets have the same mean:
- Set A: (10 + 12 + 14 + 16 + 18) / 5 = 70 / 5 = 14
- Set B: (12 + 13 + 14 + 15 + 16) / 5 = 70 / 5 = 14
Set A has values that are more spread out from 14 (ranging from 10 to 18, each value 2 units apart from the next).
Set B has values that are more tightly clustered around 14 (ranging from 12 to 16, each value only 1 unit apart).
Since standard deviation measures how far values are from the mean on average, Set A has the larger standard deviation.
Answer: Set A
SAT Tip: You don't need to calculate SD on the SAT; just understand that wider spread = larger SD.
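Although the SAT does not require the calculation, the comparison can be confirmed with Python's standard library. This sketch uses the population standard deviation (`statistics.pstdev`); that choice is an assumption, and the sample version (`statistics.stdev`) ranks the two sets the same way.

```python
from statistics import mean, pstdev

set_a = [10, 12, 14, 16, 18]
set_b = [12, 13, 14, 15, 16]

# Both sets share the same center, so any difference in SD comes from spread alone.
print("means:", mean(set_a), mean(set_b))

sd_a = pstdev(set_a)
sd_b = pstdev(set_b)
print("SD of A:", round(sd_a, 2))
print("SD of B:", round(sd_b, 2))
```

Each value in Set A sits exactly twice as far from 14 as the matching value in Set B, so Set A's standard deviation is twice Set B's.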