Data Collection and Statistics

Understand data collection methods, sampling, observational studies vs experiments, and draw valid conclusions from statistical data.

Practice Problems

Problem 1 (easy)

Question:

A histogram shows the following frequencies for test scores:

  • 60-69: 3 students
  • 70-79: 8 students
  • 80-89: 12 students
  • 90-100: 7 students

How many students scored below 80?

Solution:

Step 1: Add the frequencies for intervals below 80: $3 + 8 = 11$

Answer: 11 students scored below 80.

SAT Tip: On histograms, "below 80" includes the 60-69 and 70-79 intervals. Be careful about whether the boundary value is included.
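The count can be checked with a short Python sketch. The names `frequencies` and `count_below` are illustrative and not part of the original problem; the helper assumes each interval is stored as inclusive (low, high) bounds and counts only intervals that lie entirely below the cutoff.

```python
# Frequencies from the histogram, keyed by inclusive (low, high) score interval.
frequencies = {
    (60, 69): 3,
    (70, 79): 8,
    (80, 89): 12,
    (90, 100): 7,
}


def count_below(freqs, cutoff):
    """Sum the frequencies of intervals that lie entirely below `cutoff`."""
    return sum(count for (low, high), count in freqs.items() if high < cutoff)


below_80 = count_below(frequencies, 80)
print(below_80)  # 11
```

Because a histogram only reports counts per interval, a cutoff that falls inside an interval (for example, 75) cannot be answered exactly from this data.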

Problem 2 (medium)

Question:

The mean of 6 numbers is 15. When a 7th number is added, the mean becomes 17. What is the 7th number?

Solution:

Step 1: Find the original sum: $6 \times 15 = 90$

Step 2: Find the new sum: $7 \times 17 = 119$

Step 3: The 7th number = $119 - 90 = 29$

Check: $(90 + 29) \div 7 = 119 \div 7 = 17$ ✓

Answer: The 7th number is 29.
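The same "sum before, sum after" reasoning can be written as a small Python helper. The function name `added_value` is an assumption made for this sketch; it works for any case where one value is added to a set whose mean is known.

```python
def added_value(old_count, old_mean, new_mean):
    """Return the single value that moves the mean from old_mean to new_mean."""
    old_sum = old_count * old_mean        # sum of the original values
    new_sum = (old_count + 1) * new_mean  # sum after one value is added
    return new_sum - old_sum


seventh = added_value(6, 15, 17)
print(seventh)  # 29
```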

Problem 3 (medium)

Question:

In a box plot, $Q_1 = 30$, $\text{Median} = 45$, $Q_3 = 60$, $\text{Min} = 10$, and $\text{Max} = 85$. What is the IQR, and which single value, if added, would most change the mean but least change the median?

Solution:

IQR: $Q_3 - Q_1 = 60 - 30 = 30$

Effect of adding an extreme value: Adding a very large value (e.g., 200) would:

  • Mean: Increase significantly (the mean is sensitive to extremes)
  • Median: Change very little (the median is resistant to outliers because it depends only on the middle of the ordered data)

Answer: IQR = 30. An extreme outlier (very large or very small) would most change the mean but least change the median.
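A box plot does not reveal the individual data values, so the following Python sketch uses a hypothetical data set, invented here so that its minimum, median, and maximum match the problem (10, 45, and 85). It compares how far the mean and the median move when the extreme value 200 is added.

```python
from statistics import mean, median

# Hypothetical data set (not from the problem), chosen so that
# min = 10, median = 45, and max = 85.
scores = [10, 20, 30, 35, 40, 45, 50, 55, 60, 70, 85]
with_outlier = scores + [200]

# How far each measure of center moves after the outlier is added.
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```

For this particular data set the mean moves several times farther than the median; the exact sizes of the shifts depend on the assumed values.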

Problem 4 (hard)

Question:

A researcher wants to determine if a new teaching method improves test scores. She randomly assigns 50 students to use the new method and 50 to use the traditional method. The new method group has a mean score of 82, while the traditional group has a mean of 78. Can she conclude the new method CAUSES higher scores?

Solution:

Answer: YES, with appropriate caveats.

Why: This is a randomized controlled experiment, not just an observational study.

Key features that allow a causal conclusion:

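  1. Random assignment to treatment groups, which balances confounding variables across the groups on average
  2. Control group (traditional method) for comparison
  3. Two groups of 50 students each, large enough that a few unusual students do not dominate either mean (equal sizes are convenient but not required)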

However, she should also consider:

  • Statistical significance: is the 4-point difference large enough to not be due to chance? (She'd need a p-value or confidence interval.)
  • Practical significance: is a 4-point difference meaningful in practice?
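One way to check whether a difference could plausibly be due to chance is a permutation test, sketched below in Python. This is an illustration, not the researcher's method: the problem gives only the group means, so the individual scores here are hypothetical (five per group, invented to have means of 82 and 78), and `permutation_p_value` is a name made up for this sketch.

```python
import random


def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate a one-sided p-value: the fraction of random relabelings
    whose mean difference is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            hits += 1
    return hits / trials


# Hypothetical scores, invented for illustration only.
new_method = [85, 80, 84, 79, 82]   # mean 82
traditional = [78, 75, 80, 77, 80]  # mean 78

p = permutation_p_value(new_method, traditional)
print(p)
```

A small p-value would suggest the difference is unlikely to arise from the random assignment alone. With the real study's 100 scores the result could differ from what these invented values give.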

SAT Rule:

  • Randomized experiment → CAN conclude causation
  • Observational study → can only conclude ASSOCIATION

Problem 5 (expert)

Question:

Set A has values {10, 12, 14, 16, 18} and Set B has values {12, 13, 14, 15, 16}. Without calculating, which set has the larger standard deviation? Explain.

Solution:

Set A has the larger standard deviation.

Reasoning: Both sets have the same mean:

  • Set A: $\frac{10+12+14+16+18}{5} = 14$
  • Set B: $\frac{12+13+14+15+16}{5} = 14$

Set A has values that are more spread out from 14 (ranging from 10 to 18, each value 2 units apart from the next).

Set B has values that are more tightly clustered around 14 (ranging from 12 to 16, each value only 1 unit apart).

Since standard deviation measures how far values are from the mean on average, Set A has the larger standard deviation.

Answer: Set A

SAT Tip: You don't need to calculate SD on the SAT. Just understand that wider spread = larger SD.
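Although no calculation is needed on the test, the comparison can be confirmed with Python's standard library. This sketch uses the population standard deviation (`statistics.pstdev`); treating the sets as whole populations is an assumption, and the sample version (`statistics.stdev`) gives the same ordering.

```python
from statistics import pstdev

set_a = [10, 12, 14, 16, 18]
set_b = [12, 13, 14, 15, 16]

# Population standard deviation of each set.
sd_a = pstdev(set_a)
sd_b = pstdev(set_b)

print(sd_a > sd_b)  # True
```

Each deviation from the mean in Set A is exactly twice the corresponding deviation in Set B, so Set A's standard deviation is twice as large.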