Inference for Two Sample Means (CI and Test)

Inference for Two Sample Means (CI and Test) - Complete Interactive Lesson

Part 1: Two-Sample Z-Test for Proportions

⚖️ Comparing Two Populations

Part 1 of 7 — Two-Sample Z-Test for Proportions

When to Compare Two Proportions

Use when you have two independent groups and want to test whether their population proportions differ.

Example: Is the proportion of smartphone users higher among teens than adults?

Hypotheses

$H_0: p_1 - p_2 = 0 \quad (\text{no difference})$ $H_a: p_1 - p_2 \neq 0 \quad (\text{or } > 0 \text{ or } < 0)$

Test Statistic

$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled proportion.

Conditions

Random samples from both populations
Independent groups (and each sample size is less than $10\%$ of its population)
Large counts: $n_1\hat{p}, n_1(1-\hat{p}), n_2\hat{p}, n_2(1-\hat{p}) \geq 10$

🔑 Under $H_0$ , we assume $p_1 = p_2$ , so we use the pooled proportion $\hat{p}$ for the standard error.

Two-Proportion Test 🎯

Two-Proportion Calculations 🧮

Treatment group: 45 successes out of 150. Control group: 30 successes out of 150.

1) $\hat{p}_1 = 45/150 =$ ? (Express as a decimal)

2) $\hat{p}_2 = 30/150 =$ ?

Part 2: Two-Sample T-Test for Means

📊 Two-Sample T-Test for Means

Part 2 of 7 — Comparing Two Independent Groups

Topics in This Part

Section
🎯 Hypotheses for $\mu_1 - \mu_2$
📐 The Two-Sample $t$ -Statistic

Part 3: Paired T-Test

📊 Paired T-Test

Part 3 of 7 — When Data Come in Pairs

Topics in This Part

Section
🔗 Paired vs. Two-Sample Designs
📐 The Paired $t$ -Test Procedure
✅ Conditions for Paired Data
📝 Worked Example

🔑 Key Concept: A paired $t$ -test is a one-sample $t$ -test performed on the differences within each pair.

When to Use a Paired Test

Use a paired $t$ -test when:

Part 4: Confidence Intervals for Differences

📊 Confidence Intervals for Differences

Part 4 of 7 — Estimating How Much Two Groups Differ

Topics in This Part

Section
📐 Two-Sample CI for $\mu_1 - \mu_2$
🔗 Paired CI for $\mu_d$

Part 5: Power and Sample Size

📊 Power and Sample Size

Part 5 of 7 — Detecting Real Differences

Topics in This Part

Section
⚡ What Is Power?
🎯 Type I and Type II Errors
📐 Factors Affecting Power
🧮 Sample Size Considerations

🔑 Key Concept: Power is the probability of correctly rejecting $H_0$ when $H_0$ is actually false. Higher power = better ability to detect a real effect.

Part 6: Problem-Solving Workshop

📊 Problem-Solving Workshop

Part 6 of 7 — Complete Worked Examples

Worked Example 1: Two-Sample T-Test

Problem

A fitness company wants to compare two training programs. 45 volunteers are randomly assigned: 22 to Program A, 23 to Program B. After 8 weeks, weight loss (in pounds) is recorded:

	Program A	Program B
$n$	22	23
$\bar{x}$	12.4	9.7
$s$	3.8	4.1

Part 7: Review & Applications

📊 Review & Applications

Part 7 of 7 — Comprehensive Review

Complete Formula Reference

Two-Sample T-Test for Means

Component	Formula
Standard Error	$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

3) Pooled $\hat{p} = (45+30)/(150+150) =$ ?
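
To check the practice answers, the pooled two-proportion $z$ calculation can be sketched in a few lines of Python. This is a minimal illustration using the treatment and control counts from the practice problem (45 of 150 and 30 of 150); the function name `two_proportion_z` is made up for this sketch, and the P-value shown is two-sided.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 - p2 = 0."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: combined successes over combined sample size.
    p_pool = (x1 + x2) / (n1 + n2)
    # Under H0 the standard error is built from the pooled proportion.
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Large-counts condition, checked with the pooled proportion.
    counts = [n1 * p_pool, n1 * (1 - p_pool), n2 * p_pool, n2 * (1 - p_pool)]
    large_counts_ok = all(c >= 10 for c in counts)
    # Two-sided P-value from the standard Normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return p1_hat, p2_hat, p_pool, z, p_two_sided, large_counts_ok

p1_hat, p2_hat, p_pool, z, p_value, ok = two_proportion_z(45, 150, 30, 150)
print(f"p1_hat={p1_hat:.2f}, p2_hat={p2_hat:.2f}, pooled={p_pool:.2f}")
print(f"z={z:.3f}, two-sided P={p_value:.4f}, large counts OK: {ok}")
```

For a one-sided alternative, the P-value would be half the two-sided value when the observed difference is in the direction of $H_a$ .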

🔑 Key Concept: The two-sample $t$ -test compares means from two independent groups. The key question: "Is the difference in sample means large enough to conclude the population means differ?"

The Setup

Two independent random samples:

Group 1: $n_1$ observations, sample mean $\bar{x}_1$ , sample SD $s_1$
Group 2: $n_2$ observations, sample mean $\bar{x}_2$ , sample SD $s_2$

Hypotheses

$H_0: \mu_1 - \mu_2 = 0 \quad (\text{or equivalently, } \mu_1 = \mu_2)$ $H_a: \mu_1 - \mu_2 \neq 0 \quad (\text{or } > 0 \text{ or } < 0)$

Test Statistic

$\boxed{t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}}$

Degrees of freedom: Use the calculator/technology value (Welch's approximation) or the conservative value $\text{df} = \min(n_1 - 1, n_2 - 1)$ .

⚠️ AP Tip: The AP formula sheet provides this formula. On the AP exam, use the conservative df or the calculator df — both are acceptable.

Conditions

Condition	Check
Random	Both samples are random (or randomly assigned in an experiment)
Independent	The two groups are independent of each other
Normal/Large Sample	Both populations are Normal OR both $n_1 \geq 30$ and $n_2 \geq 30$
10% Rule	$n_1 < 10\%$ of population 1 AND $n_2 < 10\%$ of population 2

Worked Example

Study: Does a new teaching method improve test scores?

Method A (traditional): $n_1 = 35$ , $\bar{x}_1 = 74.2$ , $s_1 = 8.5$
Method B (new): $n_2 = 38$ , $\bar{x}_2 = 79.6$ , $s_2 = 7.9$

Step 1 — Hypotheses: $H_0: \mu_B - \mu_A = 0$ (no difference in mean scores) $H_a: \mu_B - \mu_A > 0$ (new method produces higher mean scores)

Step 2 — Conditions:

Random: Students randomly assigned to methods ✓
Independent: Two separate groups ✓
Normal: $n_1 = 35 \geq 30$ and $n_2 = 38 \geq 30$ ✓
10%: Each group is less than 10% of all students ✓

Step 3 — Test statistic: $t = \frac{79.6 - 74.2}{\sqrt{\frac{7.9^2}{38} + \frac{8.5^2}{35}}} = \frac{5.4}{\sqrt{1.642 + 2.064}} = \frac{5.4}{\sqrt{3.706}} = \frac{5.4}{1.925} = 2.805$

df $= \min(34, 37) = 34$ (conservative) or use calculator df

Step 4 — Conclusion: $P \approx 0.004$ (one-sided). Since $P = 0.004 < 0.05 = \alpha$ , we reject $H_0$ . There is convincing evidence that the new teaching method produces higher mean test scores than the traditional method.
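
The arithmetic in Step 3 can be reproduced with a short Python sketch. It assumes only the summary statistics from the worked example ( $s_2 = 7.9$ is the value used in the Step 3 formula); `two_sample_t` is an illustrative helper name, not part of any library.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic for H0: mu1 - mu2 = 0."""
    # Standard error of the difference in sample means.
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = (mean1 - mean2) / se
    # Conservative degrees of freedom: smaller sample size minus 1.
    df_conservative = min(n1 - 1, n2 - 1)
    return se, t, df_conservative

# Method B (new) minus Method A (traditional), matching H_a: mu_B - mu_A > 0.
se, t, df = two_sample_t(79.6, 7.9, 38, 74.2, 8.5, 35)
print(f"SE={se:.3f}, t={t:.3f}, conservative df={df}")
```

The order of the groups sets the sign of $t$ , so it should match the direction stated in $H_a$ .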

Two-Sample $t$ -Test Concepts 🎯

Computing the $t$ -Statistic 🧮

Group 1: $\bar{x}_1 = 50$ , $s_1 = 10$ , $n_1 = 25$
Group 2: $\bar{x}_2 = 44$ , $s_2 = 12$ , $n_2 = 30$

1) $\bar{x}_1 - \bar{x}_2 =$

2) $s_1^2/n_1 + s_2^2/n_2 =$ (compute to one decimal)

3) Conservative df $= \min(n_1-1, n_2-1) =$

Decisions and Interpretations 🔍

Exit Quiz — Two-Sample $t$ -Test ✅

Use a paired $t$ -test when:

The same subjects are measured twice (before/after)
Subjects are matched in pairs (e.g., twins, partners)
Each observation in one group has a natural partner in the other

Use a two-sample $t$ -test when:

The two groups are independent (different subjects, no pairing)

The Procedure

Step 1: Compute the differences: $d_i = x_{1i} - x_{2i}$ for each pair

Step 2: Treat the differences as a single sample and perform a one-sample $t$ -test:

$\boxed{t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}}$

$\bar{d}$ = mean of the differences
$s_d$ = standard deviation of the differences
$n$ = number of pairs
df $= n - 1$

Hypotheses

$H_0: \mu_d = 0 \quad (\text{no difference, on average})$ $H_a: \mu_d \neq 0 \quad (\text{or } > 0 \text{ or } < 0)$

⚠️ AP Tip: Define $d$ clearly: " $d$ = After $-$ Before" or " $d$ = Treatment $-$ Control." The direction matters for one-sided tests.

Conditions

Condition	Check
Random	Pairs are randomly selected or treatments randomly assigned
Independent	Individual pairs are independent of each other (10% condition on pairs)
Normal	The differences ( $d$ ) are approximately Normal (check histogram/QQ plot of $d$ )

Worked Example

Study: Does a tutoring program improve SAT scores? 12 students take the SAT, receive tutoring, then retake it.

Student	Before	After	$d$ = After $-$ Before
1	520	560	$40$
2	480	510	$30$
3	550	540	$-10$
...	...	...	...
Summary			$\bar{d} = 28.5$ , $s_d = 22.3$

Step 1 — Hypotheses: $H_0: \mu_d = 0$ (tutoring has no effect on mean SAT scores) $H_a: \mu_d > 0$ (tutoring increases mean SAT scores) where $d$ = After $-$ Before

Step 2 — Conditions:

Random: Students randomly selected ✓
Independent: 12 students < 10% of all SAT takers ✓
Normal: Histogram of differences shows no strong skew ✓

Step 3 — Test statistic: $t = \frac{28.5 - 0}{22.3/\sqrt{12}} = \frac{28.5}{6.437} = 4.43$

df $= 12 - 1 = 11$

Step 4 — Conclusion: $P < 0.001$ (one-sided). Since $P < 0.001 < 0.05$ , we reject $H_0$ . There is convincing evidence that the tutoring program increases mean SAT scores.
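
When raw paired data are available, the same procedure (compute the differences, then run a one-sample $t$ on them) can be sketched as follows. The scores here are hypothetical values chosen for illustration; they are not the tutoring study's data, and `paired_t` is an illustrative helper name.

```python
import math
import statistics

def paired_t(before, after):
    """One-sample t statistic on the differences d = after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample SD, divides by n - 1
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1

# Hypothetical before/after scores for five subjects (illustration only).
before = [70, 65, 80, 75, 68]
after = [74, 66, 85, 75, 73]
d_bar, s_d, t, df = paired_t(before, after)
print(f"d_bar={d_bar:.1f}, s_d={s_d:.3f}, t={t:.3f}, df={df}")
```

Defining the differences as After $-$ Before inside the function keeps the direction consistent with a one-sided $H_a: \mu_d > 0$ .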

Paired vs. Two-Sample: Why It Matters

Feature	Two-Sample	Paired
Data structure	Two independent groups	Pairs of related observations
Parameter	$\mu_1 - \mu_2$	$\mu_d$
Advantage	Simpler design	Controls for variability between subjects
df	Complex (Welch)	$n_{\text{pairs}} - 1$

🔑 Key Insight: Pairing reduces variability by eliminating subject-to-subject differences, making it easier to detect a treatment effect.
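
A small numerical sketch can make this concrete. The data below are hypothetical (five subjects measured twice), and the comparison simply runs both calculations on the same numbers to show how ignoring the pairing inflates the standard error.

```python
import math
import statistics

# Hypothetical before/after scores for five subjects (illustration only).
before = [70, 65, 80, 75, 68]
after = [74, 66, 85, 75, 73]

# Two-sample approach: ignores the pairing, so subject-to-subject
# variability stays in the standard error.
se_two = math.sqrt(statistics.variance(before) / len(before)
                   + statistics.variance(after) / len(after))
t_two = (statistics.mean(after) - statistics.mean(before)) / se_two

# Paired approach: uses only the within-subject differences.
diffs = [a - b for a, b in zip(after, before)]
se_paired = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_paired = statistics.mean(diffs) / se_paired

print(f"two-sample: SE={se_two:.3f}, t={t_two:.3f}")
print(f"paired:     SE={se_paired:.3f}, t={t_paired:.3f}")
```

Both approaches see the same mean difference, but the paired standard error is much smaller here because each subject serves as their own comparison. Treating paired data as two independent samples is still the wrong procedure, not just a less powerful one.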

Paired $t$ -Test Concepts 🎯

Computing the Paired $t$ 🧮

$n = 16$ pairs, $\bar{d} = 5.0$ , $s_d = 8.0$

1) SE of $\bar{d}$ $= s_d / \sqrt{n} =$

2) $t = \bar{d} / \text{SE} =$

3) df $=$

Paired or Two-Sample? 🔍

Exit Quiz — Paired $t$ -Test ✅

🔑 Key Concept: A CI for the difference gives a range of plausible values for how much two population means (or the mean difference) differ.

Two-Sample CI for $\mu_1 - \mu_2$

$\boxed{(\bar{x}_1 - \bar{x}_2) \pm t^* \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Same conditions as the two-sample $t$ -test (Random, Independent, Normal/Large, 10%)
df: use calculator (Welch) or conservative $\min(n_1 - 1, n_2 - 1)$

Paired CI for $\mu_d$

$\boxed{\bar{d} \pm t^* \cdot \frac{s_d}{\sqrt{n}}}$

Same conditions as the paired $t$ -test (Random, Independent pairs, Normal differences)
df $= n - 1$ (where $n$ = number of pairs)

Interpretation Templates

Two-Sample: "We are [C]% confident that the true difference in mean [context] between [group 1] and [group 2] is between [lower] and [upper] [units]."

Paired: "We are [C]% confident that the true mean difference in [context] is between [lower] and [upper] [units]."

CI ↔ Test Connection

CI contains 0?	Two-sided test conclusion at $\alpha = 1 - C$
Yes	Fail to reject $H_0: \mu_1 = \mu_2$ (or $\mu_d = 0$ )
No	Reject $H_0$
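
The table's rule can be written as a tiny Python helper. This is a sketch for a two-sided test with null value 0; the function name and both intervals are hypothetical.

```python
def decision_from_ci(lower, upper, null_value=0.0):
    """Two-sided test decision implied by a C% confidence interval."""
    # If the null value is a plausible value, we cannot reject it.
    if lower <= null_value <= upper:
        return "fail to reject H0"
    return "reject H0"

print(decision_from_ci(0.8, 5.2))    # hypothetical interval without 0
print(decision_from_ci(-2.5, 4.0))   # hypothetical interval containing 0
```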

Worked Example — Two-Sample CI

Group A (old drug): $n_1 = 40$ , $\bar{x}_1 = 82$ , $s_1 = 12$
Group B (new drug): $n_2 = 45$ , $\bar{x}_2 = 88$ , $s_2 = 10$

95% CI for $\mu_B - \mu_A$ :

$\bar{x}_B - \bar{x}_A = 88 - 82 = 6$
SE $= \sqrt{10^2/45 + 12^2/40} = \sqrt{2.222 + 3.6} = \sqrt{5.822} = 2.413$
Conservative df $= \min(39, 44) = 39$ , $t^* \approx 2.023$
$6 \pm 2.023(2.413) = 6 \pm 4.881 = (1.119, 10.881)$

Interpretation: "We are 95% confident that the true difference in mean recovery scores (new $-$ old) is between 1.1 and 10.9 points. Since 0 is not in the interval, there is evidence the new drug produces higher mean scores."
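
The interval above can be reproduced with a short Python sketch. It takes $t^*$ as an input (here the lesson's table value 2.023 for conservative df $= 39$ ) rather than computing it; `two_sample_ci` is an illustrative helper name.

```python
import math

def two_sample_ci(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """CI for mu1 - mu2 from summary statistics and a critical value."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    margin = t_star * se
    return diff - margin, diff + margin

# New drug (Group B) minus old drug (Group A).
lower, upper = two_sample_ci(88, 10, 45, 82, 12, 40, t_star=2.023)
print(f"95% CI for mu_B - mu_A: ({lower:.3f}, {upper:.3f})")
```

Using the calculator (Welch) df instead would give a slightly smaller $t^*$ and therefore a slightly narrower interval.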

Worked Example — Paired CI

15 patients measured before and after treatment. $\bar{d} = 8.3$ (After $-$ Before), $s_d = 6.2$

95% CI for $\mu_d$ :

df $= 14$ , $t^* = 2.145$
$8.3 \pm 2.145 \cdot \frac{6.2}{\sqrt{15}} = 8.3 \pm 2.145(1.601) = 8.3 \pm 3.434 = (4.866, 11.734)$

Interpretation: "We are 95% confident that the true mean change in [outcome] after treatment is between 4.9 and 11.7 units."
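
The paired interval follows the same pattern with the one-sample standard error $s_d/\sqrt{n}$ . A minimal sketch, again passing the table value $t^* = 2.145$ in by hand (`paired_ci` is an illustrative name):

```python
import math

def paired_ci(d_bar, s_d, n, t_star):
    """CI for the mean difference mu_d from summary statistics."""
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

lower, upper = paired_ci(8.3, 6.2, 15, t_star=2.145)
print(f"95% CI for mu_d: ({lower:.3f}, {upper:.3f})")
```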

⚠️ AP Tip: Always define $d$ (e.g., After $-$ Before) and include context and units.

CI for Differences Concepts 🎯

Building CIs 🧮

Two-Sample: $\bar{x}_1 - \bar{x}_2 = 10$ , SE $= 4$ , $t^* = 2.0$

1) Margin of error $=$

2) Lower bound $=$

Paired: $\bar{d} = -3$ , $s_d = 6$ , $n = 36$

3) SE of $\bar{d}$ $=$

Interpretation Decisions 🔍

Exit Quiz — CIs for Differences ✅

Error Types

	$H_0$ True	$H_0$ False
Reject $H_0$	Type I Error ( $\alpha$ )	Correct! (Power)
Fail to Reject	Correct!	Type II Error ( $\beta$ )

$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$

Type I Error ( $\alpha$ )

Rejecting $H_0$ when it is true (false positive)
Probability $= \alpha$ (the significance level)
Example: Concluding a drug works when it actually does not

Type II Error ( $\beta$ )

Failing to reject $H_0$ when it is false (false negative)
Probability = $\beta$
Example: Concluding a drug does not work when it actually does

Factors That Increase Power

Factor	Direction	Effect on Power
Sample size ( $n$ )	↑	Power ↑
Significance level ( $\alpha$ )	↑	Power ↑
True effect size ( $|\mu_1 - \mu_2|$ )	↑	Power ↑
Population variability ( $\sigma$ )	↓	Power ↑

⚠️ AP Tip: You will NOT be asked to calculate power on the AP exam, but you MUST understand conceptually how each factor affects power.

Intuition for Each Factor

Larger $n$ : More data → smaller SE → easier to detect a difference $\text{SE} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad \text{→ larger } n \text{ → smaller SE → larger } |t|$

Larger $\alpha$ : Easier rejection threshold → more likely to reject (but more risk of Type I error)

Larger effect: A bigger real difference is easier to detect than a tiny one

Smaller $\sigma$ : Less noise → the signal (difference) stands out more clearly
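
These four effects can be seen numerically with a rough power approximation. The sketch below is a simplification, not the exact $t$ -based power: it assumes a one-sided two-sample $z$ -test with equal group sizes and a common known $\sigma$ , and all input values are hypothetical.

```python
from statistics import NormalDist

def approx_power(effect, sigma, n_per_group, alpha=0.05):
    """Approximate power of a one-sided two-sample z-test.

    Assumes equal group sizes and a common, known sigma.
    """
    z = NormalDist()
    se = sigma * (2 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1 - alpha)
    # Reject H0 when the z statistic exceeds z_crit; under the
    # alternative the statistic is centered at effect / se.
    return 1 - z.cdf(z_crit - effect / se)

baseline = approx_power(effect=5, sigma=10, n_per_group=30)
print(f"baseline:       {baseline:.3f}")
print(f"larger n:       {approx_power(5, 10, 60):.3f}")
print(f"larger alpha:   {approx_power(5, 10, 30, alpha=0.10):.3f}")
print(f"larger effect:  {approx_power(8, 10, 30):.3f}")
print(f"smaller sigma:  {approx_power(5, 7, 30):.3f}")
```

Each changed setting should print a power above the baseline. With an effect of 0, the function returns $\alpha$ , which matches the definition of a Type I error rate.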

The Power- $\alpha$ Tradeoff

$\alpha \downarrow \Rightarrow \text{Power} \downarrow \Rightarrow \beta \uparrow$

Decreasing $\alpha$ (e.g., from 0.05 to 0.01) reduces Type I error but increases Type II error (reduces power). The only way to reduce BOTH errors simultaneously is to increase sample size.

Sample Size Planning

Before collecting data, researchers choose $n$ to achieve desired power (typically 80% or higher):

Specify the smallest meaningful effect size
Estimate population variability ( $\sigma$ )
Choose $\alpha$ (usually 0.05)
Use a power table or software to find the required $n$
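
As a rough stand-in for a power table, the planning steps can be sketched with the standard Normal-approximation formula $n \approx 2\left(\frac{(z_{1-\alpha} + z_{\text{power}})\,\sigma}{\delta}\right)^2$ per group. This assumes a one-sided test, equal group sizes, and a known common $\sigma$ ; real software based on the $t$ distribution will give somewhat larger values. The inputs are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a one-sided two-sample z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # critical value for the test
    z_power = z.inv_cdf(power)       # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / effect) ** 2)

print(n_per_group(effect=5, sigma=10))     # smallest meaningful effect of 5
print(n_per_group(effect=2.5, sigma=10))   # halving the effect to detect
```

Halving the effect size roughly quadruples the required sample size, which is why specifying the smallest meaningful effect comes first.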

🔑 Key Insight: Larger samples are always better for power, but they cost more. Sample size planning balances statistical needs with practical constraints.

Power Concepts 🎯

Error and Power Calculations 🧮

1) If $\beta = 0.15$ , what is the power? (give as decimal)

2) If $\alpha = 0.01$ , what is the probability of a Type I error?

3) A test has power $= 0.80$ . What is $\beta$ ?

Power Factors 🔍

Exit Quiz — Power & Sample Size ✅

Test whether Program A produces greater average weight loss at $\alpha = 0.05$ .

Step 1: State Hypotheses

$H_0: \mu_A - \mu_B = 0$ $H_a: \mu_A - \mu_B > 0$

Where $\mu_A$ = true mean weight loss for Program A, $\mu_B$ = true mean weight loss for Program B.

Step 2: Check Conditions

✅ Random: Volunteers randomly assigned to groups (experiment)
✅ Normal: $n_A = 22$ and $n_B = 23$ are both below 30, so the Normal/Large Sample condition rests on the weight-loss data showing no strong skewness or outliers (none is mentioned)
✅ Independent: Groups are independent; each person is in only one program. Both groups are $< 10\%$ of all potential participants.

Step 3: Calculate

$SE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}} = \sqrt{\frac{3.8^2}{22} + \frac{4.1^2}{23}} = \sqrt{\frac{14.44}{22} + \frac{16.81}{23}}$

$= \sqrt{0.6564 + 0.7309} = \sqrt{1.3873} \approx 1.178$

$t = \frac{(\bar{x}_A - \bar{x}_B) - 0}{SE} = \frac{12.4 - 9.7}{1.178} = \frac{2.7}{1.178} \approx 2.292$

Using technology with $df \approx 43$ : $P$ -value $\approx 0.0136$

Step 4: Conclude

Since $P = 0.0136 < 0.05 = \alpha$ , we reject $H_0$ .

AP-Style Conclusion: There is convincing evidence that the true mean weight loss for Program A is greater than the true mean weight loss for Program B.
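
The "technology" df comes from Welch's approximation. The lesson does not print that formula, so the sketch below uses the standard Welch-Satterthwaite form, $\text{df} = \frac{(v_1 + v_2)^2}{v_1^2/(n_1 - 1) + v_2^2/(n_2 - 1)}$ with $v_i = s_i^2/n_i$ , applied to the Program A and Program B summary statistics. The helper name is illustrative.

```python
import math

def welch_t_and_df(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1 = sd1**2 / n1   # variance of the first sample mean
    v2 = sd2**2 / n2   # variance of the second sample mean
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, t, df

# Program A minus Program B, matching H_a: mu_A - mu_B > 0.
se, t, df = welch_t_and_df(12.4, 3.8, 22, 9.7, 4.1, 23)
print(f"SE={se:.3f}, t={t:.3f}, Welch df={df:.2f}")
print(f"conservative df={min(22 - 1, 23 - 1)}")
```

The Welch df always falls between the conservative value $\min(n_1 - 1, n_2 - 1)$ and $n_1 + n_2 - 2$ , so the conservative choice gives a slightly larger P-value.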

Worked Example 2: Paired T-Test

Problem

A researcher tests whether a meditation app reduces stress. 30 participants rate their stress (1–100) before and after 4 weeks of daily use:

	Before	After	Differences (Before − After)
$n$	30	30	30
$\bar{x}$	68.2	59.5	$\bar{x}_d = 8.7$
$s$	—	—	$s_d = 11.3$

Test whether the app reduces stress at $\alpha = 0.05$ .

Step 1: State Hypotheses

$H_0: \mu_d = 0$ $H_a: \mu_d > 0$

Where $\mu_d$ = true mean difference in stress scores (Before − After) for all users of this app.

Step 2: Check Conditions

✅ Random: 30 participants randomly selected (or assumed representative)
✅ Normal: $n = 30 \geq 30$ differences (CLT applies to the mean difference)
✅ Independent: Each participant's difference is independent of the others; $30 < 10\%$ of all potential users

⚠️ Key: We check conditions on the DIFFERENCES, not the individual scores.

Step 3: Calculate

$SE = \frac{s_d}{\sqrt{n}} = \frac{11.3}{\sqrt{30}} = \frac{11.3}{5.477} \approx 2.063$

$t = \frac{\bar{x}_d - 0}{SE} = \frac{8.7}{2.063} \approx 4.217$

$df = n - 1 = 29$ , $P$ -value $< 0.001$

Step 4: Conclude

Since $P < 0.001 < 0.05 = \alpha$ , we reject $H_0$ .

AP-Style Conclusion: There is convincing evidence that the meditation app reduces mean stress scores.

Common AP Mistakes

Mistake	Why It Costs Points
Using two-sample test when data is paired	Wrong procedure → wrong test statistic → wrong conclusion
Not defining $\mu_d$ clearly	"Mean difference" must include direction (A − B) and context
Skipping conditions	Automatic deduction on free-response
No context in conclusion	Must reference the specific variables and setting
Saying "accept $H_0$ "	Always say "fail to reject $H_0$ "
Not identifying data as paired	Look for: same subjects, before/after, matched pairs

Workshop Practice 🎯

Computation Practice 🧮

1) Two groups: $\bar{x}_1 = 45$ , $s_1 = 6$ , $n_1 = 36$ ; $\bar{x}_2 = 41$ , $s_2 = 8$ , $n_2 = 36$ . Calculate SE (round to 2 decimal places).

2) Using the values above, calculate the t-statistic (round to 2 decimal places).

3) Paired data: $\bar{x}_d = 3.2$ , $s_d = 5.0$ , . Calculate the t-statistic (round to 2 decimal places).

Procedure Selection 🔍

Exit Quiz — Problem-Solving Workshop ✅

Paired T-Test

Component	Formula
Mean Difference	$\bar{x}_d = \frac{\sum d_i}{n}$
Standard Error	$SE = \frac{s_d}{\sqrt{n}}$
Test Statistic	$t = \frac{\bar{x}_d - \mu_{d_0}}{SE}$
Confidence Interval	$\bar{x}_d \pm t^* \cdot \frac{s_d}{\sqrt{n}}$
Degrees of Freedom	$df = n - 1$

Decision Guide: Paired vs. Two-Sample

Question	Paired	Two-Sample
Same subjects measured twice?	✅	❌
Before/after design?	✅	❌
Matched pairs (twins, siblings)?	✅	❌
Two independent groups?	❌	✅
Random assignment to groups?	❌	✅
Check conditions on...	Differences $d_i$	Each sample
$df =$	$n - 1$	Technology

Conditions Summary

Two-Sample Tests/CIs

Random: Both samples from random processes (or random assignment)
Normal: $n_1 \geq 30$ and $n_2 \geq 30$ , or no strong skewness/outliers
Independent: Samples independent of each other; each $< 10\%$ of its population

Paired Tests/CIs

Random: Random sample of pairs (or randomly determine order)
Normal: $n \geq 30$ differences, or differences show no strong skewness/outliers
Independent: Individual pairs are independent; $n < 10\%$ of all pairs in population

Interpretation Templates

Hypothesis Test Conclusion

"Since $P$ -value = ___ is [less/greater] than $\alpha =$ ___, we [reject/fail to reject] $H_0$ . There [is/is not] convincing evidence that [context: what the alternative hypothesis claims]."

Confidence Interval Interpretation

"We are ___% confident that the true difference in means [context: $\mu_1 - \mu_2$ in words] is between ___ and ___."

Confidence Interval and Significance

If 0 is NOT in the CI → Reject $H_0$ (at the corresponding $\alpha$ )
If 0 IS in the CI → Fail to reject $H_0$

Key Concepts from Every Part

Part	Topic	Key Idea
1	Introduction	Two-sample vs. paired designs
2	Two-Sample T-Test	Tests for differences between independent groups
3	Paired T-Test	Tests for differences within matched pairs
4	Confidence Intervals	Estimate the true difference with a range
5	Power & Sample Size	Power = $1 - \beta$ ; larger $n$ → more power
6	Problem-Solving	Complete 4-step process with real data

Power Quick Review

$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$

Power increases when: $n \uparrow$ , $\alpha \uparrow$ , effect size $\uparrow$ , $\sigma \downarrow$

🔑 Final Tip: The AP exam tests your ability to (1) choose the right test, (2) check conditions, (3) calculate correctly, and (4) interpret in context. Practice the full 4-step process until it becomes automatic.

Comprehensive Review 🎯

Formula Application 🧮

1) Two-sample data: $\bar{x}_1 = 78$ , $\bar{x}_2 = 72$ , $SE = 2.5$ . Calculate $t$ (round to 1 decimal place).

2) Paired data: $\bar{x}_d = -4.5$ , $s_d = 6$ , . Calculate $t$ (round to 1 decimal place).

3) Using the values from #2, what is $df$ ?

Concept Connections 🔍

Final Exam — Comparing Populations ✅
