Sampling Methods

Simple random, stratified, cluster, and systematic sampling

Why Sample?

Sampling allows us to study a subset of a population to make inferences about the whole population. It's practical, economical, and often the only feasible approach.

Population: All individuals/items of interest
Sample: Subset selected for study
Goal: Use sample statistics to estimate population parameters

Simple Random Sample (SRS)

Definition: Every possible group of n individuals has the same probability of being selected. As a consequence, every individual also has an equal chance of selection.

How to obtain:

  1. Assign number to each population member
  2. Use random number generator or table
  3. Select corresponding individuals

Example: Select 50 students from 500 by randomly generating 50 distinct numbers between 1 and 500 (skip any repeats).
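
A minimal Python sketch of this example, assuming students are identified by ID numbers 1 to 500. The function name and the seed are illustrative choices, not part of the original notes:

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct ID numbers drawn from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; fixing it makes the draw repeatable
    # random.sample draws without replacement, so no ID can be picked twice
    return rng.sample(range(1, population_size + 1), sample_size)

chosen_ids = simple_random_sample(500, 50, seed=1)
print(sorted(chosen_ids))
```

Because the draw is made without replacement, every set of 50 IDs is equally likely, which matches the SRS definition.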

Advantages: Unbiased selection, since every member is equally likely to be chosen
Disadvantages: Requires a complete list of the population; small subgroups may be under-represented by chance

Stratified Random Sampling

Method:

  1. Divide population into homogeneous groups (strata)
  2. Take SRS from each stratum
  3. Combine samples

When to use: Want guaranteed representation from each subgroup

Example: A school has 40% freshmen, 30% sophomores, 20% juniors, 10% seniors. For a sample of 100, randomly select 40 freshmen, 30 sophomores, 20 juniors, and 10 seniors, so each class's share of the sample matches its share of the school.
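
The allocation in this example can be sketched in Python as follows. The school roster (1000 students with made-up labels) and the function name are hypothetical, and sizing each stratum in proportion to its share is only one possible way to allocate the sample:

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """strata maps a stratum name to the list of its members."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation: the stratum's share of the sample equals
        # its share of the population (rounding can shift the total slightly).
        n_stratum = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)  # SRS within the stratum
    return sample

# Hypothetical school of 1000 students split 40/30/20/10
school = {
    "freshmen": [f"F{i}" for i in range(400)],
    "sophomores": [f"So{i}" for i in range(300)],
    "juniors": [f"J{i}" for i in range(200)],
    "seniors": [f"Se{i}" for i in range(100)],
}
picked = stratified_sample(school, 100, seed=1)
counts = {name: len(members) for name, members in picked.items()}
print(counts)
```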

Advantages: Ensures all strata are represented, often gives more precise estimates than an SRS of the same size when strata differ from one another, allows comparison between groups
Disadvantages: Requires knowledge of strata, more complex

Cluster Sampling

Method:

  1. Divide population into clusters (naturally occurring groups, each ideally as varied as the population itself)
  2. Randomly select some clusters
  3. Survey ALL members in selected clusters

When to use: Population is geographically spread out, or no complete list of individuals is available (only a list of clusters is needed)

Example: Select 5 random schools, survey all students in those 5 schools.
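
A minimal Python sketch of this example, assuming a hypothetical district of 20 schools with 30 students each. The names and sizes are invented for illustration:

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """clusters maps a cluster name to the list of its members."""
    rng = random.Random(seed)
    # Randomness applies to whole clusters, not to individuals
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Every member of each chosen cluster is surveyed
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical district: 20 schools, 30 students in each
district = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(30)]
    for s in range(1, 21)
}
surveyed = cluster_sample(district, 5, seed=1)
total_surveyed = sum(len(students) for students in surveyed.values())
print(sorted(surveyed), total_surveyed)
```

Note that the other 15 schools contribute no one to the sample, unlike a stratified design where every group contributes.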

Key difference from stratified: In stratified, sample from all groups; in cluster, select whole groups.

Advantages: Practical, economical, reduces travel costs
Disadvantages: Usually less precise than an SRS of the same size; works well only when each cluster resembles the whole population (a "mini-population")

Systematic Sampling

Method:

  1. Calculate k = N/n (population size / sample size), rounding down if the result is not a whole number
  2. Randomly select starting point (1 to k)
  3. Select every kth individual

Example: From a list of 1000 students, we want 100, so k = 1000/100 = 10. Pick a random start between 1 and 10, say 7, then select students 7, 17, 27, 37, and so on up to 997.
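
The same example as a short Python sketch, assuming list positions are numbered from 1. The function name and the optional start argument are illustrative; passing start=7 reproduces the numbers above, while leaving it out picks a random start:

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return 1-based list positions: start, start + k, start + 2k, ..."""
    k = population_size // sample_size  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k
    return list(range(start, population_size + 1, k))[:sample_size]

ids = systematic_sample(1000, 100, start=7)
print(ids[:4], "...", ids[-1])
```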

Advantages: Easy to implement, spreads sample across population
Disadvantages: Problems if list has hidden patterns or cycles
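
To see how a cycle in the list can go wrong, here is a contrived Python sketch. It assumes a hypothetical list of 364 daily records (52 full weeks, starting on a Monday) and an interval k = 7 that happens to match the weekly cycle:

```python
weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
# Hypothetical list: one record per day for 52 weeks, starting on a Monday
calendar = [weekdays[i % 7] for i in range(364)]

k = 7      # sampling interval equals the length of the weekly cycle
start = 6  # 1-based starting position, which falls on a Saturday
selected = calendar[start - 1::k]

print(len(selected), set(selected))
```

Every selected record falls on the same weekday, so the sample says nothing about the other six days. Choosing an interval that does not line up with the cycle, or shuffling the list first, avoids this.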

Comparing Methods

Use SRS when: Simplest approach, have complete list
Use Stratified when: Subgroups matter, want comparisons
Use Cluster when: Geographic spread, practical constraints
Use Systematic when: Have ordered list, want efficiency

Sampling Bias

Selection Bias: The selection process systematically favors some individuals over others
Voluntary Response: Individuals self-select (those with strong opinions respond)
Undercoverage: Some groups excluded from sampling frame
Nonresponse: Selected individuals don't participate

Avoid bias: Use random selection, ensure complete sampling frame, maximize response rate

Key Principles

Randomization reduces bias
Larger samples reduce random variation, but they do not correct bias (quality > quantity)
Representative samples crucial for valid inference
Response rate matters (a low response rate raises the risk of nonresponse bias)

Remember: Good sampling is the foundation of statistical inference. A biased sample, no matter how large, leads to invalid conclusions!
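
A small simulation can illustrate this point. The sketch below is contrived: it assumes a made-up population whose values are simply 1 to 10,000, and a sampling frame that misses the lower half (undercoverage). It compares a large sample from the flawed frame with a small SRS from the full population:

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up population: the values 1..10000 stand in for any measurement
population = list(range(1, 10001))
true_mean = mean(population)

# Flawed sampling frame: the lower half of the population is never reachable
covered = [value for value in population if value > 5000]
biased_sample = rng.sample(covered, 2000)    # large, but from the flawed frame
random_sample = rng.sample(population, 100)  # small, but a genuine SRS

biased_error = abs(mean(biased_sample) - true_mean)
random_error = abs(mean(random_sample) - true_mean)
print(f"true mean: {true_mean}")
print(f"biased n=2000 error: {biased_error:.1f}")
print(f"SRS n=100 error: {random_error:.1f}")
```

The sample of 2000 misses the true mean by roughly 2500 no matter how many more units are drawn from the same frame, while the SRS of only 100 typically lands far closer.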
