Bias in Sampling and Surveys

Types of bias and how to minimize them

What is Bias?

Bias: Systematic tendency to over- or under-estimate population parameter.

Key point: Bias ≠ random error. Bias is consistent, predictable deviation in one direction.

Unbiased method: On average, gives correct answer
Biased method: Systematically off, doesn't improve with larger sample
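
A small simulation can make the distinction concrete. The sketch below is illustrative only: the population mean of 50, the standard deviation of 10, and the +5 measurement offset are assumed values, not figures from any real study.

```python
import random
import statistics

def simulate(n, bias=0.0, trials=200, seed=0):
    """Return (average estimate, spread of estimates) over many samples of size n."""
    rng = random.Random(seed)
    true_mean, sd = 50.0, 10.0  # hypothetical population parameters
    estimates = []
    for _ in range(trials):
        # 'bias' shifts every measurement in the same direction
        sample = [rng.gauss(true_mean, sd) + bias for _ in range(n)]
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates), statistics.pstdev(estimates)

for n in (10, 100, 1000):
    unbiased_avg, unbiased_spread = simulate(n)
    biased_avg, biased_spread = simulate(n, bias=5.0)
    print(n, round(unbiased_avg, 2), round(unbiased_spread, 2),
          round(biased_avg, 2), round(biased_spread, 2))
```

As n grows, the spread of the estimates shrinks for both methods (random error falls), but the biased method stays centered about 5 units away from the true mean.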

Types of Sampling Bias

1. Selection Bias

Definition: Some members of population systematically more/less likely to be selected.

Causes:

  • Non-random sampling method
  • Convenience sampling
  • Judgment/purposive sampling

Examples:

  • Survey only people at shopping mall (excludes non-shoppers)
  • Online poll (excludes those without internet)
  • Call only landlines (excludes cell-phone-only households)

Result: Sample not representative of population

Solution: Use random sampling methods
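
To illustrate the shopping mall example, the sketch below compares a convenience sample with a simple random sample. Everything in it is hypothetical: the 30% share of mall visitors, the spending figures, and the variable names are assumptions chosen only to show the effect.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of (visits_mall, monthly_spending) pairs
population = []
for _ in range(10_000):
    visits_mall = rng.random() < 0.3
    spending = rng.gauss(400 if visits_mall else 200, 50)
    population.append((visits_mall, spending))

true_mean = statistics.fmean(s for _, s in population)

# Convenience sample: only people who can be found at the mall
mall_goers = [p for p in population if p[0]]
convenience = rng.sample(mall_goers, 500)

# Simple random sample: every member has the same chance of selection
srs = rng.sample(population, 500)

convenience_mean = statistics.fmean(s for _, s in convenience)
srs_mean = statistics.fmean(s for _, s in srs)

print("true mean:       ", round(true_mean, 1))
print("convenience mean:", round(convenience_mean, 1))
print("random mean:     ", round(srs_mean, 1))
```

Both samples have 500 people, but only the random sample lands near the population mean; the mall sample overstates spending because non-shoppers never had a chance to be selected.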

2. Undercoverage

Definition: Some groups in population left out of sampling frame.

Sampling frame: List from which sample is drawn

Examples:

  • Phone directory excludes unlisted numbers
  • Email list excludes those without email
  • Voter registration list excludes unregistered voters

Result: Missing groups lead to biased estimates

Solution: Use complete, up-to-date sampling frame that covers entire population

3. Voluntary Response Bias

Definition: Sample consists of individuals who choose themselves to participate, rather than being selected by the researcher.

Characteristics:

  • Self-selection
  • Those with strong opinions more likely to respond
  • Usually overrepresents extreme views

Examples:

  • Online polls where anyone can vote
  • Call-in surveys
  • Mail-back questionnaires (without follow-up)
  • Social media polls

Result: Respondents not representative (tend to have stronger, more extreme opinions)

Solution: Use probability sampling where researcher selects participants
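
The overrepresentation of extreme views can be sketched with a short simulation. The opinion distribution and the response probabilities below are assumed values for illustration, not estimates from real polls.

```python
import random

rng = random.Random(2)

# Hypothetical opinions on a 1-5 scale; most people sit near the middle
opinions = rng.choices([1, 2, 3, 4, 5], weights=[10, 20, 40, 20, 10], k=20_000)

# Assumed chance of volunteering a response: much higher for strong opinions
respond_prob = {1: 0.30, 2: 0.05, 3: 0.02, 4: 0.05, 5: 0.30}
respondents = [o for o in opinions if rng.random() < respond_prob[o]]

def extreme_share(values):
    """Fraction of values at either end of the scale (1 or 5)."""
    return sum(v in (1, 5) for v in values) / len(values)

print("extreme share in population: ", round(extreme_share(opinions), 2))
print("extreme share in respondents:", round(extreme_share(respondents), 2))
```

Under these assumptions, people at the ends of the scale make up a minority of the population but a majority of the self-selected respondents.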

4. Nonresponse Bias

Definition: Selected individuals don't respond, and non-respondents differ from respondents.

Types:

  • Unit nonresponse: Entire survey not completed
  • Item nonresponse: Specific questions skipped

Examples:

  • Mail survey with 20% response rate
  • Phone survey where people don't answer
  • Web survey where people start but don't finish

Result: If non-respondents differ systematically from respondents, estimates are biased

Solutions:

  • Follow up with non-respondents
  • Make survey convenient/appealing
  • Keep it short
  • Offer incentives (if appropriate)
  • Compare respondent characteristics to population
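
Comparing respondents to the population can be as simple as checking group shares against known benchmarks. In this sketch the age groups, the benchmark shares, and the function name `representation_gaps` are all hypothetical.

```python
from collections import Counter

# Hypothetical population benchmarks (e.g. from a census) and respondent data
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
respondent_ages = ["18-34"] * 40 + ["35-54"] * 120 + ["55+"] * 240

def representation_gaps(respondents, benchmarks):
    """Respondent share minus population share for each group."""
    counts = Counter(respondents)
    n = len(respondents)
    return {group: counts.get(group, 0) / n - share
            for group, share in benchmarks.items()}

gaps = representation_gaps(respondent_ages, population_share)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large negative gap (here, the youngest group) flags a group that responded less often than its population share would suggest, which is a warning sign for nonresponse bias on any question where that group answers differently.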

Response Bias

Definition: Responses are systematically incorrect due to how question is asked or answered.

1. Question Wording Bias

Loaded/leading questions suggest a particular answer:

  • "Don't you agree that...?"
  • "Like most Americans, do you support...?"

Emotionally charged language:

  • "Should innocent babies be protected?" vs "Should abortion be legal?"

Solution: Use neutral, clear language

2. Question Order Bias

Earlier questions influence later responses

Example:

  • Q1: "How satisfied are you with the president?"
  • Q2: "How satisfied are you with the economy?"

Q1 may influence Q2 answers

Solution: Randomize question order or carefully consider order effects
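
One way to randomize question order is to shuffle the questions separately for each respondent. The sketch below assumes a hypothetical helper `question_order` and a per-respondent seed so that each respondent's order can be reproduced later during analysis.

```python
import random

questions = [
    "How satisfied are you with the president?",
    "How satisfied are you with the economy?",
]

def question_order(respondent_id, questions, seed="survey-v1"):
    """Return a reproducible, respondent-specific ordering of the questions."""
    rng = random.Random(f"{seed}-{respondent_id}")
    order = list(questions)  # copy so the master list is never modified
    rng.shuffle(order)
    return order

for respondent_id in range(3):
    print(respondent_id, question_order(respondent_id, questions))
```

Because roughly half of respondents see each question first, any order effect is spread evenly across the sample instead of pushing all answers in one direction.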

3. Response Option Bias

Limited or unbalanced options can bias results

Example:

  • Only offering "Yes" or "No" when "Unsure" is valid
  • 4 positive options, 1 negative option

Solution: Offer balanced, complete response options including "no opinion" when appropriate

4. Social Desirability Bias

Respondents give socially acceptable answers rather than truthful ones

Examples:

  • Overreporting voting, recycling, charitable donations
  • Underreporting illegal behavior, prejudice, embarrassing habits

Solutions:

  • Anonymous surveys
  • Neutral wording
  • Indirect questioning
  • Validation against records when possible
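
Indirect questioning can take several forms. One well-known form, not named in these notes, is randomized response: each respondent privately either answers truthfully or reports a coin flip, so no individual answer reveals the truth, yet the overall rate can still be estimated. The sketch below is a simulation under assumed values (a 20% true prevalence and a 70% chance of answering truthfully).

```python
import random

def randomized_response_survey(true_answers, p_truth=0.7, seed=3):
    """Each respondent answers truthfully with probability p_truth;
    otherwise they report the result of a fair coin flip."""
    rng = random.Random(seed)
    reported = []
    for truth in true_answers:
        if rng.random() < p_truth:
            reported.append(truth)
        else:
            reported.append(rng.random() < 0.5)
    return reported

def estimate_prevalence(reported, p_truth=0.7):
    """Invert: observed_yes = p_truth * prevalence + (1 - p_truth) * 0.5."""
    observed_yes = sum(reported) / len(reported)
    return (observed_yes - (1 - p_truth) * 0.5) / p_truth

rng = random.Random(4)
true_answers = [rng.random() < 0.2 for _ in range(50_000)]  # assumed 20% prevalence
reported = randomized_response_survey(true_answers)
estimate = estimate_prevalence(reported)
print(round(estimate, 3))
```

The estimate recovers the underlying rate only on average and with extra noise, so this approach trades precision for respondent privacy.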

5. Interviewer Bias

Interviewer characteristics or behavior influence responses

Examples:

  • Gender, race, age of interviewer affects responses to sensitive topics
  • Interviewer tone, body language suggests preferred answer
  • Recording errors

Solutions:

  • Standardize interviewer training
  • Use self-administered surveys when possible
  • Monitor interviewer performance

6. Recall Bias

Inaccurate memory of past events

Examples:

  • "How many times did you exercise last month?" (people forget)
  • "What did you eat for lunch 3 days ago?"

Solution: Ask about recent, specific time periods; verify with records when possible

Other Survey Issues

1. Overcoverage

Sampling frame includes units not in target population

Example: List includes deceased people, duplicates, or out-of-scope units

Solution: Clean and update sampling frame regularly
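
Frame cleaning can be sketched as a simple filter. The record fields (`id`, `deceased`, `in_scope`) and the function `clean_frame` are hypothetical; a real frame would need its own rules for spotting duplicates and out-of-scope units.

```python
# Hypothetical sampling frame with a duplicate, a deceased person,
# and an out-of-scope unit
frame = [
    {"id": 1, "deceased": False, "in_scope": True},
    {"id": 2, "deceased": True,  "in_scope": True},
    {"id": 1, "deceased": False, "in_scope": True},   # duplicate of id 1
    {"id": 3, "deceased": False, "in_scope": False},
    {"id": 4, "deceased": False, "in_scope": True},
]

def clean_frame(frame):
    """Drop deceased, out-of-scope, and duplicate units (first copy is kept)."""
    seen = set()
    cleaned = []
    for unit in frame:
        if unit["deceased"] or not unit["in_scope"]:
            continue
        if unit["id"] in seen:
            continue
        seen.add(unit["id"])
        cleaned.append(unit)
    return cleaned

print([unit["id"] for unit in clean_frame(frame)])
```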

2. Measurement Error

Inaccurate measurements of response variable

Causes:

  • Poor question design
  • Respondent misunderstanding
  • Recording errors
  • Equipment problems

Solution: Pilot test survey, train data collectors, use validated measures

3. Processing Error

Errors in data entry, coding, or analysis

Solution: Double-check data entry, use data validation, verify calculations
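
Data validation and double entry can both be automated. The sketch below assumes hypothetical fields (`age`, `satisfaction`) and acceptable ranges; the checks would need to match the actual questionnaire.

```python
VALID_SATISFACTION = {1, 2, 3, 4, 5}

def validate_record(record):
    """Return a list of problems found in one survey record."""
    errors = []
    age = record.get("age")
    if not isinstance(age, int) or not 18 <= age <= 120:
        errors.append("age missing or out of range")
    if record.get("satisfaction") not in VALID_SATISFACTION:
        errors.append("satisfaction not on 1-5 scale")
    return errors

def double_entry_mismatches(first_pass, second_pass):
    """Indexes where two independent entries of the same forms disagree."""
    return [i for i, (a, b) in enumerate(zip(first_pass, second_pass)) if a != b]

# The same two paper forms, keyed in twice by different people
first = [{"age": 34, "satisfaction": 4}, {"age": 210, "satisfaction": 3}]
second = [{"age": 34, "satisfaction": 4}, {"age": 21, "satisfaction": 3}]

print(validate_record(first[1]))
print(double_entry_mismatches(first, second))
```

Range checks catch impossible values such as an age of 210, while double entry catches plausible-looking typos that a range check would miss.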

Reducing Bias: Best Practices

Sampling:
✓ Use probability sampling (random selection)
✓ Ensure complete, accurate sampling frame
✓ Maximize response rate
✓ Follow up with non-respondents
✓ Compare respondent characteristics to population

Survey Design:
✓ Use clear, neutral question wording
✓ Avoid leading or loaded questions
✓ Offer balanced, complete response options
✓ Consider question order effects
✓ Pilot test before full implementation

Data Collection:
✓ Train interviewers/data collectors
✓ Standardize procedures
✓ Consider anonymity for sensitive topics
✓ Verify data accuracy
✓ Document procedures

Impact of Bias

Key insight: Large sample doesn't fix bias!

  • Unbiased small sample > Biased large sample
  • Bias is systematic - doesn't average out
  • Can't reliably "correct" for bias after the fact (adjustments such as weighting help only if you know how the sample differs from the population)

Example: 1936 Literary Digest poll

  • Mailed 10 million ballots (huge sample!)
  • Predicted Landon would beat Roosevelt
  • Roosevelt won in landslide
  • Problem: Undercoverage and nonresponse bias (sampled from phone books and car registrations during the Depression, which overrepresented wealthier households; only 24% responded)
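
The arithmetic behind this example is worth spelling out. The sketch below uses only the figures given above (10 million ballots, 24% response) plus the standard worst-case margin-of-error formula for a proportion, which assumes simple random sampling. That assumption did not hold for this poll, which is exactly the point.

```python
import math

mailed = 10_000_000
response_rate = 0.24
respondents = round(mailed * response_rate)

# Worst-case 95% margin of error for a proportion (p = 0.5),
# valid only if the respondents were a simple random sample
margin = 1.96 * math.sqrt(0.25 / respondents)

print(respondents)
print(f"{margin * 100:.3f} percentage points")
```

With about 2.4 million respondents, random error alone would be well under a tenth of a percentage point. The poll still picked the wrong winner, so the error came from bias, not from sample size.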

Identifying Bias in Studies

When evaluating a study, ask:

  1. How were participants selected? (Random? Convenient?)
  2. What's the sampling frame? (Complete? Current?)
  3. What's the response rate? (High? Low?)
  4. How are questions worded? (Neutral? Leading?)
  5. Who conducted the survey? (Potential conflicts of interest?)
  6. How were data collected? (Method may introduce bias)

Quick Reference

Selection Bias: Non-random sampling
Undercoverage: Incomplete sampling frame
Voluntary Response: Self-selection
Nonresponse: Non-respondents differ from respondents

Question Wording: Leading/loaded questions
Social Desirability: Giving "acceptable" answers
Interviewer Bias: Interviewer influences responses
Recall Bias: Inaccurate memory

Key Principle: Use random selection, neutral questions, high response rate, careful measurement

Remember: No amount of sophisticated analysis can fix a biased sample. Preventing bias through good design is essential. When evaluating studies, always look for potential sources of bias before trusting the conclusions!
