Introduction to Statistics - Complete Interactive Lesson
Part 1 of 5: Data, Variables, and Good Questions
Topics in This Part
| Section |
|---|
| What Is Statistics? |
| Categorical vs. Numerical Data |
| Statistical Questions |
| The Data Cycle |
Key Concept: Statistics is the science of collecting, organizing, describing, and drawing conclusions from data. Before we ever compute an average, we have to know what kind of data we have and what question we are trying to answer.
What Is Statistics?
Every day you are surrounded by data, that is, facts and numbers collected about the world:
- The heights of everyone in your class
- Your favorite ice cream flavors
- The number of pets each student owns
- How many minutes you spent on homework last night
A single fact, like "Maria is 152 cm tall," is one data value. A whole collection of values, like the heights of all 25 students, is a data set.
💡 Statistics turns a pile of numbers into a story. Instead of staring at 25 separate heights, statistics lets us say things like "most students are about 150 cm tall" or "the tallest student is 20 cm taller than the shortest."
We do this in four stages: collect data, organize it, describe it, and draw conclusions from it.
Two Kinds of Data
Every data set is built from a variable, the thing we are measuring or recording. Variables come in two main types.
| Type | What it records | Examples |
|---|---|---|
| Categorical (qualitative) | a category or label | eye color, favorite sport, yes/no, type of pet |
| Numerical (quantitative) | a number you can count or measure | height, age, number of siblings, test score |
A quick test: "Can I find the average of it and have it make sense?"
- Average eye color? No, so eye color is categorical.
- Average height? Yes, so height is numerical.
⚠️ Watch out: Numbers are not always numerical data. A jersey number or a ZIP code is just a label. Averaging jersey numbers is meaningless, so those are categorical.
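The "can I average it?" test can be tried out in a few lines of Python. This is only a sketch: the class data below is made up for illustration, and `mean` and `mode` come from Python's standard `statistics` module.

```python
from statistics import mean, mode

# Hypothetical class data, invented for illustration.
eye_colors = ["brown", "blue", "brown", "green", "brown"]  # categorical
heights_cm = [150, 152, 148, 155, 145]                     # numerical
jersey_numbers = [7, 10, 23, 99]                           # numbers used as labels

# Numerical data: an average is meaningful.
average_height = mean(heights_cm)
print(average_height)  # 150

# Categorical data: count labels instead; the mode is the most common one.
most_common_eye_color = mode(eye_colors)
print(most_common_eye_color)  # brown

# Python will happily average jersey numbers, but the result describes nothing.
meaningless = mean(jersey_numbers)
print(meaningless)  # 34.75
```

Notice that the computer cannot tell a label from a measurement. Deciding whether an average makes sense is your job, not the calculator's.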
Statistical Questions
A statistical question is one you expect to answer with data that varies: the answers are not all the same.
| Question | Statistical? | Why |
|---|---|---|
| "How tall is my teacher?" | No | One person, one answer: no variability. |
| "How tall are the students in my class?" | Yes | Heights differ from student to student. |
| "What is today's date?" | No | Single fixed fact. |
| "How many hours do 8th graders sleep?" | Yes | Answers vary across many people. |
Key Idea: A statistical question anticipates variability, that is, different responses you will need to summarize. A question with exactly one answer is not statistical.
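One rough way to see variability is to check whether the collected answers are all the same. The sketch below uses a hypothetical helper, `has_variability`, and made-up responses. It only looks at data you already have, so treat it as an illustration of the idea, not a full test of whether a question is statistical.

```python
def has_variability(responses):
    """Return True when the collected responses are not all the same."""
    return len(set(responses)) > 1


# Hypothetical responses, invented for illustration.
teacher_height_cm = [170]                      # "How tall is my teacher?"
class_heights_cm = [150, 152, 148, 155, 150]   # "How tall are the students in my class?"

print(has_variability(teacher_height_cm))  # False
print(has_variability(class_heights_cm))   # True
```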
The Data Cycle
Statisticians follow a repeating data cycle:
1. Ask a statistical question.
2. Collect data that can answer it.
3. Organize & display the data (tables and plots; see Part 4).
4. Analyze the data using numbers like averages (Parts 2 and 3).
5. Interpret the results and answer the question, which often leads to a new question.
💡 In the rest of this lesson we focus on stages 3 and 4: how to summarize a data set with a single typical value (center), how to describe how spread out it is, and how to display it so the story is easy to see.
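To see the whole cycle in one place, here is a small sketch that walks through all five stages. The question and the survey answers are made up for illustration, and `Counter` and `mean` come from Python's standard library.

```python
from collections import Counter
from statistics import mean

# 1. Ask: "How many pets do students in my class own?"
# 2. Collect: hypothetical survey answers, invented for illustration.
pets = [0, 2, 1, 0, 3, 1, 1, 0, 2, 1]

# 3. Organize: sort the values and tally how often each one occurs.
organized = sorted(pets)
tally = Counter(organized)

# 4. Analyze: summarize the data with a single typical value.
average_pets = mean(pets)

# 5. Interpret: state the answer in words.
print(f"Students own {average_pets} pets on average; tally: {dict(tally)}")
```

The answer may raise a new question, such as which kind of pet is most common, and the cycle starts again.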
Part 2 of 5: Measures of Center: Mean, Median, Mode
The Big Idea: A measure of center is a single number that represents a "typical" value for a whole data set. The three most common are the mean, the median, and the mode.
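As a quick preview, Python's standard `statistics` module can compute all three measures. The quiz scores below are made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores, invented for illustration.
scores = [6, 7, 8, 8, 9, 10, 15]

print(mean(scores))    # 9  (sum 63 divided by 7 values)
print(median(scores))  # 8  (middle value when sorted)
print(mode(scores))    # 8  (value that occurs most often)
```

The three measures do not have to agree. Here the single high score of 15 pulls the mean above the median and the mode.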
The Mean (Average)
The mean is what most people call the "average." To find it, add up all the values, then divide by the number of values.
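The calculation is a sum followed by a division, which can be sketched as below. The function name `mean_by_hand` and the homework minutes are made up for illustration.

```python
def mean_by_hand(values):
    """Add up all the values, then divide by how many there are."""
    total = sum(values)
    count = len(values)
    return total / count


# Hypothetical homework minutes for five students, invented for illustration.
minutes = [30, 45, 20, 60, 25]

print(mean_by_hand(minutes))  # 36.0
```

One way to read the result: if the 180 total minutes were shared out equally among the five students, each would get 36.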
Part 3 of 5: Spread, Range, and Outliers
The Big Idea: A center tells you the "typical" value, but two data sets with the same mean can look completely different. Spread describes how far apart the values are, and outliers are values that sit far from the rest.
The Range
The simplest measure of spread is the range: the largest value minus the smallest value.
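Here is a short sketch of the range as "largest minus smallest," applied to two data sets that share the same mean. The helper name `value_range` and both data sets are made up for illustration.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# Two hypothetical data sets with the same mean, invented for illustration.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

print(sum(tight) / len(tight), value_range(tight))  # 10.0 2
print(sum(wide) / len(wide), value_range(wide))     # 10.0 16
```

Both sets have a mean of 10, but the second is far more spread out. The center alone would hide that difference.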
Worked Example
Daily high temperatures (°C): .
Part 4 of 5: Displaying Data
The Big Idea: A good display lets you see a data set's story at a glance: what's typical, what's rare, and how spread out things are. We'll read and build three common displays: frequency tables, dot plots, and bar graphs.
Frequency Tables
A frequency table records how many times each value occurs. The frequency is the count.
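Counting how often each value occurs is a job Python's `Counter` handles well. The sketch below builds a small frequency table from made-up sibling counts. These are hypothetical values, not this lesson's own data.

```python
from collections import Counter

# Hypothetical sibling counts, invented for illustration
# (not the lesson's own data).
siblings = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0, 2, 1]

# Tally how many times each value occurs.
frequency = Counter(siblings)

# Print the tally as a table, smallest value first.
print("| Siblings | Frequency |")
print("|---|---|")
for value in sorted(frequency):
    print(f"| {value} | {frequency[value]} |")
```

A handy check: the frequencies should add up to the number of data values, which is 12 in this sketch.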
Suppose 12 students reported how many siblings they have:
Part 5 of 5: Mixed Practice & Mastery Check
You can now (1) tell categorical from numerical data, (2) find the mean, median, and mode, (3) measure spread with the range and judge outliers, and (4) read and build data displays. Let's put it all together.
Quick Reference
| Idea | What it tells you / how to find it |
|---|---|
| Categorical data | a label (color, sport); use a bar graph, find the mode |
| Numerical data | a count/measure; use a dot plot, find mean/median |
| Mean | sum of the values ÷ number of values; the "fair share"; sensitive to outliers |