Relationships
Lesson Plan
Displaying Relationships: Scatterplots
Regression Lines
Correlation
Least-Squares Regression
Interpreting Correlation and Regression
For All Practical Purposes: Mathematical Literacy in Today’s World, 7th ed.
© 2006, W.H. Freeman and
Chapter 6: Exploring Data: Distributions
Displaying Relationships
Relationship Between Two Variables
Examine data for two variables to see if there is a relationship between the variables. Does one influence the other?
Study both variables on the same individual.
If a relationship exists between variables, typically one variable influences or causes a change in another variable.
Explanatory variable explains, or causes, the change in another variable.
Response variable measures the outcome, or response to the change.
Response variable – A variable that measures an outcome or result of a study (observed outcome).
Explanatory variable – A variable that explains or causes change in the response variable.
Displaying Relationships: Scatterplots
Data to Be Used for a Scatterplot
A scatterplot is a graph that shows the relationship between two numerical variables, measured on the same individual.
Explanatory variable, x, is plotted on the horizontal axis (x).
Response variable, y, is plotted on the vertical axis (y).
Each pair of related variables (x, y) is plotted on the graph.
Example: A study was done to see how well the number of beers a young adult drinks predicts his or her blood alcohol content (BAC). Results for 16 people:
Explanatory variable, x = beers drunk
Response variable, y = BAC level
Young Adult   1     2     3     4     5      6      7     8
Beers         5     2     9     8     3      7      3     5
BAC           0.10  0.03  0.19  0.12  0.04   0.095  0.07  0.06

Young Adult   9     10    11    12    13     14     15    16
Beers         3     5     4     6     5      7      1     4
BAC           0.02  0.05  0.07  0.10  0.085  0.09   0.01  0.05
Displaying Relationships: Scatterplots
Scatterplot
BAC vs. number of beers consumed
Example continued: The scatterplot shows blood alcohol content, BAC (y, response variable), against the number of beers a young adult drinks (x, explanatory variable).
The data from the previous table are plotted as points on the graph (x, y).
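The plotting step can be sketched in a few lines of Python. This is only a rough text rendering, not the slide's actual figure: the function name text_scatter and the choice of 10 rows are illustrative assumptions, and the BAC for young adult 13 is entered as 0.085, the value in line with the other readings.

```python
# (beers, BAC) pairs for the 16 young adults in the example.
# Assumption: young adult 13's BAC is entered as 0.085.
beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]


def text_scatter(xs, ys, rows=10):
    """Return a rough character scatterplot as a list of strings.

    Each column is one whole-number x value (beers); each row is an
    equal-width band of y (BAC), with the highest band printed first.
    """
    x_max = max(xs)
    y_max = max(ys)
    grid = [[" "] * (x_max + 1) for _ in range(rows)]
    for x, y in zip(xs, ys):
        # Map y to a band index; the largest y is clamped into the top band.
        band = min(int(y / y_max * rows), rows - 1)
        grid[rows - 1 - band][x] = "*"
    return ["".join(row) for row in grid]


plot = text_scatter(beers, bac)
for line in plot:
    print("|" + line)                      # vertical axis: BAC
print("+" + "-" * (max(beers) + 1))        # horizontal axis: beers
```

The points drift from the lower left to the upper right, which is what a positive association looks like on a scatterplot.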
Examining This Scatterplot…
1. What is the overall pattern (form, direction, and strength)?
Form – Roughly a straight-line pattern.
Direction – Positive association (both increase).
Strength – Moderately strong (points lie mostly close to the line).
2. Any striking deviations (outliers)? Not here.
Outlier – A data point that falls outside the overall pattern.
Regression Lines
Regression Line
A straight line that describes how a response variable y changes as an explanatory variable x changes.
Regression lines are often used to predict the value of y for a given value of x.
BAC vs. number of beers consumed
A regression line has been added to be able to predict blood alcohol content from the number of beers a young adult drinks.
Graphically, you can predict that if x = 6 beers, then y = 0.095 BAC.
(Legal limit for driving in most states is BAC = 0.08.)
Regression Lines
Using the Equation of the Line for Predictions
It is easier to use the equation of the line for predicting the value of y, given the value of x.
Using the equation of the line for the previous example:
predicted BAC = −0.0127 + (0.01796)(beers)
y = −0.0127 + 0.01796x
For a young adult drinking 6 beers (x = 6):
predicted BAC = −0.0127 + 0.01796(6) = 0.095
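The same prediction can be checked with a short Python sketch. The coefficients are copied from the equation above; the names INTERCEPT, SLOPE, and predicted_bac are illustrative and do not come from the slides.

```python
# Coefficients from the slide: predicted BAC = -0.0127 + 0.01796 * beers
INTERCEPT = -0.0127   # predicted BAC at 0 beers
SLOPE = 0.01796       # change in predicted BAC per additional beer


def predicted_bac(beers):
    """Predicted BAC from the regression line for a given number of beers."""
    return INTERCEPT + SLOPE * beers


print(round(predicted_bac(6), 3))  # 0.095, matching the worked example
```

A prediction like this is most trustworthy for x values inside the range of the data (1 to 9 beers in this example).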
Straight Lines
A straight line for predicting y from x has an equation of the form: predicted y = a + b x
In this equation, b is the slope, the amount by which y changes when x increases by 1 unit.
The number a is the intercept, the value of y when x = 0.
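To see where a and b can come from, here is a minimal sketch that fits a line to the 16 (beers, BAC) pairs. It assumes the line in the example is the least-squares line (a topic listed later in the lesson plan) and uses the standard least-squares formulas. The function name fit_line is illustrative, and young adult 13's BAC is entered as 0.085; with that value the fit reproduces the coefficients of the example's equation.

```python
# (beers, BAC) pairs for the 16 young adults in the example.
# Assumption: young adult 13's BAC is entered as 0.085.
beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]


def fit_line(xs, ys):
    """Least-squares intercept a and slope b for predicted y = a + b x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of products of deviations over sum of squared x deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the line passes through the point of means (x_bar, y_bar).
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(beers, bac)
print(f"a = {a:.4f}, b = {b:.5f}")  # a = -0.0127, b = 0.01796
```

Read this way, b says that each additional beer raises the predicted BAC by about 0.018, and a is the predicted BAC at zero beers. The intercept is slightly negative, which is not a physically possible BAC, because x = 0 lies outside the observed data.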
Correlation