Metropolitan State University
ECON 497: Research and Forecasting
Lecture 2: Ordinary Least Squares
Ordinary least squares (OLS) is a mathematical technique used to estimate the relationship between variables. The simplest version of this relationship is:

Yi = β0 + β1Xi + εi

where Yi is the value of the dependent variable, Xi is the value of the explanatory variable, and εi is the value of the stochastic error term for the ith observation.
The result of this estimation procedure is estimates of the coefficients β0 and β1, which are called β̂0 and β̂1 (beta-hat). These coefficients are used to generate estimates of, or predicted values for, the dependent variable, which are called Ŷi, and we can say that

Ŷi = β̂0 + β̂1Xi

The difference between the actual value of Yi and its estimated value Ŷi is equal to ei, the residual. This can be written as:

Yi = β̂0 + β̂1Xi + ei

Yi − Ŷi = ei
So, what do these things look like on a graph?
What do the ei look like?
What do the Ŷi look like?
The line relating X and Y that is calculated by OLS is good because it minimizes the sum of the squared residuals. That is, it minimizes:

∑ ei²

Equivalently, it minimizes:

∑ (Yi − Ŷi)²
This has three good characteristics:
1. The regression line passes through the point (X̄, Ȳ), the mean of the data
2. The sum of the errors or residuals is zero
3. OLS gives the "best" estimation, depending on some conditions and definitions
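The first two properties can be checked numerically. The sketch below fits a simple OLS line using the closed-form formulas for the slope and intercept on made-up illustrative data (the values are not from the lecture) and verifies that the residuals sum to zero and that the line passes through (X̄, Ȳ):

```python
import numpy as np

# Made-up illustrative data (not from the lecture).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form OLS estimates of the slope and intercept.
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X   # predicted values
e = Y - Y_hat         # residuals

print(np.isclose(e.sum(), 0.0))                  # residuals sum to zero
print(np.isclose(b0 + b1 * X.mean(), Y.mean()))  # line passes through the means
```

Both checks print `True`: they are algebraic consequences of the OLS formulas, not accidents of this particular data.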
Definitions

Standard Error of the Estimate (SEE)

SEE = [ ∑ ei² / (n − 2) ]^(1/2)

Total Sum of Squares (TSS)

TSS = ∑ (Yi − Ȳ)²
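Both quantities can be computed directly once a line has been fitted. A minimal sketch, using made-up illustrative data (not from the lecture):

```python
import numpy as np

# Made-up illustrative data (not from the lecture).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(Y)

# Closed-form OLS estimates, then residuals.
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)

SEE = (np.sum(e ** 2) / (n - 2)) ** 0.5   # standard error of the estimate
TSS = np.sum((Y - Y.mean()) ** 2)         # total sum of squares
print(SEE, TSS)
```

Note the n − 2 in the denominator of SEE: two degrees of freedom are used up estimating β̂0 and β̂1.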
Explained Sum of Squares (ESS)

ESS = ∑ (Ŷi − Ȳ)²
Residual Sum of Squares (RSS)

RSS = ∑ ei² = ∑ (Yi − Ŷi)²
To put this all together: TSS = ESS + RSS
R²

How much of the variation in the dependent variable is explained by the model? This is given by R², the ratio of the explained sum of squares to the total sum of squares:

R² = ESS / TSS = 1 − RSS / TSS

A higher R² means that the estimated model explains a larger share of the variation in the dependent variable. If R² is zero, the model offers no information about the dependent variable, and the best prediction you can make about its value is its mean. The "explanatory" variables offer no explanatory power whatsoever.
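The decomposition TSS = ESS + RSS, and the two equivalent expressions for R², can be verified numerically. A sketch on made-up illustrative data (not from the lecture):

```python
import numpy as np

# Made-up illustrative data (not from the lecture).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form OLS fit, predicted values, and residuals.
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
e = Y - Y_hat

TSS = np.sum((Y - Y.mean()) ** 2)
ESS = np.sum((Y_hat - Y.mean()) ** 2)
RSS = np.sum(e ** 2)

print(np.isclose(TSS, ESS + RSS))   # the decomposition holds
print(ESS / TSS, 1 - RSS / TSS)     # two equivalent expressions for R-squared
```

The first line prints `True`, and the two R² expressions agree; the decomposition holds whenever the regression includes an intercept.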
What does this look like?
Multivariate Regression
There aren't a lot of interesting questions to be answered using one explanatory variable. It's more fun to look at a number of explanatory variables through multivariate regression, or regression on multiple variables. For a model with K explanatory variables this would be:

Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi

Each of the coefficients, βj, is the partial derivative of the dependent variable Yi with respect to the explanatory variable Xji. That is, the coefficient is the expected change in Y resulting from a one-unit change in Xj, holding the other variables constant.
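One way to see the "holding other variables constant" interpretation is to simulate data from a model with known coefficients and recover them. The sketch below uses assumed true values (β0 = 2, β1 = 3, β2 = −1.5, chosen only for illustration) and estimates the model by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data from an assumed true model (coefficients chosen for illustration):
# Y = 2 + 3*X1 - 1.5*X2 + noise
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2.0 + 3.0 * X1 - 1.5 * X2 + rng.normal(scale=0.1, size=n)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# beta_hat[1] estimates the change in Y from a one-unit change in X1,
# holding X2 constant; beta_hat[2] plays the same role for X2.
print(beta_hat)
```

The estimates come out close to (2, 3, −1.5): each coefficient isolates the partial effect of its own variable even though X1 and X2 both vary across observations.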
A Beef Example (Studenmund, p. 44)
Consider the following estimated model:
B̂t = 37.53 − 0.88Pt + 11.9Ydt

where

Bt = the per capita consumption of beef in year t (in pounds per person)
Pt = the price of beef in year t (in cents per pound)
Ydt = the per capita disposable income in year t (in thousands of dollars)
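Plugging hypothetical values into the estimated equation shows how a prediction is formed. Here a price of 60 cents per pound and disposable income of $6,000 per person (so Yd = 6, since income is measured in thousands) are assumed purely for illustration:

```python
# Hypothetical inputs for illustration (not from the lecture):
# price of 60 cents per pound, disposable income of $6,000 (Yd in thousands).
P, Yd = 60.0, 6.0

# Predicted per capita beef consumption from the estimated equation.
B_hat = 37.53 - 0.88 * P + 11.9 * Yd
print(B_hat)  # 37.53 - 52.8 + 71.4 = 56.13 pounds per person
```

Note how the units matter: the price coefficient applies per cent, and the income coefficient applies per thousand dollars.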
Questions
1. What is the interpretation of the coefficient on Pt?
2. What is the interpretation of the coefficient on Ydt?
3. Is beef a normal good?
4. Do these estimated coefficients conform to the law of demand?
5.