Nguyễn Trọng Hoài Analytical Methods 9
Specification Error
When constructing any regression model, we are most interested in
explaining which variables cause the dependent variable to change, and by how
much. The choice of variables depends on a combination of economic theory, basic
human behavior, and past experience.
One of the assumptions of OLS is that the model is correctly specified.
Specification error can arise in two ways:
a) Omitting relevant explanatory variables, or including irrelevant ones.
b) Using an incorrect functional form.
This lecture discusses which regressors should be included in, or excluded
from, a particular model. In other words, we will consider the following cases:
a) A regression model that excludes some important explanatory variables.
b) A regression model that includes some irrelevant regressors.
1) Exclusion of relevant variables
Suppose that we are interested in the following model:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \cdots + \beta_K X_{Ki} + \beta_{(K+1)} X_{(K+1)i} + \cdots + \beta_{(K+L)} X_{(K+L)i} + \varepsilon_i$$
The question is whether the set of $L$ regressors $X_{(K+1)}, \dots, X_{(K+L)}$ belongs in the model.
This means we have excluded an important regressor $X_{3i}$.
The LS estimator $\hat\beta_2$ is:
$$\hat\beta_2 = \frac{\sum x_{2i} Y_i}{\sum x_{2i}^2} \tag{9.3}$$
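As a minimal sketch of how (9.3) is computed, the Python snippet below evaluates the deviation-form slope on a small made-up sample. The function name `ls_slope` and the data are illustrative assumptions, not part of the lecture; lowercase $x_{2i}$ is taken to mean the deviation of $X_{2i}$ from its sample mean.

```python
def ls_slope(x_values, y_values):
    """Slope of a simple regression of y on x, in deviation form as in (9.3)."""
    n = len(x_values)
    x_bar = sum(x_values) / n
    # Lowercase x is assumed to be the deviation of X from its sample mean.
    dev = [x - x_bar for x in x_values]
    numerator = sum(d * y for d, y in zip(dev, y_values))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


# Hypothetical data generated exactly by Y = 1 + 2 * X2 (no error term).
X2 = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.0 + 2.0 * v for v in X2]

slope = ls_slope(X2, Y)
print(slope)  # recovers the slope used to build the data
```

Because the data contain no error term and no omitted regressor, the estimator returns the slope used to generate them.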
Recall the lecture of Prof. Motahar on calculating the coefficient for regressor $X_2$.
Important consequences of excluding important explanatory variables
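a) Substituting the true model for $Y_i$ in the estimator:
$$E[\hat\beta_2] = E\left[\frac{\sum x_{2i}\,(\beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i)}{\sum x_{2i}^2}\right] \tag{9.4}$$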
$$E[\hat\beta_2] = \beta_2 + \beta_3 \frac{\sum x_{2i} X_{3i}}{\sum x_{2i}^2} \tag{9.5}$$
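The bias in (9.5) can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the parameter values, the fixed regressors `X2` and `X3`, standard normal errors, and the number of replications. The regressors are held fixed across replications, so the average of the short-regression slopes should settle near $\beta_2 + \beta_3 \sum x_{2i}X_{3i} / \sum x_{2i}^2$ rather than near $\beta_2$.

```python
import random


def ls_slope(x_values, y_values):
    """Deviation-form slope of a simple regression of y on x."""
    n = len(x_values)
    x_bar = sum(x_values) / n
    dev = [x - x_bar for x in x_values]
    return sum(d * y for d, y in zip(dev, y_values)) / sum(d * d for d in dev)


random.seed(42)

# Assumed true parameters of Y = beta1 + beta2*X2 + beta3*X3 + eps.
beta1, beta2, beta3 = 1.0, 2.0, 3.0

# Fixed, correlated regressors (hypothetical values).
X2 = [float(k) for k in range(1, 21)]
X3 = [0.5 * v + (k % 3) for k, v in enumerate(X2)]

# The ratio sum(x2 * X3) / sum(x2^2) that multiplies beta3 in (9.5).
ratio = ls_slope(X2, X3)
expected = beta2 + beta3 * ratio

replications = 2000
total = 0.0
for _ in range(replications):
    Y = [beta1 + beta2 * a + beta3 * b + random.gauss(0.0, 1.0)
         for a, b in zip(X2, X3)]
    total += ls_slope(X2, Y)  # regression of Y on X2 only, X3 omitted

mean_slope = total / replications
print("average short-regression slope:", mean_slope)
print("value predicted by (9.5):      ", expected)
print("true beta2:                    ", beta2)
```

In this setup the omitted regressor is positively related to $X_2$ and $\beta_3 > 0$, so the short regression overstates $\beta_2$.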
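Recall that the slope estimator in a simple regression of $Y$ on a single regressor $X$ is:
$$\hat\beta_2 = \frac{\sum x_i Y_i}{\sum x_i^2} \tag{9.7}$$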
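So, if the simple regression of $X_3$ on $X_2$ is
$$X_{3i} = \beta_1 + \beta_{22} X_{2i} + \varepsilon_i$$
the coefficient of $X_2$ can also be estimated by the same expression, in which the estimator is:
$$\hat\beta_{22} = \frac{\sum_{i=1}^{n} x_{2i} X_{3i}}{\sum_{i=1}^{n} x_{2i}^2}$$
Substituting the true model for $Y_i$ in the estimator of $\beta_2$ and expanding term by term:
$$\hat\beta_2 = \frac{\sum_{i=1}^{n} x_{2i}\,(\beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i)}{\sum_{i=1}^{n} x_{2i}^2} = \beta_1 \frac{\sum_{i=1}^{n} x_{2i}}{\sum_{i=1}^{n} x_{2i}^2} + \beta_2 \frac{\sum_{i=1}^{n} x_{2i} X_{2i}}{\sum_{i=1}^{n} x_{2i}^2} + \beta_3 \frac{\sum_{i=1}^{n} x_{2i} X_{3i}}{\sum_{i=1}^{n} x_{2i}^2} + \frac{\sum_{i=1}^{n} x_{2i} \varepsilon_i}{\sum_{i=1}^{n} x_{2i}^2}$$
The $\beta_1$ term vanishes because deviations from the mean sum to zero ($\sum x_{2i} = 0$). The ratio multiplying $\beta_3$ is $\hat\beta_{22}$, as compared with the ratio multiplying $\beta_2$:
$$\frac{\sum_{i=1}^{n} x_i X_i}{\sum_{i=1}^{n} x_i^2} = 1$$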
Thus,
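$$\hat\beta_2 = \beta_2 \frac{\sum_{i=1}^{n} x_{2i} X_{2i}}{\sum_{i=1}^{n} x_{2i}^2} + \beta_3 \frac{\sum_{i=1}^{n} x_{2i} X_{3i}}{\sum_{i=1}^{n} x_{2i}^2} + \frac{\sum_{i=1}^{n} x_{2i} \varepsilon_i}{\sum_{i=1}^{n} x_{2i}^2} = \beta_2 + \beta_3 \frac{\sum_{i=1}^{n} x_{2i} X_{3i}}{\sum_{i=1}^{n} x_{2i}^2} + \frac{\sum_{i=1}^{n} x_{2i} \varepsilon_i}{\sum_{i=1}^{n} x_{2i}^2} \tag{9.10}$$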
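The decomposition in (9.10) is an algebraic identity in any single sample, which can be checked numerically when the true parameters and errors are known. The sketch below assumes hypothetical parameter values and simulated data; the helper names `deviations` and `dot` are introduced only for this illustration.

```python
import random


def deviations(values):
    """Deviations of each value from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


random.seed(0)

# Assumed true parameters and a simulated sample (illustrative only).
beta1, beta2, beta3 = 1.0, 2.0, 3.0
X2 = [float(k) for k in range(1, 11)]
X3 = [0.5 * v + random.gauss(0.0, 1.0) for v in X2]
eps = [random.gauss(0.0, 1.0) for _ in X2]
Y = [beta1 + beta2 * a + beta3 * b + e for a, b, e in zip(X2, X3, eps)]

x2 = deviations(X2)
s22 = dot(x2, x2)

# Left-hand side: the short-regression estimator, sum(x2*Y) / sum(x2^2).
lhs = dot(x2, Y) / s22

# Right-hand side: the three ratios of (9.10).
rhs = (beta2 * dot(x2, X2) / s22
       + beta3 * dot(x2, X3) / s22
       + dot(x2, eps) / s22)

print(lhs, rhs)              # equal up to floating-point rounding
print(dot(x2, X2) / s22)     # the ratio multiplying beta2 equals 1
```

The intercept does not appear on the right-hand side because the deviations `x2` sum to zero.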
Important meanings:
On average, the gross effect of $X_2$ on $Y$ in the model that omits $X_3$, $\hat\beta_2$, equals the direct effect of $X_2$ on $Y$ (that is, $\beta_2$ in the true model) plus the indirect effect of $X_2$ on $Y$ through $X_3$ (that is, $\beta_3 \cdot \hat\beta_{22}$).
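The gross = direct + indirect reading also has an exact in-sample counterpart when the direct effects are replaced by the fitted coefficients of the full regression. The sketch below uses a small hypothetical data set and the standard two-regressor OLS formulas in deviation form; all variable names are assumptions made for illustration.

```python
def deviations(values):
    """Deviations of each value from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical sample.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [3.1, 3.9, 8.2, 8.8, 13.1, 13.9]

x2, x3, y = deviations(X2), deviations(X3), deviations(Y)
s22, s33, s23 = dot(x2, x2), dot(x3, x3), dot(x2, x3)
s2y, s3y = dot(x2, y), dot(x3, y)

# Full regression of Y on X2 and X3 (two-regressor OLS, deviation form).
det = s22 * s33 - s23 ** 2
b2_full = (s2y * s33 - s3y * s23) / det
b3_full = (s3y * s22 - s2y * s23) / det

# Regression of Y on X2 only (X3 omitted): the gross effect.
b2_short = s2y / s22

# Auxiliary regression of X3 on X2.
b22 = s23 / s22

direct_plus_indirect = b2_full + b3_full * b22
print(b2_short, direct_plus_indirect)  # equal up to rounding
```

The two printed numbers agree for any sample in which `det` is nonzero, that is, whenever $X_2$ and $X_3$ are not perfectly collinear.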
Thus, the estimated coefficient in the regression without $X_3$ (and assuming that this
variable is relevant), so then
2