Applied Econometrics Multicollinearity
Applied Econometrics
Lecture 7: Multicollinearity
Doubt whom you will, but never yourself
1) Introduction
Multiple regression can be written as follows:
$$Y_i = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$
Collinearity refers to linear relationships between two X variables. Multicollinearity encompasses
linear relationships between more than two X variables. Multiple regression is impossible in the
presence of perfect collinearity or multicollinearity. If X
to underestimates $\beta_j$, and vice versa.
Because of the large standard errors, the confidence intervals for the relevant population parameters
tend to be wider.
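The effect on standard errors can be illustrated numerically. The sketch below is an illustration rather than part of the lecture: it assumes a model with two regressors that have zero mean and unit sample variance, and it uses the textbook OLS result that the covariance matrix of the estimates is $\sigma^2 (X'X)^{-1}$. The function name `slope_std_errors` and the chosen values of `n` and `sigma2` are hypothetical.

```python
import numpy as np

def slope_std_errors(r, n=100, sigma2=1.0):
    """Standard errors of the two slope estimates when the regressors
    have sample correlation r, zero mean and unit sample variance.

    With such regressors X'X = n * [[1, r], [r, 1]], so the covariance
    matrix sigma2 * (X'X)^{-1} can be computed without simulating data.
    """
    xtx = n * np.array([[1.0, r], [r, 1.0]])
    cov = sigma2 * np.linalg.inv(xtx)
    return np.sqrt(np.diag(cov))

for r in (0.0, 0.5, 0.9, 0.99):
    se = slope_std_errors(r)[0]
    # Half-width of an approximate 95% interval, using the normal value 1.96
    print(f"r = {r:4.2f}  se = {se:.4f}  half-width = {1.96 * se:.4f}")
```

Under these assumptions the standard error equals $\sqrt{\sigma^2 / (n(1 - r^2))}$, so the interval widens without bound as $r$ approaches one.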
Written by Nguyen Hoang Bao May 24, 2004
Sensitive coefficients
One of the consequences of high correlation between explanatory variables is that the parameter
estimates are very sensitive to the addition or deletion of observations.
A high $R^2$ but few significant t-ratios
Few coefficients are statistically significantly different from zero, even though the
coefficient of determination is high.
3) Detection of multicollinearity
3.1) There is a high $R^2$ but few significant t-ratios. The F-test rejects the hypothesis that the partial
slope coefficients are simultaneously equal to zero, but the individual t-tests show that none or
very few of the partial slope coefficients are statistically different from zero.
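The symptom in 3.1 can be reproduced with simulated data. The following is a minimal sketch, not taken from the lecture: the data-generating process, the seed, and all variable names are hypothetical, and the second regressor is built as a near copy of the first. In a run like this the overall F statistic is large while the slope standard errors are large relative to the coefficients, although the exact t-ratios depend on the random draw.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
n = 100

x1 = rng.normal(0.0, 3.0, n)
x2 = x1 + rng.normal(0.0, 0.01, n)        # x2 is almost a copy of x1
y = 1.0 + x1 + x2 + rng.normal(0.0, 1.0, n)

X = np.column_stack([np.ones(n), x1, x2]) # constant plus two regressors
k = X.shape[1] - 1                        # number of slope coefficients

b, *_ = np.linalg.lstsq(X, y, rcond=None) # OLS estimates
resid = y - X @ b
rss = resid @ resid
tss = np.sum((y - y.mean()) ** 2)

r2 = 1.0 - rss / tss
# Overall F test that all slopes are zero, on (k, n - k - 1) degrees of freedom
f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))

s2 = rss / (n - k - 1)                    # estimate of the error variance
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
t_stats = b / se

print("R2 =", round(r2, 3), " F =", round(f_stat, 1))
print("slope standard errors:", se[1:])
print("slope t-ratios:", t_stats[1:])
```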
3.2) Multicollinearity can be considered a serious problem only if $R^2_y$
3, and $X_4$, if one finds that $R^2_{1.234}$ is very high but $r^2_{12.34}$, $r^2_{13.24}$, and $r^2_{14.23}$ are comparatively low, it may suggest that the variables $X_2$, $X_3$, and $X_4$ are highly
intercorrelated and that at least one of these variables is superfluous.
3.4) We may use the overall F test to check whether any one explanatory variable is linearly
related to the remaining explanatory variables.
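One way to carry out the check in 3.4 is to regress each explanatory variable on the others and compute the overall F statistic of that regression. The sketch below assumes that each such regression includes a constant and that, with $k$ explanatory variables and $n$ observations, the statistic has $(k - 1, n - k)$ degrees of freedom. The function name `auxiliary_f` and the small data set are hypothetical.

```python
import numpy as np

def auxiliary_f(X, j):
    """Overall F statistic from regressing column j of X on the other columns.

    X is an (n, k) array of explanatory variables without a constant column;
    a constant is added to the regression here.
    Returns (R2_j, F_j), with F_j on (k - 1, n - k) degrees of freedom.
    """
    n, k = X.shape
    target = X[:, j]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    coef, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
    f_stat = (r2 / (k - 1)) / ((1.0 - r2) / (n - k))
    return r2, f_stat

# Hypothetical data: the second column is nearly equal to the first
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])
print(auxiliary_f(np.column_stack([x1, x2]), 0))
```

A large F value, compared with the critical value for the stated degrees of freedom, indicates that the chosen variable is collinear with the others.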
3.5) In the regression of Y on $X_1$ and $X_2$
$$S_{11} = \sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2, \qquad S_{22} = \sum_{i=1}^{n} (X_{2i} - \bar{X}_2)^2$$
$$m = R^2 - \sum_{i=1}^{k} (R^2 - R^2_{-i})$$
where
$R^2$ is the squared multiple correlation coefficient between Y and the explanatory variables ($X_1$, $X_2$, …, $X_i$, …, $X_k$)
$R^2_{-i}$ is the squared multiple correlation coefficient between Y and the explanatory variables ($X_1$, $X_2$, …, $X_{i-1}$, $X_{i+1}$, …, $X_k$)
$R^2_i$ is the squared multiple correlation coefficient between $X_i$ and the other explanatory
variables. We may calculate it for each explanatory variable separately. The $VIF_i$s measure
the degree of multicollinearity among the regressors with reference to the ideal situation where all
explanatory variables are uncorrelated ($R^2_i = 0$ implies $VIF_i = 1$). The $VIF_i$s will be useful for
dropping some variables and imposing parameter constraints only in some very extreme cases
where $R^2_i$ is approximately equal to one.
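The variance inflation factors can be computed directly from regressions of each explanatory variable on the others. The sketch below assumes the usual definition $VIF_i = 1 / (1 - R^2_i)$, which agrees with the statement that $R^2_i = 0$ implies $VIF_i = 1$. The function name `vif` and the example data are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X.

    X is an (n, k) array of explanatory variables without a constant column.
    Each column is regressed on a constant and the remaining columns, and
    VIF_i = 1 / (1 - R2_i) is computed from that regression.
    """
    n, k = X.shape
    out = np.empty(k)
    for i in range(k):
        target = X[:, i]
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2_i = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2_i)
    return out

# Hypothetical data: two nearly identical regressors
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])
print(vif(np.column_stack([x1, x2])))
```

With only two regressors both factors are equal, because each auxiliary $R^2_i$ is the squared simple correlation between the two variables.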
4) Remedial measures
4.1) Getting more data: Increasing the size of the sample may reduce the multicollinearity problem.
The variance of the coefficient is defined as follows:
$$\mathrm{Var}(\hat{\beta}_i) = \frac{\sigma^2}{S_{ii}\,(1 - R^2_i)}$$
where
$\sigma^2$ is the variance of the error term,
$S_{ii} = \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2$ is the sum of squared deviations of $X_i$ from its sample mean,
$R^2_i$ is the squared multiple correlation coefficient between $X_i$ and the other explanatory
variables.
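A small numerical illustration may help here. It assumes the standard OLS result $\mathrm{Var}(\hat{\beta}_i) = \sigma^2 / (S_{ii}(1 - R^2_i))$ and, as a simplification, that $R^2_i$ stays the same while observations are added. The function name `coef_variance` and all numbers are hypothetical.

```python
def coef_variance(sigma2, s_ii, r2_i):
    """Variance of the i-th OLS coefficient: sigma2 / (S_ii * (1 - R2_i))."""
    return sigma2 / (s_ii * (1.0 - r2_i))

sigma2 = 4.0   # hypothetical error variance
r2_i = 0.75    # hypothetical degree of collinearity, held fixed

# S_ii is a sum over observations, so it grows as the sample grows
for s_ii in (50.0, 100.0, 200.0):
    print(s_ii, coef_variance(sigma2, s_ii, r2_i))
```

Doubling $S_{ii}$ halves the variance under these assumptions, which is why a larger sample can offset part of the inflation caused by a high $R^2_i$.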
As sample increases, S
where Q, P and I represent the quantity of the product, price and income, respectively.
The time-series data on income and price were highly collinear.
First, we estimate the income elasticity $\hat{\beta}_2$ from data taken at a point in time, because in
such data prices do not vary much; $\hat{\beta}_2$ is known as the extraneous estimate.
Second, we regress $(\ln Q - \hat{\beta}_2 \ln I)$ on $\ln P$ to get the estimates $\hat{\alpha}$ and $\hat{\beta}_1$.
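The two steps can be sketched in code. This example assumes a log-linear demand model, $\ln Q = \alpha + \beta_1 \ln P + \beta_2 \ln I$, which is what the two steps imply. The data are hypothetical and noiseless, and the extraneous estimate `beta2_hat` is simply set to a given value instead of being estimated from a separate data set.

```python
import numpy as np

# Hypothetical noiseless time series: ln Q = alpha + beta1 * ln P + beta2 * ln I
alpha, beta1, beta2 = 2.0, -0.5, 0.8
ln_p = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
ln_i = np.array([1.0, 1.2, 1.4, 1.7, 1.8, 2.0])   # moves closely with ln_p
ln_q = alpha + beta1 * ln_p + beta2 * ln_i

# Step 1 (assumed to have been done on other data): extraneous estimate
beta2_hat = 0.8

# Step 2: regress (ln Q - beta2_hat * ln I) on a constant and ln P
adjusted = ln_q - beta2_hat * ln_i
Z = np.column_stack([np.ones_like(ln_p), ln_p])
(alpha_hat, beta1_hat), *_ = np.linalg.lstsq(Z, adjusted, rcond=None)

print("alpha_hat =", alpha_hat, " beta1_hat =", beta1_hat)
```

Because income no longer appears as a regressor in the second step, its collinearity with price does not affect the estimate of the price coefficient; the quality of the result rests on the extraneous estimate.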
One difficulty is interpretation: the procedure assumes that the income elasticity does not change over time. However,
$\hat{\beta}$
$$Y_2 = \hat{\beta}_1 X_1 + \hat{\alpha}_2 Z$$
$X_1$ and $Z$ are not highly correlated and we get a good estimate $\hat{\beta}_1$. Then we regress
$(Y_1 - \hat{\beta}_1 X_1)$ on $X_2$ to get an estimate of the coefficient on $X_2$.
$X_1 + \beta_2 X_2 + \beta_3 X_3$
As there are three variables, we have seven possible equations to estimate (the number of equations is
$2^k - 1$, where $k$ is the number of regressors). In some cases, there may be one or more