
FOREIGN TRADE UNIVERSITY
FACULTY OF INTERNATIONAL ECONOMICS
-----------------o0o-----------------

ECONOMETRIC REPORT

Topic: Factors that determine housing prices

Class: 57 JIB
Group No.: 29
Student Name – ID: Nguyen Hoang Dang 1815520160
Supervisor: Dr Tu Thuy Anh, Dr Chu Thi Mai Phuong

Hanoi, 2018


Group 7 Econometric Report

X. Conclusion
XI. References

Exhibit 1: Definition of variables in the Housing Price model
Exhibit 2: Statistical indicators of variables in the Housing Price model
Exhibit 3: Correlation matrix
Exhibit 4: Scatterplot of variables in the Housing Price model
Exhibit 5: Regression model
Exhibit 6: Multicollinearity test
Exhibit 7: Heteroskedasticity test
Exhibit 8: Residual-versus-fitted plot of the Housing Price model
Exhibit 9: Correcting heteroskedasticity
Exhibit 10: Hypothesis testing of multiple regression model of neighborhood factors
Exhibit 11: Hypothesis testing of multiple regression model of accessibility factors

I. Introduction
Just as economics is a science that shapes social development in general and national growth
in particular, econometrics is the use of statistical techniques to understand economic issues
and test theories. Without evidence, economic theories are abstract and might have no bearing
on reality (even if they are completely rigorous). Econometrics is a set of tools we can use
to confront theory with real-world data.
Given the data set, our group, which includes three members (Nguyen Ha Trang, Nguyen Mai
Thuy Tien, and Nguyen Thi Lan Huong), follows an econometric methodology comprising eight
steps to analyze the data. Note that because little information accompanies the data set, all
interpretations of abbreviations and other details are based on assumptions and our own
research. As a result,


III. Economic model
As the data are provided up front, the economic model used in this report is an empirical one.
The underlying model is mathematical; in an empirical model, however, data are gathered for
the variables and accepted statistical techniques are used to estimate the model's parameters.
Empirical model discovery and theory evaluation are said to involve five key steps, but given
the limited purpose and resources of this report, this part follows only three of them:
(1) specifying the object for modeling, (2) defining the target for modeling, and (3) embedding
that target in a general unrestricted model.
1. Specifying the object for modeling

$$ price = f(x) \tag{1} $$

As such, this report examines the relationship between housing price, which is the object for
modeling, and each group of related factors: structure, neighborhood, accessibility, and air
pollution.
2. Defining the target for modeling by the choice of the variables to analyze, denoted $x_i$
As mentioned above, four main categories are expected to affect housing prices: structure,
neighborhood, accessibility, and air pollution. Hence, the $x_i$ are chosen to be the variables
that make up these categories. After thorough research, the factors have been narrowed down
to eight significant ones: (structure) number of rooms; (neighborhood) crimes, property tax,
the percentage of people of low status, and student-teacher ratio; (accessibility) distance to
employment centers and accessibility to radial highways; and (air pollution) nitrous oxide.
3. Embedding that target in a general unrestricted model (GUM)


lowstat: percentage of people of low status

IV. Econometric model
To demonstrate the relationship between housing price and the other factors, the regression
function can be constructed as follows:
(PRF): $lprice = \beta_0 + \beta_1 crime + \beta_2 nox + \beta_3 rooms + \beta_4 dist + \beta_5 radial + \beta_6 proptax + \beta_7 stratio + \beta_8 lowstat + u_i$
(SRF): $lprice = \hat{\beta}_0 + \hat{\beta}_1 crime + \hat{\beta}_2 nox + \hat{\beta}_3 rooms + \hat{\beta}_4 dist + \hat{\beta}_5 radial + \hat{\beta}_6 proptax + \hat{\beta}_7 stratio + \hat{\beta}_8 lowstat + \hat{u}_i$
where:
$\beta_0$ is the intercept of the regression model
$\beta_i$ is the slope coefficient of the independent variable $x_i$
$u_i$ is the disturbance of the regression model
$\hat{\beta}_0$ is the estimator of $\beta_0$
$\hat{\beta}_i$ is the estimator of $\beta_i$
$\hat{u}_i$ is the residual (the estimator of $u_i$)
From this model, this report is interested in explaining lprice in terms of each of the eight
independent variables (crime, nox, rooms, dist, radial, proptax, stratio, lowstat).

V. Data collection
1. Data overview
• The data set is secondary data, as it was collected from a given source.
• Data source: Regression Diagnostics: Identifying Influential Data and Sources of Collinearity,
by D.A. Belsley, E. Kuh, and R. Welsch, 1990. New York: Wiley
• Structure of the economic data: cross-sectional data
2. Data description

Variable    Obs    Mean        Std. Dev.    Min
dist        506    3.795751    2.106137     1.13
radial      506    9.549407    8.707259     1
proptax     506    40.82372    16.85371     18.7
stratio     506    18.45929    2.16582      12.6
lowstat     506    12.70148    7.238066     1.73
10.8198
88.976

rooms
1.0000
0.5828
-0.3540
-0.4956

            lprice    crime      nox    rooms     dist   radial  proptax  stratio
lprice      1.0000
crime      -0.5275   1.0000
nox        -0.5088   0.4212
rooms       0.6329  -0.2188  -0.3028   1.0000
dist        0.3420  -0.3799  -0.7702
radial     -0.4810   0.6254   0.6103  -0.2098  -0.4951   1.0000
proptax                       0.6670  -0.2921  -0.5344

- lprice and dist have a weak uphill relationship
- lprice and radial have a moderate downhill relationship
- lprice and proptax have a moderate downhill relationship
- lprice and stratio have a moderate downhill relationship
- lprice and lowstat have a strong downhill relationship
The correlation between each pair of variables can be visualized using the scatter command
in Stata. The result is shown in Exhibit 4.

Exhibit 4: Scatterplot of variables in the Housing Price model


2. Regression run
Having examined the correlations among the variables, we can now run the regression. In Stata,
this is done with the command:
reg lprice crime nox rooms dist radial proptax stratio lowstat

The result is shown in Exhibit 5.
Exhibit 5: Regression model
Number of obs =    506
F(8, 497)     = 204.33
Prob > F      = 0.0000

  Source |        SS     df          MS
---------+------------------------------
   Total | 84.582225    505  .167489554

  lprice |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
---------+------------------------------------------------------------------
   crime |  -.0111825    .0013614    -8.21    0.000    -.0138573   -.0085078
     nox |  -.07545      .0146936    -5.14    0.000    -.1043256   -.0465
   rooms |   .0996545    .0167697     5.94    0.000     .0667061    .1326028
 lowstat |  -.0280384    .0019154   -14.64    0.000    -.0318      -.0242752
   _cons |   11.19507    .20372      54.95    0.000     10.79479    11.59535
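The overall F statistic reported in Exhibit 5 pins down the model's R-squared, because F = (R²/k) / ((1 − R²)/(n − k − 1)). The following is a minimal Python sketch, not part of the original Stata workflow, that inverts this identity using only the figures quoted in the exhibit (n = 506, k = 8, F = 204.33, total SS = 84.582225). The function and variable names are illustrative assumptions.

```python
import math


def r_squared_from_f(f_stat: float, k: int, n: int) -> float:
    """Invert F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) to recover R^2."""
    df_resid = n - k - 1
    return (f_stat * k) / (f_stat * k + df_resid)


# Figures read from Exhibit 5 of the report.
n_obs = 506
n_regressors = 8
f_stat = 204.33
total_ss = 84.582225

r2 = r_squared_from_f(f_stat, n_regressors, n_obs)

# The residual sum of squares and root MSE follow from R^2 and the total SS.
resid_ss = (1.0 - r2) * total_ss
root_mse = math.sqrt(resid_ss / (n_obs - n_regressors - 1))

print(f"R-squared is approximately {r2:.4f}")
print(f"Root MSE is approximately {root_mse:.4f}")
```

Under these inputs the eight regressors jointly explain roughly three quarters of the variation in lprice, which is consistent with the very large F statistic.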

1 0.0112

: When the number of crime committed per capita increases by one, the expected
value of housing price decreases by 1.12%.
2 0.0755

: When nitrous oxide increases by one part per 100 million square, the expected
value of housing price decreases by 7.55%.


- $\hat{\beta}_3 = 0.0997$: When the number of rooms increases by one, the expected housing
price increases by 9.97%, holding the other variables constant.
4 0.0464

• Based on the data collected from the table, the sample regression function is established:
SRF:lprice
11.20.01crime0.08nox0.1rooms0.05dist
0.01radial
0.01proptax

0.03stratio0.03lowstat 

VII. Check multicollinearity and heteroscedasticity
1. Multicollinearity
Multicollinearity is a high degree of correlation among the explanatory variables, which may
make it difficult to separate the effects of the individual regressors: standard errors may
be inflated and t-values depressed. Multicollinearity can be detected by examining the
correlation matrix of the regressors and by carrying out auxiliary regressions among them.
In Stata, the vif command is used, which stands for variance inflation factor.
Exhibit 6 shows the result.
Exhibit 6: Multicollinearity test

Variable
proptax
radial
nox
dist
lowstat
rooms
crime
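The variance inflation factor reported by vif is defined from the auxiliary regressions mentioned above: for regressor j, VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing x_j on the remaining regressors. The Python sketch below shows the calculation. The R² inputs are hypothetical values chosen for illustration, not figures from Exhibit 6, and the threshold of 10 is a common rule of thumb rather than a formal test.

```python
def vif(aux_r_squared: float) -> float:
    """Variance inflation factor from the R^2 of an auxiliary regression
    of one regressor on all the other regressors."""
    if not 0.0 <= aux_r_squared < 1.0:
        raise ValueError("auxiliary R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - aux_r_squared)


# Hypothetical auxiliary R^2 values, for illustration only.
hypothetical_aux_r2 = {"x_a": 0.20, "x_b": 0.75, "x_c": 0.90}

for name, r2 in hypothetical_aux_r2.items():
    value = vif(r2)
    # Rule of thumb: a VIF of 10 or more signals serious multicollinearity.
    flag = "high" if value >= 10.0 else "acceptable"
    print(f"{name}: VIF = {value:.2f} ({flag})")
```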

Heteroskedasticity can be detected informally by plotting the residuals against each of the
regressors, or formally by a test, most commonly White's test. It can sometimes be remedied by
respecifying the model, for example by looking for omitted variables. In Stata, the
imtest, white command is used, where imtest stands for information matrix test.
Exhibit 7 shows the result.
Exhibit 7: Heteroskedasticity test

. imtest, white
White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity
         chi2(44)     =    235.31
         Prob > chi2  =    0.0000

Cameron & Trivedi's decomposition of IM-test

              Source |       chi2     df        p
---------------------+----------------------------
  Heteroskedasticity |     235.31     44   0.0000
               Total |     281.89     53   0.0000

At the 5% significance level, the p-value (0.0000) is below 0.05, so there is enough evidence
to reject the null hypothesis of homoskedasticity and conclude that the model suffers from
heteroskedasticity.
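The 44 degrees of freedom in White's test can be checked by hand. The auxiliary regression behind the test uses the original regressors, their squares, and their pairwise cross-products. The Python sketch below counts those terms, assuming none of them is redundant (for example, no dummy variables whose squares duplicate the levels). The function name is an illustrative assumption.

```python
def white_test_df(k: int) -> int:
    """Number of regressors in White's auxiliary regression:
    k levels + k squares + k*(k-1)/2 cross-products."""
    levels = k
    squares = k
    cross_products = k * (k - 1) // 2
    return levels + squares + cross_products


# The housing price model has eight regressors.
print(white_test_df(8))
```

With eight regressors this gives 8 + 8 + 28 terms, which matches the chi2(44) reported in Exhibit 7.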
Another way to check for heteroskedasticity is to draw the residual-versus-fitted plot, which
can be generated using the rvfplot, yline(0) command in Stata.
The result is shown in Exhibit 8.
Exhibit 8: Residual-versus-fitted plot of the Housing Price model

In a well-fitted model, there should be no pattern to the residuals plotted against the fitted
values - something not true of our model. Ignoring the outliers at the top center of the graph,
we see curvature in the pattern of the residuals, suggesting a violation of the assumption that
price is linear in our independent variables. We might also have seen increasing or decreasing
variation in the residuals, that is, heteroskedasticity.
To address the problem, robust standard errors are used, which relax the assumption that the
errors are identically distributed (homoskedastic). The coefficient estimates are unchanged;
only the standard errors, and hence the t-statistics and confidence intervals, differ. In
Stata, the regression is rerun with the robust option, using the command:
reg lprice crime nox rooms dist radial proptax stratio lowstat, robust

Exhibit 9 shows the result.
Exhibit 9: Correcting heteroskedasticity
Linear regression                      Number of obs =    506
                                       Root MSE      =  .1992

         |               Robust
  lprice |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
---------+------------------------------------------------------------------
   crime |  -.0111825    .0019035    -5.87    0.000
 proptax |  -.00621                  -4.55    0.000    -.0088935   -.0035
 stratio |  -.0413327    .0042322    -9.77    0.000    -.0496478   -.0330175
 lowstat |  -.0280384    .003584     -7.82    0.000    -.0350801
   _cons |   11.19507    .2672806    41.89    0.000     10.66993    11.72021
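The effect of the robust option can be seen by recomputing a t-statistic by hand: the coefficient stays the same, while the standard error in the denominator changes. The Python sketch below does this for crime, using the coefficient and the two standard errors quoted in Exhibits 5 and 9. The helper name is an assumption introduced for illustration.

```python
def t_statistic(coef: float, std_err: float) -> float:
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


# crime: the same point estimate under both sets of standard errors.
crime_coef = -0.0111825
crime_se_ols = 0.0013614      # conventional standard error (Exhibit 5)
crime_se_robust = 0.0019035   # robust standard error (Exhibit 9)

t_ols = t_statistic(crime_coef, crime_se_ols)
t_robust = t_statistic(crime_coef, crime_se_robust)

print(f"OLS t = {t_ols:.2f}, robust t = {t_robust:.2f}")
```

The robust standard error for crime is larger, so the t-statistic shrinks in absolute value, although crime remains clearly significant.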

the model's explanatory power
Alternative Hypothesis: At least one of the independent variables in the subset is useful in
explaining/predicting lprice
which is expressed as:
$H_0: \beta_1 = \beta_6 = \beta_7 = \beta_8 = 0$
$H_1$: at least one $\beta_j \neq 0$ (j = 1, 6, 7, 8)

In Stata, the test statistic F is calculated using the command:
test crime proptax lowstat stratio
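For reference, the textbook form of the joint F statistic for q exclusion restrictions is sketched below. The symbols $RSS_r$ (residual sum of squares of the restricted model, with the tested variables dropped) and $RSS_{ur}$ (unrestricted model) are notation assumed here, not taken from the report. This form is valid under homoskedasticity. When test is run after a regression with the robust option, Stata instead computes a Wald statistic from the robust covariance matrix, so the value need not match this formula.

```latex
F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n-k-1)} \sim F(q,\; n-k-1),
\qquad q = 4,\quad n-k-1 = 506 - 8 - 1 = 497
```

The null hypothesis is rejected at the 5% level when the p-value associated with F is below 0.05.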


The result is shown in Exhibit 10.
Exhibit 10: Hypothesis testing of multiple regression model of neighborhood factors

 ( 1)  crime = 0
 ( 2)  proptax = 0
 ( 3)


The initial assumption is that the subset does not contribute to the model's explanatory power
Alternative Hypothesis: At least one of the independent variables in the subset is useful in
explaining/predicting lprice
which is expressed as:
$H_0: \beta_4 = \beta_5 = 0$
$H_1$: at least one $\beta_j \neq 0$ (j = 4, 5)

In Stata, the test statistic F is calculated using the command:
test dist radial

The result is shown in Exhibit 11.
Exhibit 11: Hypothesis testing of multiple regression model of accessibility factors

 ( 1)  dist = 0
 ( 2)  radial = 0
       F(


X. Conclusion
This report was completed thanks to the dedicated contribution of each member and the
knowledge gained from our study of econometrics. It also gave us a good opportunity to practice
what we have learned and to gain a deeper understanding of data analysis and the relevant tests.
Through this application, we hope that our work sheds some light on the relationship between
housing prices and structural, neighborhood, accessibility, and air pollution factors.
Again, given our limited understanding and resources, the report may contain
misinterpretations. We hope that Dr. Le Thanh Binh and readers can give us constructive
comments on the report so that we can improve and do better in the future.
Sincerely,
Group 7


XI. References
1. D.A. Belsley, E. Kuh, and R. Welsch, Regression Diagnostics: Identifying Influential Data and
Sources of Collinearity, New York: Wiley (1990).


