A study on the validity of the current final English test for the 2nd semester non-English majors at Hanoi University of Industry

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
FACULTY OF POST-GRADUATE STUDIES

NGUYEN VAN BAC

A STUDY ON THE VALIDITY OF THE CURRENT FINAL
ENGLISH TEST FOR THE 2ND SEMESTER NON-ENGLISH
MAJORS AT HANOI UNIVERSITY OF INDUSTRY
(Nghiên cứu về tính xác thực của bài thi tiếng Anh cuối học kỳ thứ
hai hiện nay dành cho sinh viên không chuyên tiếng Anh tại trường
Đại học Công Nghiệp Hà Nội)

M.A. Minor Programme Thesis
Field: Methodology
Code: 60 14 10

Hanoi, 2010

TABLE OF CONTENTS

CANDIDATE’S STATEMENT
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF TABLES AND CHARTS
CHAPTER 1: INTRODUCTION
1.1. RATIONALE
1.2. SCOPE OF STUDY

2.2.3. The current trends in language testing
2.3. QUALITIES OF A GOOD TEST
2.3.1. Reliability
2.3.2. Validity
2.3.3. Practicality
2.4. VALIDITY
2.4.1. Content or face validity
2.4.2. Response validity
2.4.3. Concurrent validity and predictive validity
2.4.4. Construct validity

3.2.2. Interview
3.2.3. Document analysis
3.3. DATA COLLECTION PROCEDURE
CHAPTER 4: FINDINGS AND DISCUSSIONS
4.1. DATA ANALYSIS
4.1.1. Data analysis of students’ and teachers’ survey questionnaires and interviews
4.1.2. Data analysis of students’ scores
4.2. DISCUSSIONS
4.3. SUGGESTIONS FOR IMPROVING THE QUALITY OF THE

Appendix 2A
Appendix 2B
Appendix 3
Appendix 4
Appendix 5
Appendix 6

LIST OF ABBREVIATIONS
1. EC: Economic and Computer Science Students Group
2. ESP: English for Specific Purposes
3. HaUI: Hanoi University of Industry
4. LT

Chart 2: Teachers’ comments on the time allowance of the test
Chart 3: Appropriateness of the final test in students’ opinion
Chart 4: Teachers’ comments on the appropriateness of the test
Chart 5: Test items that best measure students’ true ability, in students’ perception
Chart 6: Teachers’ opinions on the test items that best measure their students’ true ability
Chart 7: Students’ comments on the level of the Grammar and Vocabulary test
Chart 8: Teachers’ comments on the Grammar and Vocabulary test
Chart 9: Students’ comments on the difficulty level of the Reading Comprehension test
Chart 10: Teachers’ comments on the difficulty level of the Reading Comprehension test
Chart 11: Students’ comments on the appropriateness of the Writing test
Chart 12: Teachers’ comments on the Writing test
Chart 13: Students’ comments on the Listening Comprehension test
Chart 14: Teachers’ comments on the construct of the Listening Comprehension test

that test writers often take the test items from other sources rather than basing them on the course
book and the syllabus given at the beginning of the course.

Another reason for choosing this topic is that test evaluation and assessment at Hanoi
University of Industry appear not to receive proper attention. As a teacher of English, I
have been involved in designing many kinds of tests for non-English major students at
HaUI, but there have been no formal discussions, no systematic and comprehensive
assessments, and no research on the appropriateness of the tests.
For the reasons mentioned above, I have decided to choose the research topic: “A study on
the validity of the current final English test for the 2nd semester non-English Majors at
Hanoi University of Industry.” It is hoped that this study will be helpful for teachers in the
English Faculty of Hanoi University of Industry, who often participate in designing the
progress tests and final achievement exams.

1.2. SCOPE OF STUDY
This minor thesis is limited to examining the validity of the current final test for second
semester non-English major students at Hanoi University of Industry.
Owing to time constraints, the author cannot send the questionnaires to all non-English
major students of Hanoi University of Industry. However, to obtain a broad view of the
validity of the final test from the teachers and students of Hanoi University of Industry,
the author gives the questionnaires to students of five faculties: the Economic Faculty,
Chemistry Technology Faculty, Electronic Technology Faculty, Mechanical Technology
Faculty, and Electrical Technology Faculty. The students questioned are all university
students; college students are not covered. Nor can the author survey and interview all the
teachers of the English Department; instead, he selects experienced teachers who are
regularly involved in designing tests for non-English major students and those who are
currently teaching first year students in their 2

1. Does the final test for non-English major students in the 2nd semester give a true picture
of the students’ English competence in the view of teachers and students?
2. Does the test measure what it purports to measure (i.e. is it valid)?
3. How can the test be made valid? In what ways should the current final test be improved?

1.6. DESIGN OF STUDY
The study consists of five chapters, organized as follows:
Chapter 1 - Introduction - provides the background to the study, identifies the problems,
and states the aim, purpose and significance of the study, the scope, the methods, the
research questions and the design of the study.
Chapter 2 - Literature review - presents a review of related literature that provides the
theoretical background to testing and evaluation in general and test validity in particular.
This review also provides an overview of other studies related to testing and evaluation,
especially the evaluation of tests in terms of their validity.
Chapter 3 - The Study - provides information about the subjects of the study. It then
describes the data collection instruments and the data collection procedure. The rationale
for choosing these data collection instruments is also provided.
Chapter 4 - Findings and Discussions - analyses and discusses the data collected to reveal
the results and the validity of the final 2nd semester exam for non-English major students
of HaUI. The causes of any problems found and some implications for effective final
achievement tests are also discussed.
Chapter 5 - Conclusion - summarizes the major findings, which are intended to point to an
appropriate way to enhance the validity of final achievement tests for non-English majors.
Limitations of the study and suggestions for further research are also given in this chapter.

relation. Teaching and learning provide a rich source of language material for tests and, in
turn, testing reinforces, improves and encourages the teaching and learning process. In
Heaton’s words, “teaching and testing are so closely related that it is virtually impossible
to work in either field without being constantly concerned with the other.” (Heaton, 1988: 5).

2.2. LANGUAGE TESTING
2.2.1. Purpose of language testing
Shohamy (1985: 6) made a distinction between classroom tests and external tests.
Classroom tests are written and administered by teachers, while external tests are designed
and administered by an external agency. The purposes of classroom tests are to find out
whether what was taught in the program was successfully acquired; to evaluate and
improve instruction; to obtain information on students’ progress and language knowledge;
to help organize learning and teaching materials; to provide information for grades; to help
diagnose students’ strengths and weaknesses in the language; and to motivate students to learn.
External tests, however, evaluate proficiency; decide whether to accept students to a
certain program; provide information for administrative decisions, such as special
treatment for certain groups; assist in selection and grouping; help evaluate the curriculum;
serve research purposes; and obtain information for grading.
Henning shares some of Shohamy’s ideas but explains the purposes of language testing in
a different way. According to him, language tests serve diagnosis and feedback, screening
and selection, placement, program evaluation, the provision of research criteria, and the
assessment of attitudes and sociopsychological differences (Henning, 1987, pp. 1-4).
He states that the most common aim of language tests is to find out strengths and
weaknesses in students’ learning ability. In this sense, diagnostic tests provide critical
information to the student, the teacher and the administrator that should make the
learning process more efficient.
Language tests can also be used to decide whether students should be allowed to
participate in a particular program of instruction. To make fair selections and decisions, the

A proficiency test aims at assessing the student’s ability to apply in actual situations what
he or she has learnt. The test is not usually related to any particular course because it is
concerned with the student’s current standing in relation to his or her future needs.
The following is the summary of types of language test established by Harrison:

Category: Placement
    Content: General reference forward to future learning
    Purpose: Grouping
    Considerations: Speed of results; variety of tests; interview

Category: Diagnostic
    Content: Detailed reference back to class work
    Purpose: Motivation; remedial work
    Considerations: Short term objectives; new examples of the materials taught

domain-referenced vs. norm-referenced or standardized tests, speed tests vs. power tests and
other test categories (Henning, 1987, pp. 4-9).
Objective and subjective tests are distinguished by the manner in which they are scored.
An objective test may be scored by comparing examinee responses with an established set
of acceptable responses or a scoring key. The multiple choice test is an example of this
kind. A subjective test, on the other hand, is scored by opinionated judgment based on the
insight and expertise of the scorer. Examples of this type are free composition, or cloze
tests that permit all grammatically acceptable responses to systematic deletions from a
context.
Direct tests are said to test language performance directly, whereas indirect tests tap true
language performance indirectly. Direct tests usually take the form of spoken tests, which
rate language use in real communication situations. Indirect tests usually take the form of
written tests such as multiple choice or cloze tests.
Discrete point tests, as a variety of diagnostic tests, are designed to measure knowledge or
performance in very restricted areas of the target language. Integrative tests, on the other
hand, are used to assess a greater variety of language abilities.
Aptitude tests are usually used to measure the suitability of a candidate for a specific
program of instruction or a particular kind of employment. Achievement tests are used to
measure the extent of what students have already learnt. Proficiency tests are most often
global measures of ability in a language or other content area.
For criterion- or domain-referenced tests, the instruction is designed after the tests are
created. The tests must match teaching objectives perfectly, and they are useful when
objectives are under constant revision. Such tests are also useful with small and/or unique
groups for whom norms are not available. Norm-referenced or standardized tests, on the
other hand, must have been administered to a large number of examinees from the target
population. Acceptable standards of achievement can only be found by reference to the
mean or average score.
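The norm-referenced interpretation described above can be illustrated with a minimal Python sketch. It is an illustration only, not a procedure used in this study: the scores are invented, and the function name `standardize` is hypothetical. Each raw score is expressed as its distance from the mean of the norm group, measured in standard deviation units.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Return z-scores: each score's distance from the group mean in SD units."""
    group_mean = mean(scores)
    group_sd = pstdev(scores)  # population SD of the norm group
    if group_sd == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(score - group_mean) / group_sd for score in scores]


# Invented norm-group scores on a 100-point test (illustration only).
norm_group = [40, 50, 60, 70, 80]
z_scores = standardize(norm_group)
```

A score equal to the group mean receives a z-score of zero, so a candidate's standing is read relative to the group rather than against a fixed criterion.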
A speed test is a test in which the items are easy but the time allowed seems insufficient. In

2.3.1. Reliability
A test is reliable if it consistently provides accurate measures of abilities at all times, with
different students and/or different testers. According to Harrison (1991: 10), “the
reliability of the test is its consistency.” He stresses that a student’s score should be the
same, or nearly the same, whether the test taker takes one version of the test or another and
whether the test is marked by one person or another, and that a test should measure the
same thing all the time. “There are three aspects to reliability: the circumstances in which
the test is taken, the way in which it is marked and the uniformity of the assessment it
makes.” (Harrison, 1991: 11).
Henning (1987: 74) states that “Reliability is thus a measure of accuracy, consistency,
dependability, or fairness of scores resulting from administration of a particular
examination.” He adds that, since reliability is concerned with accuracy of measurement,
reliability increases when the error of measurement is minimized. Therefore, we should
attend to the amount of error present in our measurement so that reliability can be
quantified.
The term reliability, according to Bachman and Palmer (1996), refers to consistency of
measurement. More specifically, they say that a reliable test score is consistent across
different characteristics of the testing situation. Moreover, if test scores are inconsistent,
they provide no information about the ability being measured. Because it is impossible to
eliminate inconsistencies entirely, we try to reduce variations in the test’s task features.
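One common way of quantifying reliability is an internal consistency coefficient. The sources cited above do not prescribe a particular coefficient here, so the sketch below uses Cronbach's alpha as an assumed example. The item scores are invented and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score matrix: one row per test taker, one column per item."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across test takers.
    item_columns = list(zip(*item_scores))
    sum_item_variances = sum(pvariance(column) for column in item_columns)
    # Variance of the total score across test takers.
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_variance)


# Invented responses: 4 test takers x 3 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items rank test takers more consistently; with real test data the matrix would hold one row per student and one column per test item.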

2.3.2. Validity
A test is valid if it measures what it is intended to measure. According to Bachman and
Palmer (1996), the term construct validity refers to the extent to which a given test score
can be interpreted as an indicator of the abilities or constructs that we want to measure.
However, no test is entirely valid because validation is an ongoing process (Weir, 2005).

2.4.1. Content or face validity
Testing specialists commonly treat content validity and face validity as synonyms
(Magnusson, 1967). Others, however, distinguish between them and hold that face
validity, unlike content validity, is often determined impressionistically.
Content or face validity is intuitive and logical but usually lacks an empirical basis. As the
name suggests, it concerns whether the content of the test is sufficiently representative and
comprehensive for the test to be a valid measure of what it is supposed to measure.
The test content must be selective. For example, the content of an achievement test should
be bound to the content of instruction, which in turn is constrained by the instructional
objectives.
According to Bachman (1990), content validity has two aspects: content relevance and
content coverage. Content relevance requires the specification of the behavioral domain in
question and the attendant specification of the task or test domain. Content coverage is the
extent to which the tasks required in the test adequately represent the behavioral domain in
question. Demonstrating that a test is relevant to and covers a given area of content or
ability is therefore a necessary part of validation.

2.4.2. Response Validity
Response validity refers to the extent to which examinees respond in the manner expected
by the test developer. It concerns both the way test takers respond and the instructions of
the test. For example, if the test takers respond in a haphazard or unreflective manner,
their obtained scores may not represent their actual ability. Moreover, if the instructions of
the test are unclear or the test format is unfamiliar to the examinees, their responses may
not reflect their true ability. In both cases the test may be said to lack response validity.

Construct validity concerns the extent to which performance on tests is consistent with
predictions that we make on the basis of a theory of abilities, or constructs (Bachman
1990: 255). To investigate construct validity, it is necessary to examine patterns of
correlations among item scores and test scores, and between characteristics of items and
tests and scores on items and tests; to analyze and model the processes underlying test
performance; to study group differences; to study changes over time; or to investigate the
effects of experimental treatment (Messick 1989).
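The first of these procedures, examining correlations between item scores and test scores, can be sketched in Python as follows. This is an illustrative sketch with invented data, not the analysis carried out in this study; the helper names `pearson` and `item_rest_correlations` are hypothetical. Each item is correlated with the total of the remaining items, so that an item is not correlated with itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariance / sqrt(spread_x * spread_y)


def item_rest_correlations(item_scores):
    """Correlate each item with the total score on all the other items."""
    correlations = []
    for j in range(len(item_scores[0])):
        item = [row[j] for row in item_scores]
        rest = [sum(row) - row[j] for row in item_scores]
        correlations.append(pearson(item, rest))
    return correlations


# Invented responses: 4 test takers x 3 items, 1 = correct, 0 = incorrect.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
item_correlations = item_rest_correlations(answers)
```

An item whose correlation with the rest of the test is low or negative may be measuring something other than the intended construct and would be a candidate for closer review.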
CHAPTER 3: THE STUDY

3.1. THE SUBJECT AND THE CONTEXT OF ENGLISH TEACHING AND
LEARNING AT HAUI
3.1.1. English teaching and learning context at HaUI
The English Faculty is one of the biggest faculties of Hanoi University of Industry. There
are more than 150 teachers of English, who are divided into three divisions. One division
is in charge of teaching English to English major students, another is in charge of teaching
English to secondary and vocational students, and the biggest one teaches English to all
college and university non-English major students. All students of Hanoi University of
Industry study English as their foreign language.
According to the objectives given in the syllabus, the teaching aims of the English course
for non-English major students in the second semester are stated as follows:
In general, it helps consolidate the knowledge and skills students have acquired at the
elementary level (the 1st term), and raises students’ General English level to
pre-intermediate level.
In detail, it aims to provide students with knowledge of vocabulary, grammar,

1     Unit 1: Getting to know you!              4    4
2     Unit 2: The way we live                   4    4
3     Unit 3: It all went wrong                 4    4
4     Unit 4: Let’s go shopping!                4    4
5     Stop and check 1 + Progress test 1        1    1

11    Unit 9: Going places                      4    4
12    Unit 10: Scared to death                  4    4
13    Unit 11: Things that changed the world    4    4
14    Unit 12: Dreams and reality               4    4
15    Stop and check 3 + Progress test 2        1    1

intermediate (2000) by Liz and John Soars. In addition, students are advised to use a
reference book, English Grammar in Use by R. Murphy. In the teaching process, teachers
also use other materials to present and recycle the basic structures of English and to
develop students’ proficiency in using these structures in certain contexts. The focus is
also placed on reinforcing and improving students’ knowledge of vocabulary and their
ability to communicate.
The major teaching points of the course book for the second semester are presented in
Appendix 5.

3.1.2. English Testing for non English majors at HaUI
In each semester, students are required to take at least three progress tests and one final
achievement test. During my teaching at HaUI, I have observed that testing is not a main
concern of teachers. Testing has not been given proper attention or carefully studied in
terms of its validity, reliability, format and practicality.
Within the scope of this thesis, the study focuses on investigating the validity of the final
achievement English test (for the second semester) for non-English major students who
have been learning English for 120 class hours covering all 14 units of New Headway
Pre-intermediate. Below is the format of the test set for second semester non-English
majors, named Test 2, or the final achievement test.
Test 2 has a time allowance of 60 minutes and a total score of 100 points, and consists of
the following parts:
Section A (20 points): Grammar and Vocabulary. This section includes 20 multiple choice
questions and is marked out of 20 points.
Section B (20 points): Reading comprehension. This section contains 2 short reading
passages with 10 multiple choice questions.
Section C (20 points): Listening. In the listening section, students are required to listen to
several short conversations or short talks and then answer the questions. There are 10
multiple choice questions and 10 true/false questions, with 1 point for each correct answer.

Economics and Computer Science (75 students, hereinafter referred to as EC) and the
second one includes other students (75 students, hereinafter referred to as OM)
participated in this survey.

3.1.3.2. Teachers
The English Department is one of the biggest departments of HaUI in terms of staff
numbers. There are more than 140 teachers of English, who are in charge of teaching
English to almost all students of HaUI, including vocational students, college students and
university students. In this study, 15 teachers of English at HaUI are selected. They all
