Verbs in the Written English of Chinese Learners:
A Corpus-based Comparison
between Non-native Speakers and Native Speakers

by
Xiaotian Guo
A thesis submitted to the University of Birmingham
for the degree of DOCTOR OF PHILOSOPHY

Supervisor: Professor Susan Hunston

The Department of English
The School of Humanities
The University of Birmingham
October 2006

University of Birmingham Research Archive

individual forms of verbs, and the findings suggest that there is less homogeneity in the learner
English than in the NS English. Chapter Six extends the research to verb–noun relationships in
the learner English and the NS English and the result shows that the learners prioritise verbs
over nouns. Chapter Seven studies the learners’ preferences in using the patterns of KEEP
compared with those of the NSs, and finds that the learners have various problems in using
this simple verb. In this chapter, too, my reservations about the traditional use of ‘overuse’
and ‘underuse’ are expressed and a finer classification system is suggested. Chapter Eight
compares another frequently occurring verb, TAKE, in terms of its collocates and yields a
similar finding: the learners have problems even with such simple vocabulary. In Chapter
Nine, the research findings from Chapter Four to Chapter Eight are revisited and discussed in
relation to the theme of the thesis. The concluding chapter, Chapter Ten, summarises the
previous chapters and envisages how learner language studies will develop in the coming few
years.
Acknowledgements

First and foremost, I would like to thank my supervisor Professor Susan Hunston. She spent a
large amount of time on my thesis and guided me from the design of the research to the last
version of each chapter. As an experienced supervisor and teacher, she knows very well when
to leave me free to explore for something useful and when to bring my attention back to things
of value. She hardly ever tells me what to do, but offers suggestions, comments, and clues for
further development, leaving me enough time to reflect and digest. Undoubtedly, the
knowledge I obtained from her supervision will be the most valuable asset for my academic
career.

Secondly, my thanks should go to my beloved wife, Xiaorong (Wang). Actually, she sacrificed
so much for my PhD study that I can hardly find appropriate words to express my gratitude.

and his family for their encouragement and support. My special thanks go to my daughter
who accompanied me through the ups and downs of the years, especially when my wife had
to work in another place. She also helped me with the proofreading of the Chinese pinyin
(any remaining errors are mine, of course).

Furthermore, thanks are overdue to the Great Britain-China Education Trust and Sino-British
Fellowship Trust for the £1000 fellowship which was sent to me on the very day of the
Chinese Spring Festival of 2003. It was the only funding I gained throughout my PhD study.
Even though such an amount was far from liberating me from the financial strains, the very
act of providing such a grant justified my study and greatly encouraged me to go through the
rest of the difficulties. It meant a lot to me.

Last but not least, I must thank the University of Birmingham, especially the staff members of
the Department of English, the School of Humanities, the Information Service, the Academic
Office and the International Office for their unfailing and patient support.
Table of Contents
INTRODUCTION
1.1 THE THEME AND AIM OF THE RESEARCH
1.2 INTRODUCING COMPUTER LEARNER CORPUS RESEARCH

2.1.1 Error analysis recalled
2.1.2 Second language acquisition reviewed
2.1.3 Conclusion
2.2 COMPUTER LEARNER CORPORA: A NEW ERA
2.2.1 The International Corpus of Learner English
2.2.2 The Longman Learners’ Corpus
2.2.3 The Hong Kong University of Science and Technology Learner Corpus
2.2.4 The Chinese Learner English Corpus
2.2.5 Computer learner English studies as a ‘newborn baby’ of applied linguistics
2.3 TYPOLOGY OF CLC DATA
2.6.2 Quantitative plus qualitative: approaching CLC data
2.7 LEARNER ENGLISH FEATURES
2.7.1 The informal and speechlike features of written learner English
2.7.2 Small vocabulary range, overuse of general vocabulary and the ‘teddy bear principle’
2.7.3 More open-choice-principled than idiom-principled
2.7.4 Proficiency level and fossilised errors
2.7.5 The essential role of L1 in L2 production
2.7.6 A narrower range of senses in the use of vocabulary
2.8 APPLICATIONS OF RESEARCH RESULTS
2.8.1 TeleNex
2.8.2 CALL Tools

CHAPTER THREE
THE DATA AND THE TOOLS
3.1 INTRODUCTION
3.2 THE DATA
3.2.1 The Learner Corpus – COLEC
3.2.2 The Native Speaker Corpus – LOCNESS
3.2.3 The back-up resources
3.2.3.1 The Bank of English
3.2.3.2 The Google search engine
3.3 THE WORD […]

4.2.3 The difficulties in making a verb lemma list
4.2.4 Two approaches to making a verb list
4.3 MAKING TWO VERB LEMMA LISTS
4.3.1 The lemma list archetype
4.3.2 Tagging the corpora
4.3.3 Editing the raw verb lemma lists
4.3.3.1 Dealing with small-frequency lemmas
4.3.3.2 Detecting wrongly used lemmas
4.4 MAKING SENSE OF THE TWO VERB LEMMA LISTS
4.4.1 A rational study
4.4.1.1 Some explorations in semantic theory applications in vocabulary teaching
4.4.1.2 Some pioneering work concerning the presentation of vocabulary to learners
4.4.1.3 Some explorations in verb classification based on syntactic constructions

CHAPTER FIVE
VERBS IN DIFFERENT FORMS COMPARED
5.1 INTRODUCTION
5.2 A GENERAL VIEW OF THE TOTAL FREQUENCY OF THE DIFFERENT FORMS OF VERBS
5.3 THE TOP 20 VERBS IN THEIR DIFFERENT FORMS IN LOCNESS AND COLEC
5.3.1 The top 20 verbs in their different forms in LOCNESS
5.3.2 The top 20 verbs in their different forms in COLEC
5.4 THE DIFFERENT FORMS OF THE TOP 20 VERBS COMPARED
5.5.7 Some remarks in summary
5.6 SOME PEDAGOGICAL IMPLICATIONS
5.6.1 Significance for the writer of teaching materials
5.6.2 Significance for the teacher and the learner
5.6.3 Significance for learner English level evaluation
5.6.4 Implications for further corpus design, construction and comparison
5.6.5 Some problems revealed concerning CLC studies
5.7 CONCLUSION

CHAPTER SIX
BETWEEN VERBS AND NOUNS
6.1 INTRODUCTION

7.1 INTRODUCTION
7.2 INTRODUCING THE RATIO RELATIONSHIPS BETWEEN THE TWO CORPORA
7.3 DEFINING ‘PATTERN’ AND ‘PHRASE’
7.4 LOOKING AT THE PATTERNS OF KEEP IN COLEC AND LOCNESS
7.4.1 Interpreting the frequency relationships between COLEC and LOCNESS
7.4.1.1 A large frequency in COLEC vs. a large frequency in LOCNESS
7.5.3 Providing information for learner English gradation
7.6 CONCLUSION

CHAPTER EIGHT
USING COLLOCATES TO INTERPRET LEARNER ENGLISH
8.1 INTRODUCTION
8.2 SOME THEORETICAL UNDERPINNINGS
8.3 TWO RECENT STUDIES OF LEARNER ENGLISH IN COLLOCATION
8.4 MAKING A TABLE OF COLLOCATES FROM THE TWO CORPORA
8.5 A DETAILED LOOK AT SOME LARGE […]

CHAPTER NINE
DISCUSSIONS
9.1 INTRODUCTION
9.2 THE METHODOLOGY OF THIS RESEARCH REVIEWED
9.2.1 The quantitative approach and the qualitative approach in corpus studies
9.2.2 My research methodology
9.2.3 Identifying the similarities and disparities between the NNS English and the NS English
9.3 THE FUNCTIONS OF A NNS VS […]
9.5.2 A systematic study of all POS words
9.5.3 A study of a learner translation corpus
9.5.4 A study of learner spoken English
9.6 CONCLUSION

CHAPTER TEN
CONCLUSION
10.1 A SUMMARY OF THE RESEARCH
10.2 SOME LIMITATIONS OF THE RESEARCH
10.3 THE NEXT FEW YEARS OF LEARNER CORPUS STUDIES ENVISAGED
10.4 FINAL REMARKS

LIST OF REFERENCES

APPENDIX 7: THE CONCORDANCES OF ‘V UP’ IN LOCNESS

List of Tables

TABLE 2.1 A SAMPLE OF SOME STUDIES WHICH HAVE NO COMPARABILITY BETWEEN EACH OTHER
TABLE 3.1 COMPARISON OF SOME PARAMETERS OF COLEC AND LOCNESS (COMP = […]
TABLE 4.4 A CATEGORISATION OF THE VERB LEMMA LISTS BY NEIGHBOURING GROUPS (1)
TABLE 4.5 A CATEGORISATION OF THE VERB LEMMA LISTS BY NEIGHBOURING GROUPS (2)
TABLE 4.6 A CATEGORISATION OF THE VERB LEMMA LISTS BY NEAR ANTONYMOUS GROUPS
TABLE 4.7 […]

TABLE 5.1 THE RAW FREQUENCY AND THE PERCENTAGE OF EACH FORM OF VERBS IN COLEC
TABLE 5.2 THE RAW FREQUENCY AND THE PERCENTAGE OF EACH FORM OF VERBS IN LOCNESS
TABLE 5.3 THE DISTRIBUTION OF THE TOP 20 VERBS IN THEIR DIFFERENT FORMS IN LOCNESS
T[…]) IN LOCNESS AND COLEC
TABLE 5.7 THE TOP 20 THIRD PERSON SINGULAR FORMS (V-S) IN LOCNESS AND COLEC
TABLE 5.8 THE TOP 20 V-ING FORMS IN LOCNESS AND COLEC

TABLE 5.12 A SUMMARY OF THE VERB FORMS THAT ARE NOT SHARED BY THE COLEC WRITERS IN THE TOP 20 VERBS
TABLE 5.13 A SAMPLE OF A MATCHED LIST OF V-N FORMS IN COLEC AND LOCNESS
TABLE 5.14 ALL THE V-[…] 4)
TABLE 5.17 ALL THE V-ING FORMS OCCURRING ONLY IN LOCNESS (FREQUENCY ≥ 4)
TABLE 5.18 ALL THE V-ED FORMS OCCURRING ONLY IN LOCNESS (FREQUENCY ≥ 4)
TABLE 5.19 A […] OF COLEC AND LOCNESS
TABLE 5.22 THE FIRST 20 VERB FORMS THAT ONLY OCCUR IN LOCNESS (FREQUENCY ≥ 4)
TABLE 5.23 A SUMMARY OF THE VERB FORMS THAT OCCUR ONLY IN LOCNESS (FREQUENCY ≥ 4)

TABLE 6.1 T[…]/NOUN)
TABLE 6.4 THE TOP TEN NORBS THAT ARE MAINLY USED AS NOUNS IN COLEC (RATIO = NOUN/V-TOTAL)
TABLE 6.5 THE TOTAL FREQUENCY OF VERBS IN TOTAL AND NOUNS IN COLEC AND LOCNESS
TABLE 6.6 THE TOTAL FREQUENCY OF VERB USE AND NOUN USE OF […]
TABLE 6.9 THE VERB FORMS AND NOUN FORMS OF 25 V-N PAIRS
TABLE 6.10 THE FREQUENCIES OF 25 VERBS AND THEIR EQUIVALENT NOUNS IN COLEC AND LOCNESS
TABLE 6.11 THE TOTAL FREQUENCIES OF VERB USE AND NOUN USE OF THE 25 V-N PAIRS AND THEIR RATIOS IN COLEC AND LOCNESS
T[…] PREPOSITIONAL PHRASE STRUCTURE (IN + NOUN + OF)
TABLE 6.15 THE TOTAL FREQUENCIES OF VERB USE AND NOUN USE IN PREPOSITIONAL PHRASES OF 15 V-N PAIRS AND THEIR RATIOS IN COLEC AND LOCNESS

TABLE 7.1 THE FREQUENCIES OF KEEP IN ITS PATTERNS AND PHRASES
TABLE 7.2 THE MAJORITY OF THE NOUNS IN THE PATTERN […]
[…]OMPARATIVE FREQUENCIES OF CONTINUE AND MAINTAIN IN COLEC AND LOCNESS
TABLE 7.6 SOME EXAMPLES OF USING DIFFERENT PATTERNS TO MEAN THE SAME THING
TABLE 8.1 A TABLE OF COLLOCATES OF TAKE IN LOCNESS AND COLEC
TABLE 8.2 SOME FIGURES OF THREE VARIETIES OF THE COLLOCATE TAKE ACTION […]
TABLE 9.4 SOME EXAMPLES OF THE CORRECT USE AND INCORRECT USE OF KEEP IN TOUCH WITH IN COLEC

List of Figures

FIGURE 3.1 A SCREENSHOT OF THE PATTERN OF TAKE (FROM LOCNESS) BY WORDS[…]
[…] A SCREENSHOT OF THE CONCORDANCE SETTINGS BOX OF WORDSMITH
FIGURE 4.1 DIFFERENT FORMS OF TAKE TAGGED BY CLAWS7
FIGURE 4.2 CHANNELL’S COMPONENTIAL ANALYSIS OF SURPRISE, ASTONISH, AMAZE, […]
[…] BY GODMAN (1982: 47)
FIGURE 4.5 A SEMANTIC FIELD CHART OF THE GROUP HEADED BY BREAK BY GODMAN (1982: 49)
FIGURE 4.6 THE VERBS AND PHRASES THAT SHARE THE ‘V THAT CLAUSE’ STRUCTURE BY FRANCIS ET AL […]
[…] LOCNESS IN TABLE 4.6
FIGURE 4.10 THE VERB LEMMAS THAT OCCUR ONLY IN LOCNESS IN TABLE 4.7
FIGURE 4.11 THE VERB LEMMAS THAT ONLY OCCUR IN LOCNESS IN TABLE 4.8

FIGURE 5.2 THE VERBS THAT ARE ONLY FOUND IN LOCNESS IN THE TOP 20 V-E WORD FORMS
FIGURE 5.3 THE VERBS THAT ARE ONLY FOUND IN LOCNESS IN THE TOP 20 V-S WORD FORMS
FIGURE 5.4 THE VERBS THAT ARE ONLY FOUND IN […]
[…]N FORMS IN LOCNESS AND COLEC
FIGURE 5.7 SOME OF THE LINES OF THINKS FROM COLEC
FIGURE 6.1 THE CONCORDANCES OF IN SEARCH OF FROM LOCNESS
FIGURE 7.1 ALL THE CORRECTLY USED CASES OF ‘KEEP UP WITH N’ IN COLEC

F[…]
FIGURE 8.4 ALL THE CONCORDANCES OF THE COLLOCATE TAKE ACTION IN LOCNESS
FIGURE 8.5 ALL THE CONCORDANCES OF TAKE ACTION IN COLEC
FIGURE 8.6 SENSE ONE: DECIDE TO DO STH; UNDERTAKE STH
F[…]ENSE FOUR: EMPLOY SB; ENGAGE SB
FIGURE 8.10 SENSE ONE: DECIDE TO DO STH; UNDERTAKE STH
FIGURE 8.11 SENSE TWO: BEGIN TO HAVE ([…]
[…]OME EXAMPLES OF “TAKE A CLASS/CLASSES” FROM LOCNESS
FIGURE 8.15 ALL THE CONCORDANCES OF THE COLLOCATE TAKE … SERIOUSLY AND ITS VARIETIES IN LOCNESS
FIGURE 8.16 TWENTY EXAMPLES OF THE COLLOCATE CHANGE TAKE PLACE FROM THE BoE

FIGURE 9.4 THE CONCORDANCES OF THE VERB DEEM IN LOCNESS
FIGURE 9.5 THE CONCORDANCES OF THE VERB (LEMMA) COMPARE IN LOCNESS
FIGURE 9.6 THE CONCORDANCES OF THE NOUN COMPARISON (BOTH SINGULAR AND PLURAL […]

BoE The Bank of English
BNC The British National Corpus
CA Contrastive Analysis
CCED Collins Cobuild English Dictionary
CIA Contrastive Interlanguage Analysis
CLC Computer Learner Corpus
CLEC The Chinese Learner English Corpus
COLEC The Chinese College Learner English Corpus
DDL Data-Driven Learning
EA Error Analysis
EL English language
ELT English language teaching
GSL A General Service List of English Words
ICLE The International Corpus of Learner English
IL interlanguage
KWIC key word in context
L1 first language
L2 second language
LEA The Longman Essential Activator
LLC The Longman Learners’ Corpus
LOCNESS Louvain Corpus of Native English Essays
NL native language
NNS non-native speaker
NS native speaker
POS part of speech
SL second language
SLA Second Language Acquisition
TL target language
Unlike previous approaches to learner language such as contrastive analysis (CA) and error
analysis (EA), which will be reported in Section 1.3 of this chapter, this new approach to
learner language study treats learner language as an entity in its own right. As Leech (1998:
xvii) insightfully summarises:
“It enables us to investigate the non-native speaking learners’ language (in relation to the native
speakers’) not only from a negative point of view (what did the learner get wrong?) but from a
positive one (what did the learner get right?). For the first time it also allows a systematic and
detailed study of the learners’ linguistic behaviour from the point of view of ‘overuse’ (what
linguistic features does the learner use more than a native speaker?) and ‘underuse’ (what features
does the learner use less than a native speaker?)”.
Apart from this, the new approach allows us to see the similarity and disparity between
learner English and NS English when the learner English data and the NS English data are
compared. On the whole, similarity points to, though it does not necessarily lead to, a degree
of mastery by the learners, while disparity points to, but does not necessarily lead to, a kind of
non-mastery by them. The features which are used by the NSs, but not by the learners, would
be necessary for the learners to acquire if they wish to achieve the naturalness and
‘nativeness’ of the NS English (if the influence of the difference in topics between the two
corpora is ignored for the moment).
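
The comparison behind ‘overuse’ and ‘underuse’ can be illustrated with a minimal sketch. It assumes two tokenised corpora and compares frequencies normalised per 10,000 tokens, so that corpora of different sizes can be set side by side. The function names, the normalisation base and the toy sentences are assumptions made for illustration; this is not the procedure used in this thesis, and no significance test is applied.

```python
from collections import Counter


def per_10k(count, corpus_size):
    """Normalise a raw count to occurrences per 10,000 tokens."""
    return count * 10_000 / corpus_size


def compare_usage(word, learner_tokens, ns_tokens):
    """Label a word as overused, underused, or similar in the learner data.

    Hypothetical helper: it compares normalised frequencies only.
    """
    learner_rate = per_10k(Counter(learner_tokens)[word], len(learner_tokens))
    ns_rate = per_10k(Counter(ns_tokens)[word], len(ns_tokens))
    if learner_rate > ns_rate:
        label = "overuse"
    elif learner_rate < ns_rate:
        label = "underuse"
    else:
        label = "similar"
    return learner_rate, ns_rate, label


# Toy data, invented for illustration only.
learner = "we must keep healthy and keep fit and keep calm".split()
native = "people keep fit and maintain their health through regular exercise".split()

print(compare_usage("keep", learner, native))
print(compare_usage("maintain", learner, native))
```

A label of this kind is only a starting point: as the paragraph above notes, a difference in frequency points to, but does not establish, non-mastery, and differences in topic between the corpora may also be responsible.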

1.3 The background to this research
A detailed review of the earlier studies concerning learner language will be found in Chapter
Two. This section briefly relates the current research to the background from which CLC has
emerged.

Earlier research in learner language may be traced to EA. It was generally maintained before

Leech (1998: xix).

1.4 The impetus of this research
As mentioned above, even though there have been some advances in our understanding of
how L2 acquisition takes place, some important problems clearly remain unsolved. EA
was over-dependent on the error aspect of learner language, and it was therefore impossible for
EA researchers to draw up a complete profile of learner language as it is. As far as SLA
is concerned, it is hard to find answers to questions concerning the nature of the language
produced by a group of learners since its research focus is on the individual mind rather than
on the output of the group. I would argue that in a world where English is mostly taught and
learned in classes and groups, it is the information on group learner English that requires most
of the attention of language researchers and teachers. If we wish to probe into the needs of
learners, it is imperative that we examine the English produced by a group of learners rather
than by individuals. If we suppose teachers wish to tailor their teaching to the needs of their
students and help them to achieve a target level which is similar to the norm they have
selected, there are some questions that must be answered before any remedial work is
carried out. What does it mean for learners to extend their vocabulary? What is the overall
size of the learners’ vocabulary? Learners very often express their intention to expand their
vocabulary and teachers strive hard to help their students to attain this end, but before students
try to expand their vocabulary, the question arises: have they reached the full degree of
vocabulary use for each word they think they know, especially the commonly used simple
words? Among the different senses of polysemous and multiple part-of-speech (POS) words,
to what level of complexity can the students operate? In a new approach to learner language
studies, all these questions are likely to have an answer.

WordSmith Tools (4.0) (Scott 2004) where necessary. In cases where the reference corpus is
found insufficient for some enquiries, a larger, general NS corpus, the Bank of English
(BoE), is used. In addition, the Google search engine (henceforward Google) is occasionally
used to back up some intuitions about a particular usage.

On the cline between quantitative and qualitative research in CLC studies, the critical remarks
of Nesselhauf (2004: 136) are worth noting:
Many studies are exclusively or primarily quantitative. … While such studies can be interesting
starting points for further quantitative analyses, they do not usually in themselves contribute
much to language learner analysis, let alone to language teaching. If progress is to be made, it is
imperative that this current stage is left behind and that more qualitative analyses are carried out.
Bearing this in mind, my research employs a method which is a combination of both the
quantitative and the qualitative approaches. It is my belief that only by taking both approaches
can we take full advantage of the current computer technology as well as the insightful
practice and theories in corpus linguistics and other relevant areas such as English language
teaching (ELT) (see 9.2.1 for more discussion of the quantitative versus the qualitative
approach in corpus linguistics).
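
One way to picture the combination of the two approaches is a key-word-in-context (KWIC) listing: the number of lines gives the quantitative picture, and reading each line in context gives the qualitative one. The sketch below is a hypothetical illustration only. The `kwic` function and the toy sentence are assumptions, and it does not reproduce the behaviour of WordSmith Tools.

```python
def kwic(tokens, node, width=3):
    """Return key-word-in-context lines for every occurrence of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Toy sentence, invented for illustration only.
tokens = "we should take action now and take care of the environment".split()
hits = kwic(tokens, "take", width=2)

print(len(hits))   # quantitative step: how often the node word occurs
for line in hits:  # qualitative step: read each occurrence in its context
    print(line)
```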

1.7 Two assumptions behind this research
In this thesis it is assumed, as is usual in this newly-born field of learner language study, that
the NS English in the reference corpus can be regarded as a norm for the learners and the state
of NS English is regarded as the ideal or target state for the learners to arrive at. Another
assumption I need to make is that learners of English from the same background (L1, culture,
age, education system, etc.) share similarities in their production of L2. This is also implied in
the practice of learner corpora researchers. In other words, what appears to be frequent in the
group is considered to be a commonly held characteristic of the majority of the group. On the
question of similarity among learners with a similar background, see Raupach
(1984) (cited in Hasselgren 2002: 154-55).
