Automatic error detection in non-native English

This thesis describes the development of DAPPER ('Determiner And PrePosition Error Recogniser'), a system designed to automatically acquire models of occurrence for English prepositions and determiners to allow for the detection and correction of errors in their usage, especially in the writing of non-native speakers of the language. The focus is on prepositions and determiners because they are parts of speech whose usage is particularly challenging to acquire, both for students of the language and for natural language processing tools. The work presented in this thesis proposes to address this problem by developing a system which can acquire models of correct preposition and determiner occurrence, and can use this knowledge to identify divergences from these models as errors. The contexts of these parts of speech are represented by a sophisticated feature set, incorporating a variety of semantic and syntactic elements. DAPPER is found to perform well on preposition and determiner selection tasks in correct native English text. Results on each preposition and determiner are discussed in detail to understand the possible reasons for variations in performance, and whether these are due to problems with the structure of DAPPER or to deeper linguistic reasons. An in-depth analysis of all features used is also offered, quantifying the contribution of each feature individually. This can help establish whether the decision to include complex semantic and syntactic features is justified in the context of this task. Finally, the performance of DAPPER on non-native English text is assessed. The system is found to be robust when applied to text which does not contain any preposition or determiner errors. On an error correction task, results are mixed: DAPPER shows promising results on preposition selection and determiner confusion (definite vs. indefinite) errors, but is less successful in detecting errors involving missing or extraneous determiners.
Several characteristics of learner writing are described, to gain a clearer understanding of what problems arise when natural language processing tools are used with this kind of text. It is concluded that the construction of contextual models is a viable approach to the task of preposition and determiner selection, despite outstanding issues pertaining to the domain of non-native writing.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:496846
Date: January 2008
Creators: De Felice, Rachele
Contributors: Pulman, Stephen
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:5192f0cb-6e4d-4730-bb54-a97a73d603ed
