A Corpus of Second Language Attrition Data

This report addresses the lack of progress in the field of Second Language Attrition (L2A). A review of L2A history and literature shows this to be caused by a lack of appropriate data. Five criteria for appropriate data are suggested, and a corpus of L2A data (57,000 words, spoken Spanish) that meets the criteria is presented. The history of the corpus is explained in detail, including subject selection, instruments and methods of collection, and markup: XML was used to annotate the corpus with nineteen categories of speech errors, adapted from Nation's (2001) "Learning Vocabulary in Another Language." An example analysis of how the corpus can be used for L2A research is provided, with step-by-step instructions on writing scripts for data extraction and post-processing in the Perl language. Source code is included in the text. Complete beginner's tutorials on the XML and Perl languages are included in the appendices. The report also introduces a website, developed specifically to host the corpus, where researchers may register, download the corpus, and share work they have done with it. All files used in the example project, as well as this report, are available for download at the website. Findings from the example analysis support Plateau Phases and the Regression Hypothesis, and suggest that the Threshold Hypothesis does not apply to marked forms. This shows the corpus to be of great value to the L2A research community.

L2A

Second Language Attrition

Language Attrition

attrition of second language

second language

language loss

loss of language

loss of language skills

language skill

language skills

SLA

Second Language Acquisition

Linguistics

Identifier	oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-2287
Date	04 December 2007
Creators	Smith, Derrell R.
Publisher	BYU ScholarsArchive
Source Sets	Brigham Young University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	http://lib.byu.edu/about/copyright/
