Return to search

A Lexical Comparison Using Word Embedding Mapping from an Academic Word Usage Perspective

This thesis applies the word embedding mapping approach to make a lexical comparison from academic word usage perspective. We aim to demonstrate the differences in academic word usage between a corpus of student writings and a corpus of academic English, as well as a corpus of student writings and social media texts. The Vecmap mapping algorithm, commonly used in solving cross-language mapping problems, was used to map academic English vector space and social media text vector space into the common student writing vector space to facilitate the comparison of word representations from different corpora and to visualize the comparison results. The average distance was defined as a measure of word usage differences of 420 typical academic words between each two corpora, and principal component analysis was applied to visualize the differences. A rank-biased overlap approach was adopted to evaluate the results of the proposed approach. The experimental results show that the usage of academic words of student writings corpus is more similar to the academic English corpus than to the social media text corpus.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-425266
Date January 2020
CreatorsCai, Xuemei
PublisherUppsala universitet, Institutionen för lingvistik och filologi
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0018 seconds