Global ETD Search

Is Simple Wikipedia simple? A study of readability and guidelines

Creating easy-to-read text has traditionally required manual work. With advancing research in natural language processing, however, automatic systems for text simplification are being developed. These systems often need training data that is parallel and aligned, and for several years Simple Wikipedia has been the main source of such data. In the current study, several readability measures have been tested on a popular simplification corpus. A selection of guidelines from Simple Wikipedia has also been operationalized and tested. The results imply that the guidelines are not followed more closely in Simple Wikipedia than in standard Wikipedia. There are, however, differences in the readability measures: the syntactic structures of Simple Wikipedia seem to be less complex than those of standard Wikipedia. A continuation of this study would be to examine other readability measures and to evaluate the guidelines not covered in the current work.
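As an illustration of what testing a readability measure on two texts can look like, the following is a minimal sketch. The abstract does not name the measures used in the thesis, so the choice of LIX (average sentence length plus the percentage of words longer than six letters) is an assumption made here for illustration only. The function names, the naive regex-based sentence splitter, and the two sample texts are likewise hypothetical and do not come from the thesis or its corpus.

```python
import re


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? (an assumption, not a real segmenter)."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return [p for p in parts if p.strip()]


def tokenize(text):
    """Return runs of letters as words; digits and punctuation are ignored."""
    return re.findall(r"[^\W\d_]+", text)


def lix(text, long_word_len=7):
    """LIX = words per sentence + 100 * (long words / words).

    A word counts as long if it has more than six letters,
    i.e. at least `long_word_len` (default 7) characters.
    """
    sentences = split_sentences(text)
    words = tokenize(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    long_words = sum(1 for w in words if len(w) >= long_word_len)
    return len(words) / len(sentences) + 100 * long_words / len(words)


if __name__ == "__main__":
    # Made-up sample texts, standing in for a simple/standard article pair.
    simple_text = "The cat is a small animal. People keep cats as pets."
    standard_text = (
        "The domesticated cat is a small carnivorous mammal that is "
        "frequently maintained as a companion animal in households."
    )
    print("simple:  ", round(lix(simple_text), 1))
    print("standard:", round(lix(standard_text), 1))
```

A lower LIX value suggests easier text under this particular measure. A real comparison would run over an aligned corpus with a proper sentence segmenter rather than over two hand-written snippets.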

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-161890

corpus

readability

Wikipedia

automatic text simplification

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-161890
Date	January 2018
Creators	Isaksson, Fabian
Publisher	Linköpings universitet, Institutionen för datavetenskap
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
