A Study in Describing Complex Words Using Wikipedia's Categorisation System : Adding Descriptive Terms to Increase the Comprehension of Swedish Texts / En studie i att förklara komplexa ord med hjälp av Wikipedias kategoriseringssystem

This thesis offers new input in the field of generating epithets to aid the comprehension of Swedish texts. For whatever reason, a reader might find certain words in a text difficult to understand. For example, they may never have come across the term moussaka before; however, by the simple expedient of assigning an explanatory epithet – in this case, “the dish” moussaka – they can hopefully continue reading uninterrupted. To do this, obscure phrases are identified and extracted based on word class, shallow token features and the Pareto Principle. An algorithm then extracts appropriate epithets for each word using the Wikipedia categorisation system. Although the algorithm developed for the study achieved underwhelming results when extracting obscure phrases, it did prove excellent at assigning appropriate epithets to nouns and proper nouns. With further research, this process can hopefully be utilised as a tool for improving the readability of any text.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-193074
Date: January 2023
Creators: Ragnarsson, Sebastian
Publisher: Linköpings universitet, Institutionen för datavetenskap
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess