
Controllable sentence simplification in Swedish : Automatic simplification of sentences using control prefixes and mined Swedish paraphrases

Monsen, Julius. January 2023
The ability to read and comprehend text is essential in everyday life. Some people, including individuals with dyslexia and cognitive disabilities, may experience difficulties with this. Thus, it is important to make textual information accessible to diverse target audiences. Automatic Text Simplification (ATS) techniques aim to reduce the linguistic complexity of texts to facilitate readability and comprehension. However, existing ATS systems often lack customization to specific user needs, and simplification data for languages other than English is limited. This thesis addressed ATS in a Swedish context, building upon novel methods that provide more control over the simplification generation process, enabling user customization. A dataset of Swedish paraphrases was mined from a large amount of text data. ATS models were then trained on this dataset using prefix-tuning with control prefixes. Two sets of text attributes for controlling the generation were explored, along with their effects on performance. The first had been used in previous research, and the second was extracted in a data-driven way from existing text complexity measures. The trained ATS models for Swedish, and additional models for English, were evaluated and compared using the SARI and BLEU metrics. The results for the English models were consistent with, although slightly lower than, results from previous research using controllable generation mechanisms. The Swedish models improved significantly over the baseline, a fine-tuned BART model, and over previous Swedish ATS results. These results highlight the efficiency of using paraphrase data paired with controllable generation mechanisms for simplification. Furthermore, the two sets of attributes produced very similar results, suggesting that both sets capture aspects of simplification.
The process of mining paraphrases, the selection of control attributes, and other methodological implications are discussed, leading to suggestions for future research.
