The purpose of this thesis was to further develop a rule set used in an automatic text simplification system, and to explore whether the performance of a rule-based text simplification system can be improved by manual training. A first rule set was developed from a thorough literature review, and the rules were then refined by manually adapting this first rule set to a set of training texts. Training was considered complete when no further changes were made to the rule set, and both rule sets were then applied to a test set for evaluation. The thesis evaluated the performance of the text simplification system as a classification task, using objective metrics: precision and recall. The comparison of the rule sets revealed a clear improvement of the system: precision increased from 45% to 82%, and recall increased from 37% to 53%. Both recall and precision improved after training for the majority of the rules, with a few exceptions. All rule types received a higher correctness score for the refined rule set (R2). Automatic text simplification systems targeting real-life readers need to account for qualitative aspects, which were not considered in this thesis. Future evaluation should, in addition to quantitative metrics such as precision, recall, and complexity metrics, also account for the experience of the reader.
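The precision and recall figures reported above follow the standard definitions for a classification task. The sketch below is only an illustration of those definitions: the abstract does not state how rule applications were counted, so the function name `precision_recall` and the example counts are hypothetical and are not taken from the thesis.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw counts.

    Assumed interpretation (not stated in the abstract):
      true_pos  - rule applications that were correct
      false_pos - rule applications that were incorrect
      false_neg - places where a rule should have applied but did not
    """
    applied = true_pos + false_pos      # everything the system changed
    expected = true_pos + false_neg     # everything it should have changed
    precision = true_pos / applied if applied else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the calculation.
precision, recall = precision_recall(true_pos=8, false_pos=2, false_neg=8)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Under this reading, a rise in precision means fewer incorrect rule applications, while a rise in recall means fewer missed simplification opportunities.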
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-120001 |
Date | January 2015 |
Creators | Rennes, Evelina |
Publisher | Linköpings universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |