Background: Healthcare professionals spend large amounts of time on documentation tasks in contemporary healthcare. One such task is writing the discharge summary, which summarizes a care episode. However, research shows that many discharge summaries written today are of poor quality. Natural language processing, specifically text summarization, has the potential to alleviate the situation, as it could automatically summarize patient notes into a discharge summary.
Aim: This thesis aims to provide initial knowledge on summarizing Swedish clinical text into discharge summaries. It further aims to provide knowledge specifically on performing summarization using the Stockholm EPR Gastro ICD-10 Pseudo Corpus II dataset, which consists of Swedish electronic health record data.
Method: Using the design science framework, an artefact was produced in the form of a model, based on a pre-trained Swedish BART model, that can summarize patient notes into a discharge summary. The model was developed using the Hugging Face library and evaluated both with ROUGE scores and through a manual evaluation performed by a now-retired healthcare professional.
Results: The discharge summaries that the artefact produced from a test set achieved ROUGE-1/2/L/S scores of 0.280/0.057/0.122/0.068. The manual evaluation implies that the artefact is prone to failing to accurately include clinically important information, that it produces text with low readability, and that it is very prone to producing severe hallucinations.
Conclusion: In terms of ROUGE scores, the artefact performs worse than the results reported in previous studies on summarizing patient notes into discharge summaries. The manual evaluation suggests several shortcomings in its ability to accurately summarize a care episode. Since this was the first major work on text summarization using the Stockholm EPR Gastro ICD-10 Pseudo Corpus II dataset, there are many possible directions for future work.
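For readers unfamiliar with the ROUGE scores reported in the abstract, the following is a minimal sketch of how ROUGE-1, ROUGE-2 and ROUGE-L F1 scores can be computed. It is a simplified illustration, not the evaluation code used in the thesis: it assumes whitespace tokenization and lowercasing, omits stemming and ROUGE-S, and the function names and the two example sentences are made up for this sketch and do not come from the dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Combine overlap counts into an F1 score; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # & keeps the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence (sentence level)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical toy example (invented text, not from the corpus).
reference = "patient admitted with abdominal pain and discharged after two days"
candidate = "patient admitted with pain and discharged"

print(round(rouge_n(candidate, reference, 1), 3))
print(round(rouge_n(candidate, reference, 2), 3))
print(round(rouge_l(candidate, reference), 3))
```

Established ROUGE packages differ in tokenization, stemming and how multi-sentence summaries are handled, so scores from this sketch are not directly comparable to the figures reported above.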
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-228684 |
Date | January 2023 |
Creators | Berg, Nils |
Publisher | Stockholms universitet, Institutionen för data- och systemvetenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |