Global ETD Search


Abbreviation Expansion in Swedish Clinical Text : Using Distributional Semantic Models and Levenshtein Distance Normalization

In the medical domain, especially in clinical texts, non-standard abbreviations are prevalent and impair readability for patients. To make physicians' notes easier to understand, abbreviations need to be identified and expanded into their original forms. This thesis presents a distributional semantic approach for finding candidate expansions of an abbreviation, combined with Levenshtein distance to choose the correct candidate among the semantically related words. The method is applied to radiology reports and medical journal texts, and a comparison is made to general Swedish. The results show that the correct expansion of the abbreviation can be found in 40% of the cases, an improvement of 24 percentage points over the baseline (0.16) and of 22 percentage points over using word space models alone (0.18).
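The re-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the candidate list stands in for the output of a word space model, the example abbreviation and candidates are chosen for illustration only, and the normalization (edit distance divided by the length of the longer string) is an assumption, since the abstract does not state the exact formula.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(abbreviation: str, candidate: str) -> float:
    """Assumed normalization: divide by the length of the longer string."""
    longest = max(len(abbreviation), len(candidate))
    return levenshtein(abbreviation, candidate) / longest if longest else 0.0


def rank_candidates(abbreviation: str, candidates: list[str]) -> list[str]:
    """Order semantically related candidates by closeness to the abbreviation."""
    abbr = abbreviation.lower().rstrip(".")
    return sorted(candidates, key=lambda c: normalized_distance(abbr, c.lower()))


# Hypothetical candidates, as if returned by a word space model for "rtg".
ranked = rank_candidates("rtg", ["undersökning", "röntgen", "lungor"])
print(ranked[0])
```

In this sketch the semantic model supplies the related words and the string distance only decides among them, which mirrors the division of labour the abstract describes.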

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-226235

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-226235
Date	January 2014
Creators	Tengstrand, Lisa
Publisher	Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
