Return to search

Unsupervised Method for Disease Named Entity Recognition

Diseases take a central role in biomedical research; many studies aim to enable access to disease information, by designing named entity recognition models to make use of the available information. Disease recognition is a problem that has been tackled by various approaches of which the most famous are the lexical and supervised approaches. However, the aforementioned approaches have many drawbacks as their performance is affected by the amount of human-annotated data set available. Moreover, lexicalapproachescannotdistinguishbetweenrealmentionsofdiseasesand mentionsofotherentitiesthatsharethesamenameoracronym. Thechallengeofthis project is to find a model that can combine the strengths of the lexical approaches and supervised approaches, to design a named entity recognizer. We demonstrate that our model can accurately identify disease name mentions in text, by using word embedding to capture context information of each mention, which enables the model todistinguishifitisarealdiseasementionornot. Weevaluateourmodelusingagold standard data set which showed high precision of 84% and accuracy of 96%. Finally, we compare the performance of our model to different statistical name entity recognition models, and the results show that our model outperforms the unsupervised lexical approaches.

Identiferoai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/659966
Date06 November 2019
CreatorsAlmutairi, Abeer N.
ContributorsHoehndorf, Robert, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Moshkov, Mikhail, Laleg-Kirati, Taous-Meriem
Source SetsKing Abdullah University of Science and Technology
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.0018 seconds