Implant terms are terms like "pacemaker" which indicate the presence of artifacts in the body of a human. These implant terms are key to determining if a patient can safely undergo Magnetic Resonance Imaging (MRI). However, to identify these terms in medical records is time-consuming, laborious and expensive, but necessary for taking the correct precautions before an MRI scan. Automating this process is of great interest to radiologists as it ideally saves time, prevents mistakes and as a result saves lives. The electronic medical records (EMR) contain the documented medical history of a patient, including any implants or objects that an individual would have inside their body. Information about such objects and implants are of great interest when determining if and how a patient can be scanned using MRI. This information is unfortunately not easily extracted through automatic means. Due to their sparse presence and the unusual structure of medical records compared to most written text, makes it very difficult to automate using simple means. By leveraging the recent advancements in Artificial Intelligence (AI), this thesis explores the ability to identify and extract such terms automatically in Swedish EMRs. For the task of identifying implant terms in medical records a generally trained Swedish Bidirectional Encoder Representations from Transformers (BERT) model is used, which is then fine-tuned on Swedish medical records. Using this model a variety of approaches are explored two of which will be covered in this thesis. Using this model a variety of approaches are explored, namely BERT-KDTree, BERT-BallTree, Cosine Brute Force and unsupervised NER. The results show that BERT-KDTree and BERT-BallTree are the most rewarding methods. Results from both methods have been evaluated by domain experts and appear promising for such an early stage, given the difficulty of the task. The evaluation of BERT-BallTree shows that multiple methods of extraction may be preferable as they provide different but still useful terms. Cosine brute force is deemed to be an unrealistic approach due to computational and memory requirements. The NER approach was deemed too impractical and laborious to justify for this study, yet is potentially useful if not more suitable given a different set of conditions and goals. While there is much to be explored and improved, these experiments are a clear indication that automatic identification of implant terms is possible, as a large number of implant terms were successfully discovered using automated means.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-181885 |
Date | January 2021 |
Creators | Jerdhaf, Oskar |
Publisher | Linköpings universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.003 seconds