Pharmacovigilance, also referred to as drug safety, is an important science concerned with identifying risks related to medicine intake. Side effects of medicines can be caused by, for example, interactions, high dosage, and misuse. To find patterns in what causes these unwanted effects, information must be gathered and mapped to predefined terms. Today this mapping is done manually by experts, which can be a very difficult and time-consuming task. The aim of this thesis is to automate the mapping of side effects using machine learning techniques. The model was developed using information from preexisting mappings of verbatim expressions of side effects. The final model made use of the pre-trained language model BERT, which has achieved state-of-the-art results within the NLP field. When evaluated on the test set, the final model achieved an accuracy of 80.21%. Some verbatims were found to be very difficult for the model to classify, mainly because of ambiguity or a lack of information in the verbatim. Because it is very important that the mappings are correct, a threshold was introduced that left the verbatims that were most difficult to classify for manual mapping. The manual process could still be made easier, however, because the model generated suggested terms that can support the specialist responsible for the manual mapping.
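The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the threshold value of 0.9, the choice of softmax confidence as the score, and the example term labels are all assumptions made for illustration. The idea is that a classifier's raw scores for one verbatim are turned into probabilities, the top prediction is accepted automatically only if it is confident enough, and otherwise the verbatim is routed to a specialist together with the highest-ranked suggested terms.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_verbatim(logits, terms, threshold=0.9, top_k=3):
    """Accept the top term automatically or defer to manual mapping.

    `threshold` and `top_k` are illustrative values, not taken from the thesis.
    """
    probs = softmax(logits)
    # Rank candidate terms from most to least probable.
    ranked = sorted(zip(terms, probs), key=lambda tp: tp[1], reverse=True)
    best_term, best_prob = ranked[0]
    if best_prob >= threshold:
        return {"decision": "auto", "term": best_term, "suggestions": []}
    # Below the threshold: leave the mapping to a specialist, with suggestions.
    suggestions = [term for term, _ in ranked[:top_k]]
    return {"decision": "manual", "term": None, "suggestions": suggestions}


# Hypothetical term labels and scores, for illustration only.
terms = ["Headache", "Nausea", "Dizziness"]
confident = route_verbatim([10.0, 0.0, 0.0], terms)
uncertain = route_verbatim([1.0, 0.9, 0.0], terms, top_k=2)
```

Raising the threshold sends more verbatims to manual review in exchange for fewer automatic mistakes, so in practice the value would be tuned on held-out data against the acceptable error rate.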
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-426916 |
Date | January 2020 |
Creators | Wallner, Vanja |
Publisher | Uppsala universitet, Institutionen för informationsteknologi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UPTEC IT, 1401-5749 ; 20046 |