The use of electronic health records (EHR) from various sources such as text, images, and time-series data to make predictions or diagnoses has been researched previously. Many previous methods have used separate models, either for separate modalities or for distinct tasks. Recently, models trained to make medical predictions from multimodal input have emerged, as a unified approach would benefit health practitioners. We present a single model that makes medical predictions for several tasks, using diverse input from different modalities. We demonstrate the effectiveness of using an autoencoder method to project EHR data from three different modalities – images, text, and time-series data – into the small language model Gemma-2B. Six projector models are used together with the small language model to perform multi-label prediction for 12 different medical prediction tasks. Results show that a jointly trained model using asymmetric loss, a loss function that dynamically emphasises positives that are poorly predicted, performs well and predicts evenly across tasks.
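The abstract names the asymmetric loss without giving its formulation. As a point of reference, the sketch below is a minimal PyTorch implementation of the asymmetric loss of Ridnik et al. (2021), which matches the behaviour described above: a larger focusing exponent on the negative side suppresses gradients from easy negatives, so positives that are still poorly predicted dominate training. The hyperparameter defaults, tensor shapes, and class name here are illustrative assumptions, not details taken from the thesis, which may use a variant.

import torch
import torch.nn as nn

class AsymmetricLoss(nn.Module):
    """Asymmetric loss for multi-label classification (Ridnik et al., 2021).

    A larger focusing exponent on the negative side (gamma_neg) suppresses
    gradients from easy negatives, so training emphasises positives that
    are still poorly predicted.
    """

    def __init__(self, gamma_neg: float = 4.0, gamma_pos: float = 0.0,
                 clip: float = 0.05, eps: float = 1e-8):
        super().__init__()
        self.gamma_neg = gamma_neg  # illustrative defaults, not from the thesis
        self.gamma_pos = gamma_pos
        self.clip = clip            # probability margin for discarding easy negatives
        self.eps = eps

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Per-label probabilities; targets are 0/1 with one column per label.
        p_pos = torch.sigmoid(logits)
        p_neg = 1.0 - p_pos

        # Probability shifting: fully discard very confident negatives.
        if self.clip > 0:
            p_neg = (p_neg + self.clip).clamp(max=1.0)

        # Binary cross-entropy terms for the positive and negative cases.
        loss_pos = targets * torch.log(p_pos.clamp(min=self.eps))
        loss_neg = (1.0 - targets) * torch.log(p_neg.clamp(min=self.eps))

        # Asymmetric focusing: down-weight easy examples, more so for negatives.
        pt = p_pos * targets + p_neg * (1.0 - targets)
        gamma = self.gamma_pos * targets + self.gamma_neg * (1.0 - targets)
        focusing = torch.pow(1.0 - pt, gamma)

        return -(focusing * (loss_pos + loss_neg)).sum()

# Hypothetical usage: the 12 prediction tasks treated as 12 binary labels.
criterion = AsymmetricLoss()
logits = torch.randn(8, 12)                      # batch of 8 patients
targets = torch.randint(0, 2, (8, 12)).float()   # multi-hot task labels
loss = criterion(logits, targets)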
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-531045 |
Date | January 2024 |
Creators | Martin Björkdahl, Liv |
Publisher | Uppsala universitet, Institutionen för lingvistik och filologi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |