Computed radiography exams are rarely performed by the same physicians who interpret the images. If an image does not help the interpreting physician diagnose the patient, it can be rejected, and this normally happens after the patient has already left the hospital, meaning they must return to retake the exam. This creates unnecessary work for both physicians and patients. To address this problem, we explored deep learning algorithms that automatically analyze the images and distinguish usable from unusable ones. The algorithms include convolutional neural networks, vision transformers, and fusion networks that combine different types of data.

In total, seven architectures were used to train 42 models. The models were trained on a dataset of 61 127 DICOM files containing images and metadata collected in a clinical setting, labeled according to whether the images were deemed usable there. The complete dataset was used to train generalized models, and subsets containing specific body parts were used to train specialized models. Three architectures performed classification using images only: two used a ResNet-50 backbone and one used a ViT-B/16 backbone. These architectures produced 15 specialized models and three generalized models. Four architectures implementing joint fusion produced 20 specialized models and four generalized models; two had a ResNet-50 backbone and the other two a ViT-B/16 backbone. For each backbone, two structurally different types of joint fusion, type I and type II, were implemented. The two modalities were the images and the metadata from the DICOM files.

The best image-only model had a ViT-B/16 backbone and was trained on a specialized dataset containing hands and feet. It reached an AUC of 0.842 and an MCC of 0.545.
The two fusion models trained on the same dataset reached AUC scores of 0.843 and 0.834 and MCCs of 0.547 and 0.546, respectively. We conclude that automatic rejection with deep learning models is possible, even though the results of this study are not yet good enough for clinical use. Across all comparisons, the models using ViT-B/16 outperformed those using ResNet-50. Generalized and specialized models performed equally well in most cases, with the exception of the smaller subsets of the full dataset. Utilizing metadata from the DICOM files did not improve the models compared to the image-only models.
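For reference, the Matthews correlation coefficient reported above can be computed directly from binary confusion-matrix counts; a minimal implementation:

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.
    +1 is perfect prediction, 0 is no better than chance, -1 is total
    disagreement; unlike accuracy, it stays informative on imbalanced data.
    Returns 0.0 when the denominator is zero, by convention."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike AUC, which summarizes performance across all decision thresholds, MCC evaluates a single operating point, which is why reporting both gives a fuller picture of a reject/accept classifier.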
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-204207 |
Date | January 2024 |
Creators | Wårdemark, Erik, Unell, Olle |
Publisher | Linköpings universitet, Institutionen för medicinsk teknik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |