Applying deep learning in radiology has the potential to automate workflows, give radiologists decision support, and provide patients with a logic-based algorithmic assessment. Unfortunately, medical datasets are often not uniformly distributed because of naturally occurring class imbalance. In this research, multi-class classification of liver MRI sequences for imaging of hepatocellular carcinoma (HCC) was conducted on a highly imbalanced clinical dataset using a deep convolutional neural network. We compared four multi-class classifiers: Model A and Model B (both trained on the imbalanced training data), Model C (trained on augmented training images) and Model D (trained on under-sampled training images). Data augmentation (45-degree rotation, horizontal and vertical flips) and random under-sampling were used to tackle class imbalance. HCC, the third most common cause of cancer-related mortality [1], can be diagnosed with high specificity using Magnetic Resonance Imaging (MRI) with the Liver Imaging Reporting and Data System (LI-RADS). Each individual MRI sequence reveals different characteristics that are useful for determining the likelihood of HCC. We developed a deep convolutional neural network for the multi-class classification of imbalanced MRI sequences that will aid in building a model that applies LI-RADS to diagnose HCC. Radiologists use these MRI sequences to identify specific LI-RADS features; automatic sequence classification helps automate part of the LI-RADS process, and further applications of machine learning to LI-RADS will likely depend on it as a first step.
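The random under-sampling mentioned above can be illustrated with a minimal sketch. It assumes the simplest variant, in which every class is reduced to the size of the rarest class; the record does not state the target class size or the tooling used in the thesis, so the function name `random_undersample`, the `seed` parameter and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so each class keeps as many items as the rarest class.

    Assumption: balancing to the minority-class size. The thesis may have
    used a different target size.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # The rarest class sets the number of samples kept per class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        # Draw `target` items without replacement from this class.
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    rng.shuffle(balanced)
    return balanced


# Toy example: 7 "DWI" images and 3 "ADC" images (placeholder integer IDs).
image_ids = list(range(10))
sequence_labels = ["DWI"] * 7 + ["ADC"] * 3
balanced_set = random_undersample(image_ids, sequence_labels)
```

Under-sampling discards majority-class images, so the balanced set is smaller than the original. This is one plausible reason a model trained this way could score lower than one trained on all the data.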
Our study included an imbalanced dataset of 193,868 images containing 10 MRI sequences: in-phase (IP) chemical shift imaging, out-of-phase (OOP) chemical shift imaging, T1-weighted post contrast imaging (C+, C-, C-C+), fat suppressed T2 weighted imaging (T2FS), T2 weighted imaging, Diffusion Weighted Imaging (DWI), Apparent Diffusion Coefficient map (ADC) and In phase/Out of phase (IPOOP) imaging. Models A, B, C and D achieved macro-average F1 scores of 0.97, 0.96, 0.95 and 0.93, respectively. Model A, trained on the imbalanced data, scored higher than the models trained with data augmentation (Model C) and under-sampling (Model D).
Thesis / Master of Science (MSc)
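For readers unfamiliar with the metric reported above, the following is a minimal sketch of a macro-average F1 computation. It is not the thesis's evaluation code, which is not part of this record; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Every class contributes equally regardless of how many samples it has,
    which is why the metric is commonly used on imbalanced data.
    """
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy example with two hypothetical sequence labels.
truth = ["IP", "IP", "OOP", "OOP"]
predicted = ["IP", "OOP", "OOP", "OOP"]
score = macro_f1(truth, predicted)
```

Because the average is unweighted, a poorly classified rare sequence lowers the macro score as much as a poorly classified common one.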
Identifier | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/25780 |
Date | January 2020 |
Creators | Trivedi, Aditya |
Contributors | Doyle, Thomas, eHealth |
Source Sets | McMaster University |
Language | English |
Detected Language | English |
Type | Thesis |