ADVANCED MACHINE LEARNING MODELS IN PREDICTION OF MEDICAL CONDITIONS

The primary goal of machine learning (ML) models in the prediction of medical conditions is to accurately predict (classify) the occurrence of a disease or the choice of therapy. Many ML models, both traditional and deep, have been used to predict disease diagnosis or the optimal therapeutic approach, and almost all categories of medical conditions have been subject to ML analysis. When creating predictive ML algorithms in medicine, it is pivotal to consider which problems are to be solved and how much, and what types of, training data are available. For challenging prediction (classification) problems, an understanding of disease pathogenesis makes the selection of an adequate ML model, and therefore accurate prediction, more likely. The hypothesis of the research was that the adequate selection of model inputs, together with the selection and design of adequate ML methods, improves the accuracy of predicting the occurrence of diseases and their outcomes. The effectiveness and accuracy of the deep learning and traditional methods created were analyzed and compared. The impact of different medical conditions and different medical domains on the optimal selection and performance of ML models was also studied. The effectiveness of advanced ML models was tested on four different diseases: Alzheimer’s disease (AD), Diabetes Mellitus type 2 (DM2), influenza, and colorectal cancer (CRC).
The objective of the first part of the thesis (the AD study) was to determine whether prediction of AD from electronic medical records (EMR) data alone could be significantly improved by applying domain knowledge in positive dataset selection rather than setting naïve filters. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) deep learning model to predict whether a patient will develop AD. The LSTM RNN method performed significantly better when learning from the SCRP dataset than when datasets were selected naïvely. Accurate prediction of AD is significant for identifying patients for clinical trials and for better selecting patients who need imaging diagnostics.
The objective of the DM2 research was to predict whether patients with DM2 would develop any of ten selected complications. RNN LSTM and RNN Gated Recurrent Unit (GRU) models were designed and compared to traditional Random Forest and Multilayer Perceptron models. The number of hospitalizations registered in the EMR data was an important factor for prediction accuracy. The prediction accuracy for complications decreased over time. The RNN GRU model was the best choice for EMR-type data, followed by the RNN LSTM model. Accurate prediction of the occurrence of DM2 complications is important in planning targeted measures aimed at slowing or preventing their development.
The objective of the third part of the thesis was to improve the understanding of the spatial spreading of complicated cases of influenza that required hospitalization by constructing social network models. A novel approach was designed, which included the construction of heatmaps for geographic regions in New York State and of power-law networks, to analyze the distribution of hospitalized flu cases. The methodology constructed in the study made it possible to identify critical hubs and routes of influenza spreading in specific geographic locations. The results obtained could enable better prediction of the distribution of complicated flu cases in specific geographic regions and of the resources required for the prevention and treatment of hospitalized patients with influenza.
The fourth part of the thesis proposes approaches to discovering risk factors (comorbidities and genes) associated with the development of CRC, which future ML models can use to predict the influence of risk factors on the prognosis and outcomes of cancer and other chronic diseases. A novel social network and text mining model was developed to study specific risk factors of CRC. The identified associations between comorbidities, CRC, and shared genes can have important implications for the early discovery and prognosis of CRC, which can be the subject of predictive ML models in the future.
Predictive ML models could help physicians select the most effective diagnostic, preventive, and therapeutic choices available. These ML models can provide recommendations for selecting suitable patients for clinical trials, which is very important in searching for medical solutions in health emergencies. Successful ML models can make medicine more efficient, improve outcomes, and decrease medical errors. / Computer and Information Science

Identifier oai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/6587
Date January 2021
Creators Ljubic, Branimir, 0000-0002-3287-3741
Contributors Obradovic, Zoran, Vucetic, Slobodan, Shi, Xinghua Mindy, Rubin, Daniel J.
Publisher Temple University. Libraries
Source Sets Temple University
Language English
Detected Language English
Type Thesis/Dissertation, Text
Format 153 pages
Rights IN COPYRIGHT - This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available. http://rightsstatements.org/vocab/InC/1.0/
Relation http://dx.doi.org/10.34944/dspace/6569, Theses and Dissertations