A Study of Machine Learning Approaches for Integrated Biomedical Data Analysis

This thesis consists of two projects in which machine learning approaches and statistical analyses for integrated biomedical data analysis were explored, developed and tested. Integrating different biomedical data sources gives a better understanding of the human body by placing individual measurements in a bigger picture. A more complete view of the data not only gives a more complete view of the molecular basis of phenotype, but may also reveal disease abnormalities that are not found when only one type of biomedical data is used. The objective of the first project is to find biological pathways related to Duchenne Muscular Dystrophy (DMD) and Lamin A/C (LMNA) using integrated multi-omics data. We proposed a novel method that integrates proteins, mRNAs and miRNAs to find disease-related pathways. The goal of the second project is to develop a personalized recommendation system that recommends cancer treatments to patients. Compared to the traditional approach of using only users' ratings to impute missing values, we proposed a method that incorporates users' profiles to enhance prediction accuracy. / Master of Science / There are two major existing problems in the biomedical field. First, researchers have previously used only one data type for analysis. However, a single measurement does not fully capture the processes at work and can lead to inaccurate results with low sensitivity and specificity. Second, biomedical data contain too many missing values. This leaves many questions unanswered and can lead to wrong conclusions being drawn from the data. To overcome these problems, we integrate multiple data types, which not only better captures the complex biological processes but also leads to a more comprehensive characterization. Moreover, utilizing the correlation among various data structures also helps us impute missing values in biomedical datasets.

In my two research projects, we are interested in integrating multiple types of biological data to identify disease-specific pathways and to predict unknown treatment responses for cancer patients. First, we propose a novel approach for pathway identification using integrated multi-omics data. Second, we develop a recommendation system that combines different types of patients’ medical information to predict missing treatment responses. Our goal is to find disease-related pathways in the first project and to improve the prediction of missing treatment responses in the second project with the methods we develop.

The findings of my studies show that it is possible to find pathways related to muscular dystrophies by integrating multi-omics data. Moreover, we demonstrate that incorporating patients’ genetic profiles can improve prediction accuracy compared to using the treatment response matrix alone for imputation.
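As an illustration only, the idea of imputing a missing treatment response from both the response matrix and patient profiles can be sketched as a similarity-weighted average. This is a generic neighborhood-based scheme, not the method developed in the thesis; the function names, the blending weight `alpha`, and the toy data are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity over positions where both vectors have a value."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = math.sqrt(sum(a * a for a, _ in pairs))
    norm_v = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def impute(responses, profiles, patient, treatment, alpha=0.5):
    """Estimate a missing response as a similarity-weighted mean.

    alpha (hypothetical parameter) blends response-based similarity with
    profile-based similarity; alpha=1.0 uses the response matrix alone.
    """
    num = den = 0.0
    for other, row in enumerate(responses):
        if other == patient or row[treatment] is None:
            continue  # skip the target patient and unobserved entries
        sim = (alpha * cosine(responses[patient], row)
               + (1 - alpha) * cosine(profiles[patient], profiles[other]))
        if sim <= 0:
            continue  # ignore dissimilar patients
        num += sim * row[treatment]
        den += sim
    return num / den if den else None


# Toy data (invented): rows are patients, columns are treatments.
# None marks a missing treatment response.
responses = [
    [0.9, None, 0.2],
    [0.8, 0.7, 0.3],
    [0.1, 0.2, 0.9],
]
# Toy profile features (invented), one row per patient.
profiles = [
    [1.0, 0.0, 1.0],
    [1.0, 0.1, 0.9],
    [0.0, 1.0, 0.1],
]

pred_with_profiles = impute(responses, profiles, patient=0, treatment=1)
pred_responses_only = impute(responses, profiles, patient=0, treatment=1, alpha=1.0)
```

In this toy case the first patient's profile closely matches the second patient's, so blending in profile similarity pulls the estimate further toward the second patient's observed response than the response matrix alone does. Whether that shift improves accuracy on real data is an empirical question that this sketch does not address.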

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/83813
Date: 29 June 2018
Creators: Chang, Yi Tan
Contributors: Electrical and Computer Engineering, Yu, Guoqiang, Mili, Lamine M., Wang, Yue J.
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
