Boosting Gene Expression Clustering with System-Wide Biological Information and Deep Learning

Gene expression analysis provides genome-wide insights into the transcriptional activity of a cell. One of the first computational steps in exploring and analyzing gene expression data is clustering. Although a number of standard clustering methods are routinely used, most of them do not take prior biological information into account. Here, we propose a new approach for gene expression clustering analysis. The approach benefits from a new deep learning architecture, Robust Autoencoder, which provides a more accurate high-level representation of the feature sets, and from incorporating prior system-wide biological information into the clustering process. We tested our approach on two gene expression datasets and compared its performance with two widely used clustering methods, hierarchical clustering and k-means, and with a recent deep learning clustering approach. Our approach outperformed all other clustering methods on the labeled yeast gene expression dataset. Furthermore, we showed that it is better than k-means at identifying functionally common clusters on the unlabeled human gene expression dataset. The results demonstrate that our new deep learning architecture can generalize the specific properties of gene expression profiles well. They also confirm our hypothesis that prior biological network knowledge is helpful in gene expression clustering.

Identifier: oai:union.ndltd.org:wpi.edu/oai:digitalcommons.wpi.edu:etd-theses-2284
Date: 24 April 2019
Creators: Cui, Hongzhu
Contributors: Dmitry Korkin (Advisor); Carolina Ruiz (Reader); Craig E. Wills (Department Head)
Publisher: Digital WPI
Source Sets: Worcester Polytechnic Institute
Detected Language: English
Type: text
Format: application/pdf
Source: Masters Theses (All Theses, All Years)