Supervised Inference of Gene Regulatory Networks

A gene regulatory network (GRN) records the interactions among transcription
factors and their target genes. GRNs are useful for studying how transcription factors (TFs) control
gene expression as cells transition between states during differentiation and development.
Scientists usually construct GRNs by careful examination and study of the literature. This
process is slow and painstaking and does not scale to large networks. In this thesis, we study
the problem of inferring GRNs automatically from gene expression data. Recent data-driven
approaches to infer GRNs increasingly rely on single-cell RNA-sequencing (scRNA-seq)
data. Most of these methods use unsupervised or association-based strategies, which by
design cannot leverage known regulatory interactions. To facilitate supervised learning,
we propose a novel graph convolutional neural network (GCN) based autoencoder to infer
new regulatory edges from a known GRN and scRNA-seq data. As the name suggests, a
GCN-based autoencoder consists of an encoder that learns a low-dimensional embedding
of the nodes (genes) in the input graph (the GRN) through a series of graph convolution
operations and a decoder that aims to reconstruct the original graph as accurately as possible.
We investigate several GCN-based architectures to determine the ideal encoder-decoder
combination for GRN reconstruction. We systematically study the performance of these
and other supervised learning methods on different mouse and human scRNA-seq datasets
for two types of evaluation. We demonstrate that our GCN-based approach substantially
outperforms traditional machine learning approaches.

Master of Science

In multi-cellular living organisms, stem cells differentiate into multiple cell types.
Proteins called transcription factors (TFs) control the activity of genes to effect these transitions.
It is possible to represent these interactions abstractly using a gene regulatory network
(GRN). In a GRN, each node is a TF or a gene and each edge connects a TF to a gene or
TF that it controls. New high-throughput technologies that can measure gene expression
(activity) in individual cells provide rich data that can be used to construct GRNs. In this
thesis, we take advantage of recent advances in the field of machine learning to develop
a new computational method for constructing GRNs. The distinguishing
property of our technique is that it is supervised, i.e., it uses experimentally known interactions
to infer new regulatory connections. We investigate several variations of this approach
to reconstruct a GRN as close to the original network as possible. We analyze and provide
a rationale for the decisions made in designing, evaluating, and choosing the characteristics
of our predictor. We show that our predictor has a reconstruction accuracy that is superior
to other supervised-learning approaches.
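
To make the encoder-decoder idea in the abstract concrete, here is a minimal sketch of a graph autoencoder forward pass. It is not the thesis's architecture, which this record does not specify. It assumes the common form with two graph convolution layers as the encoder and an inner-product decoder, treats the GRN as undirected for simplicity, and uses untrained random weights. The gene indices, edge list, layer sizes, and expression values are all hypothetical, and the code is plain Python to stay dependency-free.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always > 0 because of the self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, h, w, activate):
    """One graph convolution: aggregate neighbor features, then project."""
    z = matmul(matmul(a_norm, h), w)
    if activate:
        z = [[max(0.0, v) for v in row] for row in z]  # ReLU
    return z


def encode(a_norm, x, w1, w2):
    """Encoder: two stacked graph convolutions giving one embedding per gene."""
    return gcn_layer(a_norm, gcn_layer(a_norm, x, w1, True), w2, False)


def decode(z):
    """Inner-product decoder: sigmoid(z_i . z_j) scores each candidate edge."""
    z_t = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-s)) for s in row] for row in matmul(z, z_t)]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_genes, n_cells, hidden, latent = 4, 5, 3, 2

# Hypothetical known TF -> target pairs, stored symmetrically (assumption).
known_edges = [(0, 2), (0, 3), (1, 3)]
adj = [[0.0] * n_genes for _ in range(n_genes)]
for tf, target in known_edges:
    adj[tf][target] = adj[target][tf] = 1.0

# Hypothetical expression matrix: one row per gene, one column per cell.
x = [[rng.random() for _ in range(n_cells)] for _ in range(n_genes)]

a_norm = normalize_adjacency(adj)
z = encode(a_norm, x, random_matrix(n_cells, hidden, rng), random_matrix(hidden, latent, rng))
probs = decode(z)  # probs[i][j] is the predicted score for an edge between genes i and j
```

In a supervised setting, the weights would be fit so that `probs` is high for known edges and low for sampled non-edges. High-scoring pairs absent from the known network then become candidate new regulatory edges.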

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/114037
Date: 09 September 2021
Creators: Sen, Malabika Ashit
Contributors: Computer Science, Murali, T. M., Reddy, Chandan K., Heath, Lenwood S.
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
