This dissertation presents applications of machine learning and statistical approaches to infer protein-DNA binding in the presence of epigenetic modifications. Epigenetic modifications are alterations to the DNA that regulate gene expression while leaving the underlying DNA sequence unaltered. They are heritable and reversible, and often involve the addition or removal of certain chemical groups. Two common examples are histone modification, which alters the histone proteins and thereby changes the structure of chromatin (DNA wound around histone proteins), and DNA methylation, which adds methyl groups to a cytosine base adjacent to a guanine base. Epigenetic factors often interfere with gene expression regulation by promoting or inhibiting the binding of proteins to DNA. These DNA-binding proteins are known as transcription factors. Transcription is the first step of gene expression, in which a particular segment of DNA is copied into messenger RNA (mRNA). Transcription factors orchestrate gene activity and are crucial for normal cell function in any organism. For example, deletion or mutation of certain transcription factors, such as MEF2, has been associated with neurological disorders such as autism and schizophrenia. In this dissertation, different computational pipelines are described that use mathematical models to explain how protein-DNA binding is mediated by histone modifications and DNA methylation in different regions of the brain at different stages of development. Multi-layer Markov models and inhomogeneous Poisson analyses are applied to brain data to show the impact of epigenetic factors on protein-DNA binding. Such data-driven approaches reinforce the importance of epigenetic factors in governing the differentiation of brain cells into different neuron types, the regulation of memory, and the promotion of normal brain development in the early stages of life. / Doctor of Philosophy / A cell is the basic unit of any living organism. 
Cells contain a nucleus, which holds DNA, the self-replicating material often called the blueprint of life. To sustain life, cells must respond to changes in the environment. Gene expression, the process by which specific regions of the DNA (genes) are copied into messenger RNA (mRNA) molecules and then translated into proteins, is regulated, and this regulation determines the fate of a cell. It is known that various environmental factors (such as diet, stress, and social interaction) and biological factors often affect gene expression regulation indirectly. In this dissertation, we use machine learning approaches to predict how certain biological factors interfere indirectly with gene expression by changing specific properties of DNA. We expect our findings will help in understanding how these factors jointly influence gene expression.
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/94393 |
Date | 07 October 2019 |
Creators | Banerjee, Sharmi |
Contributors | Electrical Engineering, Tokekar, Pratap, Wu, Xiaowei, Baumann, William T., Kim, Inyoung, Vullikanti, Anil Kumar S. |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Dissertation |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |