Since the dawn of the genomics era, the genetic basis of numerous human disorders has been elucidated, which has led to improvements in targeted therapeutics. However, most research has focused on protein-coding genes, which account for only 2% of the genome, leaving much of the remainder relatively unstudied. In particular, microsatellites (MSTs), repetitive sequences consisting of tandem repeats of 1 to 6 bases, are known to be mutational hotspots and have been linked to diseases such as Huntington disease and Fragile X syndrome. This work represents a significant effort towards closing this knowledge gap. Specifically, we developed a next-generation sequencing-based enrichment method, along with a supporting computational pipeline, for detecting novel MST sequences in the human genome. Using this global MST enrichment protocol, we identified 790 novel sequences. Analysis of these novel sequences revealed previously unknown functional elements, demonstrating the protocol's potential to aid in completing the euchromatic DNA sequence.
We also developed a disease risk diagnostic using a novel target-specific enrichment method that produces high-resolution MST sequencing data with the potential to validate, for the first time, the link between MST genotype variation and cancer. Combined with publicly available exome datasets for non-small cell lung cancer and from the 1000 Genomes Project, the target-specific MST enrichment method uncovered a signature set of 21 MST loci that can differentiate between lung cancer and non-cancer control samples with a sensitivity of 0.93.
Finally, to understand the molecular causes of MST instability, we analyzed genomic variants and gene expression data for an autosomal recessive disorder, Fanconi anemia (FA). This first-of-its-kind study quantified the heterogeneity of FA cells and demonstrated that FA cells, which are deficient in DNA crosslink repair, could serve as a suitable system for further study of the causes of MST instability.
Ph. D.
The field of genetics has grown substantially since the Human Genome Project was declared complete in 2003. The Human Genome Project produced the first framework for the human DNA sequence, the human genome. With this framework available, understanding of the genetic basis of a number of diseases has grown significantly, which has resulted in better methods of clinical diagnosis and treatment. While the current focus on the genomic regions responsible for making proteins has undeniably helped, it has also created a gap in knowledge. Protein-coding regions account for only 2% of the human genome, whereas a large part (47%) of the genome is occupied by repetitive DNA. DNA sequences can be complex, with the nucleotides arranged in no particular order, e.g. ATCGTACGA, or repetitive, e.g. ATATATATAT. Repetitive sequences with repeating units of 1 to 6 bases are called microsatellites (MSTs). MSTs have been shown to be unstable, and they have been linked to diseases such as Huntington disease and Fragile X syndrome.
This work helps to close this knowledge gap by developing molecular methods and computational tools focused on identifying MST variations. Research conducted with this aim has resulted in three major accomplishments. One, we developed novel molecular and computational methods, which we used to detect 790 previously unknown sequences in the human genome. This work demonstrated the ability of our method to uncover functional elements in the human genome that can potentially answer numerous biological questions. Two, we developed another novel method for producing high-resolution MST sequence data that not only can improve MST research in general but also shows potential for the development of new genetic diagnostics and cancer therapeutics. We identified a signature set of 21 MST sequences that can differentiate between lung cancer patient genomes and non-cancer control genomes. These results represent the first potential validation of a proposed link between MST sequence length (genotype) variation and cancer. Three, we attempted to understand a possible molecular cause and the consequences of MST instability in a disease called Fanconi anemia. The results of this work not only quantify, for the first time, the effects of this disease on the genome but also establish Fanconi anemia as a suitable system for studying MST instability in detail.
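To make the definition of a microsatellite concrete, the following is a minimal sketch of how perfect tandem repeats with units of 1 to 6 bases might be located in a DNA string. It is an illustration only and is not the computational pipeline developed in this dissertation. The function name `find_microsatellites` and the `min_copies` threshold of 3 are assumptions introduced for the example; the abstract does not specify a minimum number of repeat copies.

```python
import re


def find_microsatellites(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Hypothetical helper for illustration; min_copies is an assumed
    threshold and must be at least 2 for the pattern to be meaningful.
    """
    seq = seq.upper()
    # The lazy quantifier tries the shortest unit first, so "AT" is
    # reported rather than "ATAT". The backreference \1 then requires
    # at least (min_copies - 1) further exact copies of that unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits


# The two example sequences from the abstract.
print(find_microsatellites("ATATATATAT"))  # repetitive: [(0, 'AT', 5)]
print(find_microsatellites("ATCGTACGA"))   # complex: []
```

This sketch only reports exact repeats. Real sequencing data contain interrupted or imperfect repeats and read errors, which a regular expression of this kind does not handle.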
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/85506 |
Date | 02 May 2017 |
Creators | Velmurugan, Karthik Raja |
Contributors | Animal and Poultry Sciences, Bevan, David R., Garner, Harold Ray, Lawrence, Christopher B., Michalak, Pawel |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Dissertation |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |