High-throughput sequencing has become pervasive in all facets of genomic analysis. I developed computational methods to analyze high-throughput sequencing data and derive biological conclusions in two research areas -- transcriptional regulation in mammals and evolution of virus under immune pressure.
To investigate transcriptional regulation, I integrated data from multiple experiments performed by the ENCODE consortium. First, my analysis revealed that Transcription Factors (TFs) prefer to bind GC-rich, histone-depleted regions. By comparing in vivo and in vitro nucleosome dynamics, I observed that while histones have an innate preference for binding GC-rich DNA, TF binding overrides this preference and produces a negative correlation between GC content and histone enrichment. In the next project, I found that the binding events of multiple TFs co-occur at genomic regions enriched in activating histone marks that are typically associated with gene enhancers and promoters, suggesting that these regions may be enhancers or have TSS-distal transcription. Lastly, I used supervised machine learning techniques to train histone enrichment signals and sequence features to predict transcriptional enhancers to be validated in mouse-transgenic assays.
In a post-clinical trial exploratory analysis of Hepatitis C Virus (HCV), I traced the evolutionary path of the envelope proteins E1 and E2 in HCV-infected liver transplant patients, in response to a novel antibody. I developed a systematic amino acid-level analysis pipeline that quantifies differences in amino acid frequencies in each position between two time points. Upon applying this method across all positions in the E1/E2 region and comparing pre-liver-transplant and post-viral-rebound time points, mutations in two positions emerged as being key to antibody evasion. Both these mutations--N415K/D and N417S--were in the epitope targeted by the antibody, but surprisingly, did not co-occur. In post-rebound viral genomes that contain the N417S mutation but retain the wild-type variant at 415, N-linked glycosylation of 415 is another possible escape mechanism. Using the same analysis pipeline, I also identified additional candidate escape mutations outside the epitope, which could be potential therapeutic targets.
Identifer | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/16067 |
Date | 08 April 2016 |
Creators | Iyer, Sowmya |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |
Page generated in 0.0019 seconds