
Analysis of genomic data to derive biological conclusions on (1) transcriptional regulation in the human genome and (2) antibody resistance in hepatitis C virus

High-throughput sequencing has become pervasive in all facets of genomic analysis. I developed computational methods to analyze high-throughput sequencing data and derive biological conclusions in two research areas: transcriptional regulation in mammals and the evolution of a virus under immune pressure.
To investigate transcriptional regulation, I integrated data from multiple experiments performed by the ENCODE consortium. First, my analysis revealed that transcription factors (TFs) prefer to bind GC-rich, histone-depleted regions. By comparing in vivo and in vitro nucleosome dynamics, I observed that while histones have an innate preference for binding GC-rich DNA, TF binding overrides this preference and produces a negative correlation between GC content and histone enrichment. In the next project, I found that the binding events of multiple TFs co-occur at genomic regions enriched in activating histone marks that are typically associated with gene enhancers and promoters, suggesting that these regions may be enhancers or may have transcription distal to a transcription start site (TSS). Lastly, I used supervised machine learning techniques to train models on histone enrichment signals and sequence features to predict transcriptional enhancers for validation in mouse transgenic assays.
In a post-clinical-trial exploratory analysis of hepatitis C virus (HCV), I traced the evolutionary path of the envelope proteins E1 and E2 in HCV-infected liver transplant patients in response to a novel antibody. I developed a systematic amino-acid-level analysis pipeline that quantifies differences in amino acid frequencies at each position between two time points. When I applied this method across all positions in the E1/E2 region and compared pre-liver-transplant and post-viral-rebound time points, mutations at two positions emerged as key to antibody evasion. Both of these mutations, N415K/D and N417S, were in the epitope targeted by the antibody but, surprisingly, did not co-occur. In post-rebound viral genomes that contain the N417S mutation but retain the wild-type variant at 415, N-linked glycosylation of 415 is another possible escape mechanism. Using the same analysis pipeline, I also identified additional candidate escape mutations outside the epitope, which could be potential therapeutic targets.
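The per-position comparison described above can be illustrated with a minimal Python sketch. It is not the thesis pipeline: the abstract does not say which difference measure was used, so the total variation distance here is an assumption, as are the function names `position_frequencies` and `frequency_shift`, the 0.5 threshold, and the invented toy peptides (which are not HCV data).

```python
from collections import Counter


def position_frequencies(sequences):
    """Return one {amino_acid: frequency} dict per alignment column."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    frequencies = []
    for column in zip(*sequences):
        counts = Counter(column)
        frequencies.append({aa: n / len(column) for aa, n in counts.items()})
    return frequencies


def frequency_shift(before, after):
    """Per-position total variation distance (0 to 1) between two time points.

    Assumption: the source does not name its difference measure; this is
    one simple choice for comparing two frequency distributions.
    """
    freq_before = position_frequencies(before)
    freq_after = position_frequencies(after)
    if len(freq_before) != len(freq_after):
        raise ValueError("both time points must share one alignment")
    shifts = []
    for fb, fa in zip(freq_before, freq_after):
        residues = set(fb) | set(fa)
        shifts.append(0.5 * sum(abs(fb.get(aa, 0.0) - fa.get(aa, 0.0))
                                for aa in residues))
    return shifts


# Invented toy alignments (hypothetical, not real viral sequences).
before = ["ACNDNE", "ACNDNE", "ACNDNE", "AVNDNE"]
after = ["ACKDNE", "ACKDNE", "ACNDSE", "ACNDSE"]

shifts = frequency_shift(before, after)

# Report 1-based positions whose shift meets a hypothetical threshold.
flagged = [i + 1 for i, s in enumerate(shifts) if s >= 0.5]
print(flagged)
```

In this toy input, the two flagged positions change in different sequences, which mirrors the kind of non-co-occurring substitutions the abstract reports; checking co-occurrence in real data would need a per-genome comparison rather than column frequencies alone.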

Identifier: oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/16067
Date: 08 April 2016
Creators: Iyer, Sowmya
Source Sets: Boston University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation