Computational Approaches to Study Post Transcriptional Regulation Events
Fahmi, Naima Ahmed, 01 January 2024
A simplistic understanding of the central dogma falls short in correlating the number of genes in the genome with the number of proteins in the proteome. Post-transcriptional regulation, including alternative splicing (AS) and alternative polyadenylation, contributes to the complexity of the proteome and is critical to understanding gene expression. In this dissertation, we aim to provide genome-wide detection and visualization of transcript variants and to quantify their significance in gene regulation. First, we propose AS-Quant, a robust computational program that identifies alternative splicing events from RNA-seq data. Our extensive experiments on simulated and real datasets demonstrate that AS-Quant accurately quantifies splicing events across different biological conditions and outperforms other widely used baselines. The mammalian target of rapamycin (mTOR) pathway is crucial in energy metabolism and cell proliferation. We further interrogated the mTOR-activated transcriptome and found that hyperactivation of mTOR promotes transcriptome-wide exon skipping/exclusion, producing short isoform transcripts from genes. Among the RNA processing factors differentially regulated by mTOR signaling, we found that SRSF3 mechanistically facilitates exon skipping in the mTOR-activated transcriptome. This analysis reveals the role of mTOR in AS regulation and demonstrates that widespread AS is a multifaceted modulator of the mTOR-regulated functional proteome.
Alternative polyadenylation (APA) can occur either in the coding region or in the 3'-untranslated region (3'-UTR) of a transcript. The 3'-UTR often serves as a binding platform for microRNAs and RNA-binding proteins. APA events in the 3'-UTR produce transcripts with shorter 3'-UTRs; they therefore provide a means to regulate gene expression at the post-transcriptional level and are known to promote translation. Current bioinformatics pipelines have limited capability in profiling 3'-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3'-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3'-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations. Additionally, we aim to capture novel APA events within the coding region, specifically those that occur in the introns of a transcript, referred to as intronic polyadenylation (IPA). IPA is a key mechanism that can significantly alter a transcript's coding potential by truncating its translated region, thereby enhancing transcriptome and proteome diversity. This truncation can produce novel protein isoforms from the same gene with altered peptide sequences, which are linked to disease development, including cancer. To detect and quantify de novo IPA events, we developed IPScan, a comprehensive computational pipeline for the precise identification and assessment of unannotated IPA events. IPScan has been benchmarked against other methods using simulated samples, data from various human and mouse cell lines, and TCGA breast cancer patient data.
Overall, this dissertation aims to provide researchers with a comprehensive analysis of transcript variants and their functions, supported by extensive methodologies and experimental observations.