11 |
Analysis of nonsense-mediated decay targeted RNA (nt-RNA) in high-throughput sequencing data / CUHK electronic theses & dissertations collection January 2015 (has links)
Nonsense-mediated mRNA decay (NMD) is an important protective mechanism that guards against erroneous transcripts, particularly mRNA transcripts containing premature termination codons (PTCs). In classical teaching, such erroneous transcripts (called nonsense-mediated decay targeted RNA, nt-RNA, here) are considered incidental, non-specific side products of the cellular transcription machinery; they are rapidly cleared by NMD and thus exist in scant quantity inside a cell (i.e. at a very low steady-state abundance). As side products of stochastic transcriptional error, they are also commonly considered to carry no biological function. / Analysis of a large collection of RNA-seq data from TCGA (over 4000 samples, occupying over 50 TB of disk storage) showed that nt-RNA were produced in large amounts for some genes; sometimes they were even more abundant than the normal transcripts of the corresponding genes. / Based on the hypothesis that some nt-RNA are specifically produced by a biological process (rather than by chance), the aims of this work are: 1) to quantify the expression of nt-RNA (survey of the spectrum); 2) to examine the relationship between nt-RNA and protein expression (biological roles); 3) to detect nt-RNAs that affect the prognosis of cancer (biological roles); 4) to apply nt-RNA as diagnostic biomarkers for cancer (application); 5) to identify nt-RNAs that classify tumors of unknown primary (CUP, application). / Firstly, nt-RNA were defined from gene databases, and all PTC-containing transcripts were compared with their corresponding normal transcripts to locate specific signature tags (both short sequence segments and splice junctions) for each nt-RNA. All RNA reads in the RNA-seq datasets were then searched for these signature tags and the occurrences counted. This search produced the read count of each nt-RNA signature tag; all RNA reads containing such tags are targets for NMD.
RNA-seq datasets used in this study included TCGA normal samples, TCGA tumor samples and cancer cell lines for 13 cancer types. / In the example of KIRC, most differentially expressed nt-RNA (tumor vs control) were related to differential expression of the corresponding normal transcripts. However, 900 genes produced nt-RNA independently of higher production of the normal transcripts. In KIRC, a collection of 12 genes in the proteasome ubiquitination pathway stood out among the highly produced nt-RNA. This finding is notable because VHL-HIF1A is a key oncogenesis mechanism in KIRC and normal HIF1A degradation requires the proteasomal ubiquitination pathway. GO analysis was highly significant (p-value < 4.11E-05). The nt-RNA-producing genes were PSMB4, PSMD14, PSMC6, PSMD13, PSMB1, VCP, ANAPC5, PSMA4, PSMD3, ANAPC7, OS9 and GCLC. / Secondly, some nt-RNA retarded translation of the normal transcripts. Using proteome data, the relationship between the quantity of nt-RNA unique tags and the normal protein product was analyzed by ANOVA comparison of linear models. It was found that 422 nt-RNA unique tags influenced protein expression, which suggests a potential biological action of these nt-RNA. PTEN also produced nt-RNA in KIRC, and tumor cells with higher PTEN nt-RNA had a lower PTEN protein level (p-value of ANOVA comparison of linear models: 0.017). Survival analysis showed that PTEN nt-RNA levels affected survival, which suggests that they can be used as a biomarker for prognosis. Furthermore, survival analysis using clinical data was performed for the other nt-RNA unique tags that affected protein expression. / Thirdly, the application of nt-RNA as diagnostic markers and as markers to define tumor origin in CUP was examined. nt-RNA were identified in different types of tumors. Here, only nt-RNA that were independent of the normal gene transcripts in terms of differential expression were used as biomarkers.
By comparing tumor samples with normal samples, nt-RNAs that serve as diagnostic markers were detected. Unsupervised clustering was performed for these nt-RNAs, and heat maps showed a high degree of separation between tumor and normal samples. For studying tumor origin in CUP, in both a cross-validation study in the training dataset (N=541) and external validation in an independent sample set (N=2462), a highly discriminating set of nt-RNAs was defined for most cancers examined (400 nt-RNA seq. tags). Unsupervised clustering was performed for the 400 nt-RNA seq. tags, and heat maps showed their power to define tumor origin in CUP. The significance of the classifier formed by the 400 nt-RNA seq. tags was then measured by resampling the training set 100 times. Across the 100 resamplings, the rate of correctly classified instances was 96.4895% ± 0.75% (mean ± standard deviation) for the training set and 91.0239% ± 1.032611% for the validation set. / In conclusion, this study showed that nt-RNA can have important biological functions and be used for various applications. They are potential biomarkers for the diagnosis and prognosis of diseases. They can also be used to determine the site of origin of tumors, which indicates that nt-RNA can provide valuable information for cancer diagnosis and for determining the origin of cancer of unknown primary site (CUP).
[With diagram] / Hu, Fuyan. / Thesis Ph.D. Chinese University of Hong Kong 2015.
/ Includes bibliographical references (leaves 173-211). / Abstracts also in Chinese. / Title from PDF title page (viewed on 12 October 2016). / Detailed summary in vernacular field only.
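The signature-tag counting step described in the abstract above (searching every RNA read for nt-RNA-specific tags and tallying the hits) can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the function name `count_signature_tags`, the tag names, and the toy sequences are all hypothetical, and exact substring matching is an assumed simplification that ignores sequencing errors, reverse complements, and splice-junction handling.

```python
def count_signature_tags(reads, tags):
    """Count, for each tag, the number of reads containing it as an exact substring.

    reads: iterable of read sequences (strings)
    tags:  dict mapping a tag name to its signature sequence
    A read that contains the same tag more than once is counted once.
    """
    counts = {name: 0 for name in tags}
    for read in reads:
        for name, seq in tags.items():
            if seq in read:
                counts[name] += 1
    return counts


# Hypothetical toy data, for illustration only.
tags = {"GENE_A_ntRNA": "ACGTTG", "GENE_B_ntRNA": "GGGCCC"}
reads = ["TTACGTTGAA", "ACGTTGACGTTG", "CCCCGGGA", "AAGGGCCCTT"]

tag_counts = count_signature_tags(reads, tags)
for name, n in tag_counts.items():
    print(name, n)
```

The nested loop is quadratic in reads times tags; for datasets at the scale the abstract mentions, an indexed approach (for example, hashing fixed-length k-mers) would presumably be needed, but that is beyond this sketch.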
|
12 |
Solving repeat problems in shotgun sequencing /Arner, Erik, January 2006 (has links)
Diss. (summary) Stockholm : Karolinska Institutet, 2006. / With 3 papers.
|
13 |
Small intron definition of MVM pre-mRNAs /Haut, Donald David, January 1998 (has links)
Thesis (Ph. D.)--University of Missouri--Columbia, 1998. / "July 1998." Typescript. Vita. Includes bibliographical references (leaves 111-119). Also available on the Internet.
|
14 |
Protein sequence constraints / Lavelle, Daniel Thor. January 2009 (has links)
Thesis (Ph. D.)--University of Virginia, 2009. / Title from title page. Includes bibliographical references. Also available online through Digital Dissertations.
|
15 |
DNA Sequences Involved in Immunoglobulin Germ-line C [alpha] Gene Transcription: a Thesis / Lin, Yi-chaung A. 01 June 1992 (has links)
Expression of germ-line α transcripts precedes class switching to IgA, and therefore study of the regulation of germ-line α RNA transcription is important for understanding the class switching process. Transforming growth factor β1 (TGFβ1) increases the transcription of the Ig constant region α gene and class switching to IgA in normal B cells and in the I.29μ B lymphoma cell line. The structure of germ-line α transcripts in I.29μ cells was analyzed by RNase protection and primer extension assays. Two initiation sites for germ-line α transcripts were identified 2 kb upstream of the α switch region. No TATA or Sp1 elements are found near the RNA initiation sites. The DNA segment located 5' to the initiation sites of germ-line α RNA can drive expression of a luciferase reporter gene when transiently transfected into I.29μ (subclone 22D) and A20.3 cell lines. Full constitutive expression requires no more than 106 bp of the 5' flanking segment. In deletion and substitution mutation studies, an ATF/CRE site residing within this region is very important for constitutive expression of the germ-line α promoter, but mutation of this motif does not diminish TGFβ1 inducibility. Induction by TGFβ1 requires additional sequences residing between -128 and -106 relative to the first RNA initiation site. Two copies of a tandemly repeated sequence 5' CACAG(G) CCAGAC 3' (termed Igα TGFβ-RE) are located in the region from -127 to -105. An oligonucleotide containing multimers of these repeats could confer TGFβ1 inducibility on a heterologous promoter. An additional copy of the TGFβ-RE was identified at -41/-30, and its deletion reduced the TGFβ1 response. Thus, tandem repeats of a novel TGFβ-RE are the positive regulatory elements for the TGFβ1 response. Gel mobility shift assays demonstrated specific binding to the TGFβ-RE by nuclear factors, but the binding activity was not enhanced by TGFβ1.
This study supports previously published evidence that TGFβ1 directs class switching to IgA through induction of germ-line Cα gene transcription.
|
16 |
NMR investigations of strand slippage in CTG repeat expansion and primer-template misalignment in low fidelity DNA replication. / CUHK electronic theses & dissertations collection January 2007 (has links)
The CTG repeat is one of the most common triplet repeat sequences found to form slipped-strand structures that lead to self-expansion during DNA replication. The lengthening of these repeats causes the onset of neurodegenerative diseases such as myotonic dystrophy. By designing a series of CTG repeat sequences with high hairpin populations, a systematic analysis of imino and methyl proton spectra was carried out to investigate how the length and structure of CTG repeats affect the propensity for hairpin formation. Direct NMR evidence has been obtained to support three types of hairpin structures in sequences containing one to ten CTG repeats. The differences in loop structures and extent of interactions observed in the hairpins account for the differences in hairpin formation propensity and explain how the slippage that leads to triplet repeat expansion occurs. / DNA has been found to adopt unusual structures leading to different types of mutations, which can ultimately cause genetic diseases and cancers. In this thesis, investigations of (i) the structural role of CTG repeats in trinucleotide repeat expansion, (ii) primer-template structures in strand slippage during low-fidelity replication and (iii) the sequence effect of the nucleotide downstream of thymine templates on primer-template structures have been carried out using NMR spectroscopy. / In addition, NMR structural investigations have been carried out to determine solution structures of primer-template models. NMR evidence confirms that misalignment can occur in primer-templates upon misincorporation of a dNTP opposite a template sequence, leading to bulge formation in the primer-template. Depending on the template sequence, further incorporation of dNTP can bring about either realignment or further stabilization of the primer-template structure. Consequently, either mismatch or deletion errors will occur, leading to base substitution or frameshift mutation.
These results imply that DNA sequences play not only a passive role in storing genetic information during replication but also an active structural role in governing the types of mutation that arise during low-fidelity DNA replication. / Some of the results in this thesis have been reported in the following peer-reviewed journals: (1) Chi, L. M. and Lam, S. L. (2005) Structural roles of CTG repeats in slippage expansion during DNA replication. Nucleic Acids Res., 33, 1604-1617. (2) Chi, L. M. and Lam, S. L. (2006) NMR investigation of DNA primer-template models: structural insights into dislocation mutagenesis in DNA replication. FEBS Lett., 580, 6496-6500. (3) Chi, L. M. and Lam, S. L. (2007) NMR investigation of primer-template models: structural effect of sequence downstream of a thymine template on mutagenesis in DNA replication. Biochemistry, 46, 9292-9300. / Chi, Lai Man. / "August 2007." / Adviser: Lam Sik Lok. / Source: Dissertation Abstracts International, Volume: 69-02, Section: B, page: 0877. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 102-112). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
|
17 |
Triplex formation as monitored by EPR spectroscopy and molecular dynamics studies of spin-probe labeled DNAs / Darian, Eva. January 2002 (has links)
Thesis (Ph. D.)--West Virginia University, 2002. / Title from document title page. Document formatted into pages; contains xi, 121 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 113-115).
|
18 |
An analysis of genetic determinants that govern exon definition and alternative splicing of minute virus of mice (MVM) pre-mRNAs /Gersappe, Anand January 1998 (has links)
Thesis (Ph. D.)--University of Missouri--Columbia, 1998. / "July 1998." Typescript. Vita. Includes bibliographical references (leaves 215-225). Also available on the Internet.
|
19 |
Small intron definition of MVM pre-mRNAs / Haut, Donald David, January 1998 (has links)
Thesis (Ph. D.)--University of Missouri--Columbia, 1998. / Typescript. Vita. Includes bibliographical references (leaves: 111-119). Also available on the Internet.
|
20 |
Identification and characterization of mitochondrial genome concatemers in AIDS-associated lymphomas and lymphoma cell lines /Bedoya, Felipe. January 2009 (has links)
Dissertation (Ph.D.)--University of South Florida, 2009. / Includes vita. Includes bibliographical references.
|