41 |
Faktory ovlivňující odpověď kolorektálního karcinomu na chemoterapeutickou léčbu / The study of the factors affecting colorectal cancer chemotherapy. Dolníková, Alexandra. January 2019
Cytotoxic chemotherapy remains the essential treatment strategy in advanced colorectal cancer. Intrinsic and acquired drug resistance is one of the reasons cancer therapy may fail. The DNA damage response pathways have been shown to play an important role in the development of chemoresistance. There is sufficient evidence that the expression of many DNA repair genes is frequently deregulated across multiple cancer types. An example of such a gene in colorectal cancer is MRE11, which encodes a protein known as a sensor of DNA double-strand breaks. In 2016, our group at the Department of Molecular Biology of Cancer (IEM CAS, Prague) published a substantial study analysing the association of polymorphisms in predicted microRNA target sites of double-strand break (DSB) repair genes, including MRE11, with clinical outcome and efficacy of chemotherapy in colorectal cancer. Our hypothesis, based on that study, is that specific, precisely defined microRNAs able to regulate certain DNA repair proteins may affect not only the survival of colorectal cancer cells but also their sensitivity to chemotherapy. In the practical part of the submitted thesis we have identified miR-140 as a potential regulator of...
|
42 |
Identificação de RNAs longos não-codificadores de proteínas regulados por micro-RNAs / Identification of long non-protein coding RNAs regulated by micro-RNAs. Murilo Sena Amaral. 18 December 2013
Recent studies have revealed that the largest fraction of the transcripts generated in human cells is composed of non-protein-coding RNAs (ncRNAs). A portion of these ncRNAs comprises the class of short RNAs, which are less than 200 nucleotides in length. Micro-RNAs (miRNAs) are part of this class and are of great interest, as they are predicted to target over 60% of human messenger RNAs (mRNAs). Another class of ncRNAs is composed of long ncRNAs (lncRNAs, longer than 200 nucleotides), which are transcribed from intergenic and intronic regions of the human genome and have several functions, many of them related to the control of mRNA expression. Recently, the structure and function of lncRNAs have been characterized. However, very little is known about the mechanisms by which lncRNAs are regulated. This work aimed to evaluate whether lncRNAs are regulated by miRNAs in human cells. For this purpose, we identified lncRNAs bound to the RNA-induced silencing complex (RISC) in HeLa cells using a method developed here for the generation of strand-specific cDNA libraries for large-scale RNA sequencing on the 454/Roche platform. In parallel, we sequenced the miRNAs bound to RISC in these same cells. Our results show that hundreds of lncRNAs from diverse classes are bound to RISC in HeLa cells, along with thousands of mRNAs and several hundred miRNAs. Among the miRNAs, we identified 37 that are predicted to target the detected lncRNAs. These miRNAs are possible regulators of the lncRNAs, and therefore our work establishes an experimental map of direct interactions between lncRNAs and miRNAs. TUG1, a lincRNA known to be involved in the regulation of genes related to apoptosis and the cell cycle, stands out among the lncRNAs found bound to RISC. We showed by miRNA over-expression and qPCR that TUG1 is regulated by miRNA-148b, one of the miRNAs detected in our sequencing, which has a binding site that is highly conserved in mammals and located at the 3′ end of TUG1. Taken together, our results contribute to the understanding of the regulation of lncRNA expression levels in human cells and open perspectives for the modulation of miRNAs as a strategy to regulate the levels and functions of lncRNAs.
|
43 |
Genome-wide analysis of the hypoxic breast cancer transcriptome using next generation sequencing. Choudhry, Hani. January 2014
Hypoxia pathways are associated with the pathogenesis of both ischaemic and neoplastic diseases. In response to hypoxia the transcription factor hypoxia-inducible factor (HIF) induces the expression of hundreds of genes with diverse functions. These enable cells to adapt to low oxygen availability. To date, pan-genomic analyses of these transcriptional responses have focussed on protein-coding genes and microRNAs. However, the role of other classes of non-coding RNAs, in particular lncRNAs, in the hypoxia response is largely uncharacterised. My thesis aimed at improving understanding of the transcriptional regulation of the non-coding transcriptome in hypoxia. I performed an integrated genomic analysis of both non-coding and coding transcripts by massively parallel sequencing. This was interfaced with pan-genomic analyses of DNase hypersensitivity and of HIF, H3K4me3 and RNApol2 binding in hypoxic cells. These analyses revealed that hypoxia profoundly regulates all RNA classes. snRNAs and tRNAs are globally downregulated in hypoxia, whilst miRNAs, mRNAs and lncRNAs are both up- and downregulated with an overall trend towards slight upregulation. In addition, a significant number of previously non-annotated (and largely hypoxia-upregulated) transcripts were identified, including novel intergenic transcripts and natural antisense transcripts. HIF bound close to genes for mRNAs, miRNAs and lncRNAs that were upregulated by hypoxia, but was excluded from binding at genes for RNA classes that showed global downregulation. This suggests that HIF acts as a transcriptional activator (but not repressor) of lncRNAs as well as mRNAs and miRNAs. Consistent with direct regulation by HIF, many of these hypoxia-inducible, HIF-binding lncRNAs were downregulated following HIF knockdown. Analysis of RNApol2 binding and DNase hypersensitivity signals at HIF transcriptional target genes indicated that HIF-dependent transcriptional activation occurs through release of RNApol2 that is pre-bound to open promoters of lncRNAs as well as mRNAs. In these datasets, NEAT1 was the most hypoxia-upregulated, HIF-targeted lncRNA in MCF-7 cells and, despite binding of both HIF-1 and HIF-2 isoforms at its promoter, was selectively regulated by HIF-2 alone. Furthermore, NEAT1 was induced by hypoxia in a wide range of breast cancer cell lines and in hypoxic xenograft models. Functionally, NEAT1 is required for the assembly of nuclear paraspeckle structures. Increased nuclear paraspeckle formation was observed in hypoxia and was dependent on both NEAT1 and HIF-2. Knockdown of hypoxia-induced NEAT1 significantly reduced cell proliferation and survival and induced apoptosis. Finally, high expression of NEAT1 correlated with poor clinical outcome in a large cohort of breast cancer patients. These findings extend the role of the hypoxic transcriptional response in cancer into the spectrum of non-coding transcripts and provide new insights into the molecular roles of hypoxia-regulated lncRNAs, which may provide the basis for novel therapeutic targets in the future.
|
44 |
Rôle fonctionnel des longs ARN non codants dans l'adaptation et la pluripotence des cellules souches en culture. / Functional roles of long non coding RNAs in pluripotency and adaptation of stem cells in culture. Bouckenheimer, Julien. 16 December 2016
The current and future applications of human pluripotent stem cells (PSC) in the biomedical field are highly promising. Their use for the discovery of new therapeutic drugs through high-throughput screening, cytotoxicity tests and in vitro disease modeling adds to their tremendous potential in regenerative medicine and cell therapy. Whether PSC serve as biological material to restore, partially or totally, the lost functions of a damaged organ or tissue, or as a source of normal cells to study human development or test putative new drugs, their genomic integrity has to be thoroughly assessed. Culture conditions therefore have to be optimized in order to monitor for genomic instability and prevent its emergence. Any genetic or epigenetic alteration resulting from cell culturing must be detected in order to define and characterize acceptance criteria for scientific and medical purposes.
PSC are particularly sensitive to stress resulting from inappropriate passaging techniques, which cause rapid genetic drift that can appear from the first passages. Indeed, our team observed that major karyotypic (trisomies) and sub-karyotypic (SNP) abnormalities arise rapidly from aggressive single-cell, enzyme-based passaging methods (TrypLE), and that substantial phenotypic changes, such as increased survival after cell dissociation, variation in cell shape and acquired motility, can then occur.
In order to understand the mechanisms governing the emergence of these adverse alterations, the team focused on the consequences of the adaptation of PSC to single-cell dissociation. Using next-generation sequencing (RNA-Seq), we compared the transcriptomes of PSC passaged by standard techniques (such as mechanical passaging) with those of PSC passaged by single-cell enzymatic dissociation (such as TrypLE-based single-cell passaging). This comparison showed that the most striking difference in gene expression between adapted and non-adapted cells was the dramatic overexpression of RNAs from a recently described class: long non-coding RNAs (lncRNAs).
The aim of this thesis work was to determine to what extent some of these lncRNAs are functionally linked to the adaptation of PSC. To address this, we first determined in silico which lncRNAs were upregulated by single-cell dissociation. After experimental validation of the most promising lncRNA candidates by molecular biology, we performed functional in vitro analyses (notably siRNA-mediated loss of function) and determined their cellular localization in order to decipher their role in the cellular machinery and in pluripotency. Alongside this main project, several auxiliary projects were undertaken. The observation of major changes in cell phenotype and behavior led us to investigate the global mechanisms governing these modifications, which pointed to a potential role of an epithelial-to-mesenchymal transition (EMT) provoked by single-cell dissociation. Finally, the appeal of lncRNAs as a recent research topic and the rapidly growing literature on non-coding RNAs prompted the writing of an extensive review and meta-analysis on the involvement of lncRNAs in early embryo development and in pluripotent stem cells.
|
45 |
Identificação de perfis de expressão de RNAs codificadores e não codificadores de proteína como preditores de recorrência de câncer de próstata / Identification of protein-coding and non-coding RNA expression profiles as prognostic marker of prostate cancer biochemical recurrence. Yuri José de Camargo Barros Moreira. 27 August 2010
Prostate cancer is the fifth most common type of cancer in the world, and the most common in men. Clinical and anatomo-pathological factors currently used in the clinic are not able to distinguish between indolent and aggressive disease. There is a major need for new prognostic markers in order to improve the clinical management of prostate cancer patients. Apart from abnormalities in protein-coding genes, changes in non-coding RNAs (ncRNAs) contribute to the pathogenesis of cancer and thus represent another potential source of prostate cancer biomarkers. However, few studies of ncRNA expression profiles have been published to date. This project aimed to identify expression profiles of protein-coding and non-coding genes correlated with prostate cancer biochemical recurrence, in order to generate a prognostic profile with potential use as a biomarker and to elucidate the possible role of ncRNAs in cancer development. For this, we analyzed the expression profiles of 42 prostate cancer tumor tissue samples from patients who underwent radical prostatectomy, with long clinical follow-up (five years) and known disease outcome. We used a custom microarray designed by us and manufactured by Agilent that probes 18,709 long (>500 nt) ncRNAs with no evidence of splicing, mapping to intronic regions within 5,660 genomic loci. The expression data were extracted from each array and normalized across all 42 samples. Using a multiple random sampling validation strategy, we identified an expression profile of poor prognosis comprising 51 intronic ncRNAs. The prognostic profile of ncRNAs was applied to an independent test set of 22 patients, correctly classifying 82% of the samples. A Kaplan-Meier analysis of the test set indicated that the survival curves of the high- and low-risk groups were significantly different (log-rank test p = 0.0009; hazard ratio = 23.4, 95% CI = 3.62 to 151.2), thus confirming that this classifier is useful for identifying patients at high risk of recurrence. Furthermore, these findings indicate a potential role of these intronic non-coding RNAs in the progression of prostate tumors and point to intronic ncRNAs as potential new markers of cancer.
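The Kaplan-Meier comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard product-limit estimator; the function name `kaplan_meier` and the follow-up times below are hypothetical and made up for illustration, and they are not the thesis data or the author's code.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate, returned as [(time, S(time)), ...].

    times  -- follow-up time of each patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    """
    recurrences = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)        # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = recurrences.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients without recurrence.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy follow-up times in months (invented, NOT from the thesis).
high_risk = kaplan_meier([6, 12, 12, 30, 60], [1, 1, 1, 1, 0])
low_risk = kaplan_meier([24, 60, 60, 60, 60], [1, 0, 0, 0, 0])
```

A log-rank test, as reported in the abstract, would then compare the two curves; it is omitted here to keep the sketch short.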
|
46 |
Biochemical Mechanism of Gene Expression Silencing by piRNA-directed PIWI-Clade Argonautes. Arif, Amena. 10 August 2021
Argonaute proteins are small DNA/RNA-guided endonucleases found in all domains of life. In animals, small RNAs of length 21–35 nucleotides direct the PIWI-clade of Argonautes to silence complementary target RNAs; these are called PIWI-interacting RNAs (piRNAs). During spermatogenesis in mice, piRNA-guided PIWI proteins, MIWI2, MILI, and MIWI, silence transposons, regulate expression of protein-coding genes and are necessary for fertility. A working endonuclease activity of MIWI and MILI is essential to complete spermatogenesis. Yet, both MIWI and MILI produce weak and slow target cleavage in vitro, thwarting biochemical examination of the silencing step. Here, we find that PIWI proteins require an auxiliary protein to efficiently cleave their targets, unlike any other known Argonaute. Gametocyte Specific Factor 1 (GTSF1) is a conserved zinc-finger protein essential for fertility and piRNA-directed silencing. We show GTSF1 accelerates the pre-steady-state rate of target cleavage by MIWI and MILI; this role of GTSF1 is also preserved in insects. A critical step in GTSF1 mechanism entails binding RNA. GTSF1 allowed detailed kinetic analyses of catalytic PIWIs: they require extensive 3′ complementarity between the guide and target to efficiently cleave them, but this base-pairing also limits turnover. Interestingly, within a species, different PIWI proteins have unique kinetic properties. In sum, our findings provide molecular mechanisms of GTSF1 function and target silencing by PIWIs as well as a useful method for future studies.
|
47 |
Adaptive Evolution of Long Non-Coding RNAs. Walter Costa, Maria Beatriz. 07 December 2018
The chimpanzee is the closest living species to modern humans. Although the differences in phenotype between these two species are striking, the difference in genomic sequences is surprisingly small. Species-specific changes and positive selection have mostly been found in proteins, but ncRNAs are also involved, including the largely uncharacterized class of long ncRNAs (lncRNAs). A notable example is the Human Accelerated Region 1 (HAR1), the region in the human genome with the highest number of human-specific substitutions: 18 in 118 nucleotides. HAR1 is located in a pair of overlapping lncRNAs that are expressed in a crucial period for brain development. Importantly, structural rather than sequence constraints drive the evolution of many ncRNAs. Different methods have been developed for detecting negative selection in ncRNA structures, but none thus far for positive selection.
This motivated us to develop a novel method: the SSS-test (Selection on the Secondary Structure test). It uses an excess of changes that alter the structure as a means of identifying positive selection. This is done using reports from RNAsnp, a tool that quantifies the structural effect of SNPs on RNA structures, and by applying multiple correction on the observations to generate selection scores. Insertions and deletions (indels) are dealt with separately using rank statistics and a background model. The scores for SNPs and indels are combined to calculate a final selection score for each of the input sequences, indicating the type of selection. We benchmarked the SSS-test with biological and synthetic datasets, obtaining coherent signals. We then applied it to a lncRNA database and obtained a set of 110 human lncRNAs as candidates for having undergone adaptive evolution in humans.
Although lncRNAs have poor sequence conservation, they have conserved splice sites, which provide ideal guides for orthology annotation. To provide an alternative method for assigning orthology for lncRNAs, we developed the 'buildOrthologs' tool. It uses as input a map of ortholog splice sites created by the SpliceMap tool and applies a greedy algorithm to reconstruct valid ortholog transcripts. We applied this novel approach to create a well-curated catalog of lncRNA orthologs for primate species.
Finally, to understand the structural evolution of ncRNAs in full detail, we added a temporal aspect to the analysis. In what order did the mutations of a structure occur since its origin? This is a combinatorial problem, in which the exact mutations between ancestral and extant sequences must be put in order. For this, we developed the 'mutationOrder' tool using dynamic programming. It calculates every possible order of mutations and assigns probabilities to every path. We applied this novel tool to HAR1 as a case study and saw that the co-optimal paths, which are equally likely to have occurred, share qualitatively comparable features. In general, they lead to stabilization of the human structure relative to the ancestral one. We propose that this stabilization was caused by adaptive evolution.
With the new methods we developed and our analysis of primate databases, we gained new knowledge about adaptive evolution of human lncRNAs.
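The idea of ordering mutations with dynamic programming, as described for 'mutationOrder', can be sketched as follows. This is only an illustration under stated assumptions and not the tool's actual algorithm or interface: `path_weights`, `step_weight` and `toy_weight` are hypothetical names, and the weights are arbitrary numbers rather than structure-based scores.

```python
def path_weights(n_mutations, step_weight):
    """Summed and best path weight over all orders of n mutations.

    A state is a bitmask of the mutations already applied.
    step_weight(state, i) is a hypothetical, user-supplied weight for
    applying mutation i in that state.
    """
    full = (1 << n_mutations) - 1
    total = [0.0] * (full + 1)  # summed weight of all paths reaching a state
    best = [0.0] * (full + 1)   # weight of the best single path to a state
    total[0] = best[0] = 1.0
    # Every predecessor of a state is numerically smaller, so a plain
    # ascending loop visits states in a valid order.
    for state in range(full + 1):
        for i in range(n_mutations):
            bit = 1 << i
            if state & bit:
                continue  # mutation i is already applied
            w = step_weight(state, i)
            nxt = state | bit
            total[nxt] += total[state] * w
            best[nxt] = max(best[nxt], best[state] * w)
    return total[full], best[full]


def toy_weight(state, i):
    # Arbitrary toy rule: mutation 0 is favoured as the very first step.
    return 2.0 if state == 0 and i == 0 else 1.0


total, best = path_weights(3, toy_weight)
best_path_probability = best / total
```

Dividing a path's weight by the summed weight turns it into a probability, and the subset-based table avoids enumerating all n! orders explicitly.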
|
48 |
Genetic and Functional Studies of Loci Associated with Atrial Fibrillation. Gore Panter, Shamone Robinette. January 2014
No description available.
|
49 |
Biogênese, estabilidade e localização sub-celular de RNAs não-codificadores longos expressos em regiões intrônicas do genoma humano / Biogenesis, stability and sub-cellular localization of long non-coding RNAs expressed in intronic regions of the human genome. Oliveira, Ana Carolina Ayupe de. 26 March 2012
Recent studies have shown that most of the mammalian transcriptome is composed of non-protein-coding RNAs (ncRNAs). Our group has identified and characterized long (>200 nt), unspliced ncRNAs expressed in intronic regions of protein-coding genes. However, the biogenesis, processing, stability and subcellular localization of this class of RNAs remain unknown. The aims of this work were i) to investigate the contribution of RNA Polymerase II (RNAP II) to the transcription of intronic ncRNAs, ii) to evaluate the half-life of these ncRNAs relative to mRNAs, and iii) to determine their subcellular distribution. Our results indicate that intronic ncRNAs are predominantly transcribed by RNAP II from promoter regions functionally similar to those that control the transcription of mRNAs. Stability assays revealed that, on average, intronic ncRNAs have a half-life equal to or greater than (3.4 h to 4.2 h) that of mRNAs (3.1 h). The majority of intronic ncRNAs have a 5′ cap structure, suggesting that these transcripts are stabilized, possibly to exert roles in the biology of the cell that do not depend on rapid turnover. Although intronic ncRNAs do not encode proteins, most of these transcripts are exported to the cytoplasm, which indicates that they may perform some biological function in this compartment. Altogether, this study provides novel information regarding the biogenesis, stability and subcellular localization of intronic ncRNAs expressed in human cells, thus advancing the knowledge of this class of cellular transcripts.
|
50 |
Métodos de validação tradicional e temporal aplicados à avaliação de classificadores de RNAs codificantes e não codificantes / Traditional and time validation methods applied to the evaluation of coding and non-coding RNA classifiers. Sá, Clebiano da Costa. 23 March 2018
Os ácidos ribonucleicos (RNAs) podem ser classificados em duas classes principais: codificante e não codificante de proteína. Os codificantes, representados pelos RNAs mensageiros (mRNAs), possuem a informação necessária à síntese proteica. Já os RNAs não codificantes (ncRNAs) não são traduzidos em proteínas, mas estão envolvidos em várias atividades celulares distintas e associados a várias doenças tais como cardiopatias, câncer e desordens psiquiátricas. A descoberta de novos ncRNAs e seus papéis moleculares favorece avanços no conhecimento da biologia molecular e pode também impulsionar o desenvolvimento de novas terapias contra doenças. A identificação de ncRNAs é uma ativa área de pesquisa e um dos correntes métodos é a classificação de sequências transcritas utilizando sistemas de reconhecimento de padrões baseados em suas características. Muitos classificadores têm sido desenvolvidos com este propósito, especialmente nos últimos três anos. Um exemplo é o Coding Potential Calculator (CPC), baseado em Máquinas de Vetores de Suporte (SVM). No entanto, outros algoritmos robustos são também reconhecidos pelo seu potencial em tarefas de classificação, como por exemplo Random Forest (RF). O método mais utilizado para avaliação destas ferramentas tem sido a validação cruzada k-fold. Uma questão não considerada nessa forma de validação é a suposição de que as distribuições de frequências dentro do banco de dados, em termos das classes das sequências e outras variáveis, não se alteram ao longo do tempo. Caso essa premissa não seja verdadeira, métodos tradicionais como a validação cruzada e o hold-out podem subestimar os erros de classificação. Constata-se, portanto, a necessidade de um método de validação que leve em consideração a constante evolução dos bancos de dados ao longo do tempo, para proporcionar uma análise de desempenho mais realista destes classificadores. 
Neste trabalho comparamos dois métodos de avaliação de classificadores: hold-out temporal e hold-out tradicional (atemporal). Além disso, testamos novos modelos de classificação a partir da combinação de diferentes algoritmos de indução com características de classificadores do estado da arte e um novo conjunto de características. A partir dos testes das hipóteses, observamos que tanto a validação hold-out tradicional quanto a validação hold-out temporal tendem a subestimar os erros de classificação, que a avaliação por validação temporal é mais fidedigna, que classificadores treinados a partir de parâmetros calibrados por validação temporal não melhoram a classificação e que nosso modelo de classificação baseado em Random Forest e treinado com características de classificadores do estado da arte e mais um novo conjunto de características proporcionou uma melhora significativa na discriminação dos RNAs codificantes e não codificantes. Por fim, destacamos o potencial do algoritmo Random Forest e das características utilizadas, diante deste problema de classificação, e sugerimos o uso do método de validação hold-out temporal para a obtenção de estimativas de desempenho mais fidedignas para os classificadores de RNAs codificantes e não codificantes de proteína. / Ribonucleic acids (RNAs) can be divided into two main classes: protein-coding and non-coding. Coding RNAs, represented by messenger RNAs (mRNAs), carry the information needed for protein synthesis. Non-coding RNAs (ncRNAs) are not translated into proteins, but they are involved in several distinct cellular activities and are associated with various diseases, such as heart disease, cancer and psychiatric disorders. The discovery of new ncRNAs and their molecular roles advances the knowledge of molecular biology and may also boost the development of new therapies against diseases.
The identification of ncRNAs is an active area of research, and one current approach is to classify transcribed sequences using pattern recognition systems based on sequence features. Many classifiers have been developed for this purpose, especially in the last three years. One example is the Coding Potential Calculator (CPC), which is based on Support Vector Machines (SVM). However, other robust algorithms, such as Random Forest (RF), are also recognized for their potential in classification tasks. The most commonly used method for evaluating these tools has been k-fold cross-validation. This form of validation implicitly assumes that the frequency distributions within the database, in terms of sequence classes and other variables, do not change over time. If this assumption does not hold, traditional methods such as cross-validation and hold-out may underestimate classification errors. There is therefore a need for a validation method that takes into account the constant evolution of databases over time, in order to provide a more realistic performance analysis of these classifiers. In this work we compare two methods for evaluating classifiers: time hold-out and traditional hold-out (which does not consider time). In addition, we tested new classification models built by combining different induction algorithms with features from state-of-the-art classifiers and a new set of features.
From the hypothesis tests, we observe that both traditional hold-out validation and time hold-out validation tend to underestimate classification errors, that evaluation by time validation is more reliable, that classifiers trained with parameters calibrated by time validation did not improve classification, and that our Random Forest-based classification model, trained with features from state-of-the-art classifiers plus a new set of features, provided a significant improvement in discriminating coding from non-coding RNAs. Finally, we highlight the potential of the Random Forest algorithm and of the features used for this classification problem, and we suggest using the time hold-out validation method to obtain more reliable performance estimates for classifiers of protein-coding and non-coding RNAs.
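As a rough illustration of the two evaluation schemes compared in the abstract, the Python sketch below contrasts a traditional (random) hold-out split with a time hold-out split. It is only a sketch under stated assumptions: the record fields (`id`, `added`, `label`), the toy data, the cutoff date and both function names are hypothetical and are not taken from the thesis, whose actual data sets and splitting procedure are not described in this abstract.

```python
import random
from datetime import date


def traditional_holdout(records, test_fraction=0.3, seed=0):
    """Random (atemporal) split: test records may come from any time period."""
    shuffled = records[:]                      # copy so the input is not reordered
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def time_holdout(records, cutoff):
    """Time split: train on records added before `cutoff`, test on later ones."""
    train = [r for r in records if r["added"] < cutoff]
    test = [r for r in records if r["added"] >= cutoff]
    return train, test


# Hypothetical toy database: one transcript added per year, 2010 to 2019.
records = [
    {
        "id": f"tx{i}",
        "added": date(2010 + i, 1, 1),
        "label": "coding" if i % 2 == 0 else "noncoding",
    }
    for i in range(10)
]

train_t, test_t = time_holdout(records, cutoff=date(2017, 1, 1))
train_r, test_r = traditional_holdout(records, test_fraction=0.3)

print("time hold-out test ids:       ", [r["id"] for r in test_t])
print("traditional hold-out test ids:", [r["id"] for r in test_r])
```

In the time split, every test record is newer than every training record, which mimics applying a classifier to sequences deposited after it was trained. The random split mixes old and new records in both sets, so any drift in the database over time is hidden from the evaluation.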
|