1 |
Development of bioinformatic tools for massive sequencing analysis
Furió Tarí, Pedro 19 October 2020
[EN] Transcriptomics is one of the most important and relevant areas of bioinformatics. It allows detecting the genes that are expressed at a particular moment in time to explore the relation between genotype and phenotype. Transcriptomic analysis was historically performed using microarrays until 2008, when high-throughput RNA sequencing (RNA-Seq) was launched on the market and began replacing the older technique. However, despite its clear advantages over microarrays, it was necessary to understand factors such as the quality of the data, the reproducibility and replicability of the analyses, and potential biases.
The first section of the thesis covers these studies. First, an R package called NOISeq was developed and published in the public repository "Bioconductor". It includes a set of tools to better understand the quality of RNA-Seq data and to minimise the impact of noise on subsequent analyses, and it implements two new methodologies (NOISeq and NOISeqBio) to overcome the difficulties of comparing two different groups of samples (differential expression). Second, I show our contribution to the Sequencing Quality Control (SEQC) project, a continuation of the Microarray Quality Control (MAQC) project led by the US Food and Drug Administration (FDA) that aims to assess the reproducibility and replicability of RNA-Seq analyses.
One of the most effective approaches to understand the different factors that influence the regulation of gene expression, such as the synergic effect of transcription factors, methylation events and chromatin accessibility, is the integration of transcriptomic with other omics data. To this aim, a file that contains the chromosomal position where the events take place is required. For this reason, in the second chapter, we present a new and easy to customise tool (RGmatch) to associate chromosomal positions to the exons, transcripts or genes that could regulate the events.
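The association step described above can be illustrated with a minimal Python sketch. It is not RGmatch's actual algorithm or interface: the `Gene` record, the `associate` function, and the 10 kb `max_distance` cut-off are hypothetical names and values chosen for illustration, and only gene-level association (not exons or transcripts) is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene annotation record (coordinates are inclusive)."""
    name: str
    chrom: str
    start: int
    end: int


def distance(region_start: int, region_end: int, gene: Gene) -> int:
    """Distance in bp between a region and a gene; 0 if they overlap."""
    if region_end < gene.start:
        return gene.start - region_end
    if region_start > gene.end:
        return region_start - gene.end
    return 0


def associate(chrom, start, end, genes, max_distance=10_000):
    """Return (gene name, distance) pairs for genes on the same chromosome
    lying within max_distance of the region, nearest first."""
    hits = [(distance(start, end, g), g.name) for g in genes if g.chrom == chrom]
    return [(name, d) for d, name in sorted(hits) if d <= max_distance]


# Toy annotation, invented for this example.
genes = [
    Gene("geneA", "chr1", 1_000, 5_000),
    Gene("geneB", "chr1", 20_000, 30_000),
    Gene("geneC", "chr2", 1_000, 2_000),
]

overlapping = associate("chr1", 4_500, 4_800, genes)    # region inside geneA
intergenic = associate("chr1", 12_000, 12_500, genes)   # region between geneA and geneB
```

A region inside a gene gets distance 0, while an intergenic region is reported against every gene within the cut-off, which is one simple way to list candidate regulated genes.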
Another aspect of great interest is the study of non-coding genes, especially long non-coding RNAs (lncRNAs). Not long ago, these regions were thought not to play a relevant role and were considered mere transcriptional noise. However, they represent a high percentage of human genes, and it was recently shown that they play an important role in gene regulation. For these reasons, in the last chapter we first seek a methodology to infer the general functions of each lncRNA using publicly available data and, second, develop a new tool (spongeScan) to predict the lncRNAs that could be involved in the sequestration of micro-RNAs (miRNAs), thereby altering their regulatory function. / [ES] La transcriptómica es una de las áreas más importantes y destacadas en bioinformática, ya que permite ver qué genes están expresados en un momento dado para poder explorar la relación existente entre genotipo y fenotipo. El análisis transcriptómico se ha realizado históricamente mediante el uso de microarrays hasta que, en el año 2008, la secuenciación masiva de ARN (RNA-Seq) fue lanzada al mercado y comenzó a desplazar poco a poco su uso. Sin embargo, a pesar de las ventajas evidentes frente a los microarrays, resultaba necesario entender factores como la calidad de los datos, reproducibilidad y replicabilidad de los análisis así como los potenciales sesgos.
La primera parte de la tesis aborda precisamente estos estudios. En primer lugar, se desarrolla un paquete de R llamado NOISeq, publicado en el repositorio público "Bioconductor", el cual incluye un conjunto de herramientas para entender la calidad de datos de RNA-Seq, herramientas de procesado para minimizar el impacto del ruido en posteriores análisis y dos nuevas metodologías (NOISeq y NOISeqBio) para abordar la problemática de la comparación entre dos grupos (expresión diferencial). Por otro lado, presento nuestra contribución al proyecto Sequencing Quality Control (SEQC), una continuación del proyecto Microarray Quality Control (MAQC) liderado por la US Food and Drug Administration (FDA) que pretende evaluar precisamente la reproducibilidad y replicabilidad de los análisis realizados sobre datos de RNA-Seq.
Una de las estrategias más efectivas para entender los diferentes factores que influyen en la regulación de la expresión génica, como puede ser el efecto sinérgico de los factores de transcripción, eventos de metilación y accesibilidad de la cromatina, es la integración de la transcriptómica con otros datos ómicos. Para ello se necesita generar un fichero que indique las posiciones cromosómicas donde se producen estos eventos. Por este motivo, en el segundo capítulo de la tesis presentamos una nueva herramienta (RGmatch) altamente customizable que permite asociar estas posiciones cromosómicas a los posibles genes, transcritos o exones a los que podría estar regulando cada uno de estos eventos.
Otro de los aspectos de gran interés en este campo es el estudio de los genes no codificantes, especialmente los ARN largos no codificantes (lncRNAs). Hasta no hace mucho, se pensaba que estos genes no jugaban ningún papel fundamental y se consideraban como simple ruido transcripcional. Sin embargo, suponen un alto porcentaje de los genes del ser humano y se ha demostrado que juegan un papel crucial en la regulación de otros genes. Por este motivo, en el último capítulo nos centramos, en un primer lugar, en intentar obtener una metodología que permita averiguar las funciones generales de cada lncRNA haciendo uso de datos ya publicados y, en segundo lugar, generamos una nueva herramienta (spongeScan) que permite predecir qué lncRNAs podrían estar secuestrando determinados micro-RNAs (miRNAs), alterando así la regulación llevada a cabo por estos últimos. / [CA] La transcriptòmica és una de les àrees més importants i destacades en bioinformàtica, ja que permet veure quins gens s'expressen en un moment donat per a poder explorar la relació existent entre genotip i fenotip. L'anàlisi transcriptòmic s'ha fet històricament per mitjà de l'ús de microarrays fins l'any 2008 quan la tècnica de seqüenciació massiva d'ARN (RNA-Seq) es va fer pública i va començar a desplaçar a poc a poc el seu ús. No obstant això, a pesar dels avantatges evidents enfront dels microarrays, resultava necessari entendre factors com la qualitat de les dades, reproducibilitat i replicabilitat dels anàlisis, així com els possibles caires introduïts. La primera part de la tesi aborda precisament estos estudis. 
En primer lloc, es va programar un paquet de R anomenat NOISeq publicat al repositori públic "Bioconductor", el qual inclou un conjunt d'eines per a entendre la qualitat de les dades de RNA-Seq, eines de processat per a minimitzar l'impact del soroll en anàlisis posteriors i dos noves metodologies (NOISeq i NOISeqBio) per a abordar la problemàtica de la comparació entre dos grups (expressió diferencial). D'altra banda, presente la nostra contribució al projecte Sequencing Quality Control (SEQC), una continuació del projecte Microarray Quality Control (MAQC) liderat per la US Food and Drug Administration (FDA) que pretén avaluar precisament la reproducibilitat i replicabilitat dels anàlisis realitzats sobre dades de RNA-Seq. Una de les estratègies més efectives per a entendre els diferents factors que influïxen a la regulació de l'expressió gènica, com pot ser l'efecte sinèrgic dels factors de transcripció, esdeveniments de metilació i accessibilitat de la cromatina, és la integració de la transcriptómica amb altres dades ómiques. Per això es necessita generar un fitxer que indique les posicions cromosòmiques on es produïxen aquests esdeveniments. Per aquest motiu, en el segon capítol de la tesi presentem una nova eina (RGmatch) altament customizable que permet associar aquestes posicions cromosòmiques als possibles gens, transcrits o exons als que podria estar regulant cada un d'aquests esdeveniments regulatoris. Altre dels aspectes de gran interés en aquest camp és l'estudi dels genes no codificants, especialment dels ARN llargs no codificants (lncRNAs). Fins no fa molt, encara es pensava que aquests gens no jugaven cap paper fonamental i es consideraven com a simple soroll transcripcional. No obstant això, suposen un alt percentatge dels gens de l'ésser humà i s'ha demostrat que juguen un paper crucial en la regulació d'altres gens. 
Per aquest motiu, en l'últim capítol ens centrem, en un primer lloc, en intentar obtenir una metodologia que permeta esbrinar les funcions generals de cada lncRNA fent ús de dades ja publicades i, en segon lloc, presentem una nova eina (spongeScan) que permet predeir quins lncRNAs podríen estar segrestant determinats micro-RNAs (miRNAs), alterant així la regulació duta a terme per aquests últims. / Furió Tarí, P. (2020). Development of bioinformatic tools for massive sequencing analysis [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/152485
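The miRNA-sequestration idea mentioned in this abstract can be illustrated with a short Python sketch based on canonical seed matching: a candidate binding site is a stretch of the lncRNA that is the reverse complement of miRNA nucleotides 2 to 8. This is an assumption made for illustration, not spongeScan's documented algorithm; the function names and both sequences are toy inputs.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, lncrna: str) -> list[int]:
    """0-based start positions in lncrna that pair with the miRNA seed
    (nucleotides 2-8 of the miRNA, i.e. mirna[1:8])."""
    site = reverse_complement(mirna[1:8])
    positions = []
    start = lncrna.find(site)
    while start != -1:
        positions.append(start)
        start = lncrna.find(site, start + 1)  # allow overlapping sites
    return positions


# Toy sequences, invented for this example.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_lncrna = "AAACUACCUCGGGCUACCUCAA"

sites = seed_sites(toy_mirna, toy_lncrna)
```

An lncRNA carrying several such sites for the same miRNA would be a more plausible sponge candidate than one carrying a single site, so a real tool would typically rank transcripts by site count or density.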
|
2 |
Regulation of drug metabolism and toxicity by multiple factors of genetics, epigenetics, lncRNAs, gut microbiota, and diseases: a meeting report of the 21st International Symposium on Microsomes and Drug Oxidations (MDO)
Yu, Ai-Ming, Ingelman-Sundberg, Magnus, Cherrington, Nathan J., Aleksunes, Lauren M., Zanger, Ulrich M., Xie, Wen, Jeong, Hyunyoung, Morgan, Edward M., Turnbaugh, Peter J., Klaassen, Curtis D., Bhatt, Aadra P., Redinbo, Matthew R., Hao, Pengying, Waxman, David J., Wang, Li, Zhong, Xiao-bo 03 1900
Variations in drug metabolism may alter drug efficacy and cause toxicity; a better understanding of the mechanisms and risks should help in practicing precision medicine. At the 21st International Symposium on Microsomes and Drug Oxidations, held in Davis, California, USA, on October 2-6, 2016, a number of speakers reported new findings and ongoing studies on the regulatory mechanisms behind variable drug metabolism and toxicity, and discussed potential implications for personalized medication. An insightful overview was provided of the genetic and epigenetic regulation of the expression of genes involved in drug absorption, distribution, metabolism, and excretion (ADME) and drug response. Altered drug metabolism and disposition, as well as the underlying molecular mechanisms, in diseased and special populations were presented. In addition, the roles of gut microbiota in drug metabolism and toxicology and of long non-coding RNAs in liver functions and diseases were discussed. These findings may offer new insights into ADME regulatory mechanisms and advance drug metabolism research. (C) 2017 Chinese Pharmaceutical Association and Institute of Materia Medica, Chinese Academy of Medical Sciences. Production and hosting by Elsevier B.V.
|
3 |
Control of cardiac remodelling during ageing and disease by epigenetic modifications and modifiers
Robinson, Emma January 2018
The mammalian heart is a remarkable organ in that it must provide for the cardiovascular needs of the organism throughout life, without pausing. Yet, through developmental growth to adulthood and into ageing, the mammalian heart undergoes extensive physiological, morphological and biochemical remodelling. Pivotal to the age-associated alterations in cardiac phenotype is a decline in the proliferative capacity of cardiac myocytes (CMs), which is insufficient to compensate for the basal rate of CM death over time. The terminally differentiated nature of adult CMs also underlies the inability of the heart to repair itself after myocardial damage, such as infarction. As a consequence, existing CMs mount a compensatory hypertrophic response to sustain cardiac output. In parallel, the proliferation rate of resident cardiac fibroblasts, which comprise approximately 60% of total cardiac cells, increases, replacing healthy myocardium with fibrotic scar tissue. Together, CM hypertrophy and fibroblast hyperplasia progressively reduce cardiac function and the ability of the heart to adapt to environmental stressors or damage. Under continued stress or through natural ageing, the heart progresses to a failing state in which cardiac output can no longer meet the demands of the body. The societal impact of ageing-associated decline in cardiac function is great, with heart failure affecting around 8% of over 65s and consuming approximately 2% of the NHS budget. These statistics are set to rise with an ageing population. The substantial phenotypic alterations characteristic of ageing and disease-associated cardiac remodelling require a wholesale reprogramming of the CM transcriptome. In many biological systems, although yet to be established in adult myocytes, epigenetic mechanisms underlie the transcriptome changes that arise.
I hypothesised that alterations in the epigenetic landscape of CMs mediate the transcriptome remodelling that determines the phenotypic transformations that occur in cardiac ageing, hypertrophy and disease. To test this hypothesis, I examined CM-specific changes in DNA cytosine modifications, long non-coding RNA (lncRNA) expression and histone tail lysine methylation marks – epigenetic marks with central roles in transcriptional regulation in many biological systems. I examined how these changes correlate with alterations in the CM transcriptome during disease and ageing. Understanding how alterations in the transcriptome and epigenome contribute to phenotypic changes using whole tissue data is confounded by the heterogeneous nature of the heart, coupled with ageing and disease-associated changes in relative cellular composition. To overcome this, I validated a method to isolate CM nuclei specifically from post-mortem heart tissue. This method also has the advantage that it could be applied to frozen tissue, allowing access to archived material. LncRNAs are functional RNA transcripts longer than 200 bases that are emerging as important regulators of gene expression. Common mechanisms of gene expression regulation by lncRNAs include antisense suppression and acting as guide or co-factor molecules that direct chromatin-modifying components or splicing factors to locations in the genome. Transcriptome profiling in healthy and failing human CMs identified an increase in expression of the lncRNA MALAT-1, which was consistently observed in rodent models of pathology and in ageing. Loss-of-function investigations revealed a potential anti-hypertrophic function for this lncRNA. Specifically, MALAT-1 knock down in vitro in CMs incited spontaneous hypertrophy with features reflecting pathological remodelling in the heart and hypertrophy induced by pro-hypertrophic mediators in vitro.
In addition, novel uncharacterised transcripts were identified as differentially expressed in cardiovascular disease, including a lncRNA at 4q35.2, which was found significantly downregulated in CMs from human failing hearts. DNA methylation is a stable epigenetic modification and is generally associated with transcriptional repression. It is established by de novo DNA methyltransferases (DNMTs) in early development to determine and maintain differentiated cell states and is ‘copied’ to daughter strands in DNA synthesis by the maintenance DNMT1. Methylcytosine (MeC) can be subject to further processing to hydroxymethylcytosine (hMeC) through a TET protein-mediated oxidation reaction. This serves as a means to actively remove methylation marks as well as hMeC being a novel epigenetic modification in its own right. For the first time, I identified the cardiac myocyte genome as having a high global level of hMeC, comparable with that in neurones. I also discovered an age-associated increase in gene body hMeC that coincided with the loss of proliferative capacity and plasticity of CMs. In parallel, gene body DNA MeC levels decrease in CM ageing. Both these phenomena in gene bodies corresponded with a non-canonical upregulation in expression of genes particularly relevant to cardiac function. This relationship between gene body methylation and transcription rate is strengthened with age in CMs. Recent work in the laboratory had identified the pervasive loss of euchromatic lysine 9 dimethylation on histone 3 (H3K9me2) as a conserved feature of pathological hypertrophy and associated with re-expression of foetal genes. Concurrently, expression and activity of the enzymes responsible for depositing H3K9me2, euchromatic histone lysine methyltransferases 1 and 2 (EHMT1/GLP and EHMT2/G9a) were reduced.
Consistently, microRNA-217-induced genetic or pharmacological inactivation of Ehmts was sufficient to promote pathological hypertrophy and foetal gene re-expression, while suppression of this pathway protected from pathological hypertrophy both in vitro and in mice. In summary, I provide new insight into CM-specific epigenetic changes and suggest the epigenome as an important mediator in the loss of plasticity and cardiac health in ageing and disease. The epigenetic mediators and pathways identified as responsible for this remodelling of the CM epigenome suggest opportunities for novel therapeutic approaches.
|
4 |
Caractérisation structurale et fonctionnelle de l’ARN long non codant MEG3 / Structure-functional studies on lncRNA MEG3
Uroda, Tina 09 May 2019
Les ARNs long non codants (ARNlnc) jouent un rôle clé dans les processus cellulaires vitaux, notamment le remodelage de la chromatine, la réparation de l'ADN et la traduction. Cependant, la taille et la complexité des ARNlnc présentent des défis sans précédent pour les études moléculaires mécanistiques, de sorte qu'il s'est avéré difficile jusqu'à présent de relier l'information structurelle à la fonction biologique pour les ARNlnc.Le gène 3 humain exprimé maternellement (de l’anglais "maternally expressed gene 3", MEG3), est un ARNlnc abondant, soumis à empreinte parentale et épissé alternativement. Pendant l'embryogenèse, MEG3 contrôle les protéines Polycomb, régulant la différenciation cellulaire, et dans les cellules adultes, MEG3 contrôle p53, régulant la réponse cellulaire aux stress environnementaux. Dans les cellules cancéreuses, MEG3 est régulé négativement, mais la surexpression ectopique de MEG3 réduit la prolifération incontrôlée, ce qui prouve que MEG3 agit comme un suppresseur de tumeur. Les données suggèrent que les fonctions de MEG3 pourraient être régulées par la structure de MEG3. Par exemple, on pense que MEG3 se lie directement aux protéines p53 et Polycomb. De plus, les différents variants d'épissage de MEG3, qui comprennent différents exons et possèdent ainsi des structures potentiellement différentes, présentent des fonctions différentes. Enfin, la mutagenèse par délétion, basée sur une structure de MEG3 prédit in silico, a permis d’identifier un motif MEG3 supposé structuré impliqué dans l'activation de p53. Cependant, au début de mes travaux, la structure expérimentale de MEG3 était inconnue.Pour comprendre la structure et la fonction de MEG3, j'ai utilisé des sondes chimiques in vitro et in vivo pour déterminer la structure secondaire de deux variants humains de MEG3 qui diffèrent par leurs niveaux d'activation de p53. 
À l'aide d'essais fonctionnels dans les cellules et de mutagenèse, j'ai systématiquement analysé la structure de MEG3 et identifié le noyau activant p53 dans deux domaines (D2 et D3) qui sont conservés structuralement dans les variants humains et conservés dans l’évolution chez les mammifères. Dans D2-D3, les régions structurales les plus importantes sont les hélices H11 et H27, car dans ces régions, j’ai pu supprimer l'activation de p53 grâce à des mutations ponctuelles, un degré de précision jamais atteint pour les autres ARNlnc jusqu’ici. J'ai découvert de manière surprenante que H11 et H27 sont reliés par des boucles connectées l’une à l’autre (de l’anglais "kissing loops") et j'ai confirmé l'importance fonctionnelle de ces interactions de structure tertiaire à longue distance par mutagenèse compensatoire. Allant au-delà de l’état de l’art, j'ai donc essayé de visualiser la structure 3D d’une isoforme de MEG3 longue de 1595 nucléotides, par diffusion de rayons X à petit angle (SAXS), microscopie électronique (EM) et microscopie à force atomique (AFM). Alors que le SAXS et l’EM sont limités par des défis techniques actuellement insurmontables, l’imagerie par AFM m’a permis d’obtenir la première structure 3D à basse résolution de MEG3 et de révéler son échafaudage tertiaire compact et globulaire. Plus remarquable encore, les mêmes mutations qui perturbent la connexion entre les «boucles» H11-H27 et qui inhibent la fonction de MEG3, perturbent aussi la structure 3D de cet ARNlnc, fournissant ainsi le premier lien direct entre la structure 3D et la fonction biologique pour un ARNlnc.Sur la base de mes découvertes, je peux donc proposer un mécanisme de l’activation de p53 basé sur la structure de MEG3, avec des implications importantes pour la compréhension de la cancérogenèse. 
Plus généralement, mes travaux prouvent que les relations structure-fonction des ARNlnc peuvent être disséquées avec une grande précision et ouvrent la voie à des études analogues visant à obtenir des informations mécanistes pour de nombreux autres ARNlnc d’importance médicale. / Long non-coding RNAs (lncRNAs) are key players in vital cellular processes, including chromatin remodelling, DNA repair and translation. However, the size and complexity of lncRNAs present unprecedented challenges for mechanistic molecular studies, so that connecting structural information with biological function for lncRNAs has proven difficult so far. Human maternally expressed gene 3 (MEG3) is an abundant, imprinted, alternatively-spliced lncRNA. During embryogenesis MEG3 controls Polycomb proteins, regulating cell differentiation, and in adult cells MEG3 controls p53, regulating the cellular response to environmental stresses. In cancerous cells, MEG3 is downregulated, but ectopic overexpression of MEG3 reduces uncontrolled proliferation, proving that MEG3 acts as a tumour suppressor. Evidence suggests that MEG3 functions may be regulated by the MEG3 structure. For instance, MEG3 is thought to bind p53 and Polycomb proteins directly. Moreover, different MEG3 splice variants, which comprise different exons and thus possess potentially different structures, display different functions. Finally, deletion mutagenesis based on a MEG3 structure predicted in silico identified a putatively-structured MEG3 motif involved in p53 activation. However, at the beginning of my work, the experimental structure of MEG3 was unknown. To understand the MEG3 structure and function, I used chemical probing in vitro and in vivo to determine the secondary structure maps of two human MEG3 variants that differ in their p53 activation levels.
Using functional assays in cells and mutagenesis, I systematically scanned the MEG3 structure and identified the p53-activating core in two domains (D2 and D3) that are structurally conserved across human variants and evolutionarily conserved across mammals. In D2-D3, the most important structural regions are helices H11 and H27, because in these regions I could tune p53 activation even by point mutations, a degree of precision never achieved for any other lncRNA to date. I surprisingly discovered that H11 and H27 are connected by “kissing loops”, and I confirmed the functional importance of these long-range tertiary structure interactions by compensatory mutagenesis. Going beyond state-of-the-art, I thus attempted to visualize the 3D structure of a 1595-nucleotide long MEG3 isoform by small angle X-ray scattering (SAXS), electron microscopy (EM), and atomic force microscopy (AFM). While SAXS and EM are limited by currently-insurmountable technical challenges, single particle imaging by AFM allowed me to obtain the first low resolution 3D structure of MEG3 and reveal its compact, globular tertiary scaffold. Most remarkably, functionally-disrupting mutations that break the H11-H27 “kissing loops” disrupt such MEG3 scaffold, providing the first direct connection between 3D structure and biological function for an lncRNA. Based on my discoveries, I can therefore propose a structure-based mechanism for p53 activation by human MEG3, with important implications in understanding carcinogenesis. More broadly, my work serves as proof-of-concept that lncRNA structure-function relationships can be dissected with high precision and opens the field to analogous studies aimed to gain mechanistic insights into many other medically-relevant lncRNAs.
|
5 |
Busca e análise de lncRNA (long non-coding RNAs) importantes para a tolerância ao etanol em Saccharomyces cerevisiae
Marques, Lucas Farinazzo January 2019
Orientador: Guilherme Targino Valente / Resumo: A levedura Saccharomyces cerevisiae é o microrganismo mais utilizado para a produção de etanol devido a sua alta capacidade fermentativa e resistência aos estresses oriundos desse processo. Entretanto, a própria concentração de etanol é um dos fatores mais limitantes no processo de produção desse combustível. Os aspectos da genômica funcional relacionada à tolerância ao etanol são ainda pouco esclarecidos, e nem mesmo se sabe se os lncRNAs tem papel nesse processo. Poucos lncRNAs foram identificados em S. cerevisiae, e nem mesmo se conhece as redes lncRNAs-proteínas nessa espécie e nem se podem codificar micropeptídeos. Nesse contexto, este trabalho visa identificar lncRNAs em linhagens de S. cerevisiae com diferentes níveis de tolerância ao etanol. Para isso, foi realizado a montagem dos lncRNAs, predição de ligações lncRNA-proteínas, buscas de micropepetídeos, análises de conservação genômica, estrutural e funcional dos lncRNAs, avaliação da influência do lncRNAs em regular as expressões de seus vizinhos e comparação dos resultados entre linhagens mais e menos tolerantes ao etanol. As análises de enriquecimento ontológico apontam para uma relação próxima entre os lncRNAs e a tolerância ao etanol e uma conservação funcional, embora os dados não reportem nenhuma conservação nem genômica nem estrutural. Além disso, variados tipos de prováveis regulações foram sugeridas, sendo a regulação em trans majoritariamente inversa entre os lncRNAs e seus genes-alvo, diferentemente da ma... (Resumo completo, clicar acesso eletrônico abaixo) / Abstract: The yeast Saccharomyces cerevisiae is the most used microorganism for ethanol production due to its high fermentative capacity and resistance to different stressors along this process. However, the ethanol concentration is one of the most limiting factors of fuel production. 
The functional genomics aspects related to ethanol tolerance are still unclear, and it is not known whether lncRNAs have a role in this process. Few lncRNAs have been identified in S. cerevisiae; the lncRNA-protein networks of this species are still unknown, as is whether lncRNAs can encode micropeptides. In this context, this thesis aims to identify lncRNAs and evaluate their roles in S. cerevisiae ethanol tolerance. To this end, the work comprised the assembly of lncRNAs, prediction of lncRNA-protein interactions, searches for lncRNAs potentially encoding micropeptides, analyses of the genomic, structural and functional conservation of lncRNAs, evaluation of the influence of lncRNAs on the expression of their neighbors, and a comparison between strains that are more and less tolerant to ethanol. Moreover, many putative regulatory pathways are suggested here; most trans regulation is inverse between the expression of the lncRNAs and their target genes, unlike what is observed for most cis regulation. The current literature supports the functional conservation of lncRNAs observed here and the role of these non-coding molecules as regulators. Finally, here we suggest that lncRNAs are acting to ... (Complete abstract click electronic access below) / Mestre
|
6 |
Novel insights into the function and regulation of coding and long non-coding RNAs
de Bony, Eric James 15 March 2018
The central dogma of biology rests on the production of proteins from our DNA. DNA is first transcribed into RNA, which is then translated into protein. The "executive power" of the cell therefore resides in proteins, which explains why they have become the focus of research. RNA, for its part, has long been regarded as an intermediate molecule whose sole purpose is to transfer information between DNA and proteins. In recent years, however, technological advances have revealed that a major part of our genome, our DNA, is transcribed into so-called "non-coding" RNAs that do not give rise to a protein. These RNAs are involved in numerous cellular processes and thereby contribute to disease. In addition, new technologies have led to the observation that the metabolism of RNAs, coding or not, is the target of new regulatory mechanisms: chemical modifications of ribonucleosides. Taken together, these discoveries call for a revision of the role of RNAs in cellular processes. In this thesis, we therefore sought to better understand the function and regulation of RNA molecules in order to reveal the more central role they play in cellular processes and, in particular, in carcinogenesis. To this end, the thesis comprises two parts. The first describes how certain RNAs, called "long non-coding RNAs", contribute to the development and heterogeneity of colorectal cancer. These RNAs exert "executive" functions without being the source of a protein. We identified 282 long non-coding RNAs whose expression profiles reflect the different characteristics encountered across the subtypes of colorectal tumours.
Moreover, our computational analyses indicated that these RNAs are an integral part of the most important signalling networks, and those most often deregulated, in the different subtypes of this cancer. Finally, through in vitro experiments, we support the validity of our computational analyses by confirming the role of lncBLID-5, a long non-coding RNA, in regulating the cell cycle and the epithelial-to-mesenchymal transition, a cellular process of great importance in colorectal cancers. In the second part, we studied the methylation of RNA cytosines, a very recently identified modification. We discovered that the protein SRSF2, a general RNA splicing factor, can bind methylated cytosines, and does so more strongly than unmethylated cytosines. Finally, we show that the P95H mutation of SRSF2, which is very frequent in leukaemia patients, prevents SRSF2 from preferentially binding methylated cytosines, pointing to new explanations for the defective splicing that leads to this type of cancer. In conclusion, our work provides new information on the involvement and regulation of coding and non-coding RNAs in cancer. These results should lead us to reconsider the role of RNA in healthy as well as pathological cellular processes, opening the door to a new dimension of diagnostic and therapeutic targets. / Doctorate in Biomedical and Pharmaceutical Sciences (Medicine) / info:eu-repo/semantics/nonPublished
|
7 |
Dans les abysses du transcriptome : découverte de nouveaux biomarqueurs de cellules souches mésenchymateuses par analyse approfondie du RNAseq / In the abyss of the transcriptome : discovery of new biomarkers of mesenchymal stem cells by in-depth analysis of RNAseq
Riquier, Sébastien 04 February 2019
Le développement du séquençage ARN, ou RNAseq, a permis l'essor de la recherche intensive de biomarqueurs dans de nombreux domaines de la biologie. L’information complète du transcriptome contenue dans les données de sortie permet à un bioinformaticien assidu de dépasser les connaissances actuelles et d’accéder, grâce à des pipelines informatiques avancés, à d’innombrables signatures d’intérêt inédites. Dans cette thèse nous mettons en avant que ces marqueurs potentiels, essentiellement explorés pour répondre à des problématiques cliniques en conditions pathologiques, peuvent être utilisés pour affiner la caractérisation de types de cellules sans marqueurs strictement spécifiques. Nous nous sommes intéressés aux cellules souches mésenchymateuses (MSCs), un type de cellules souches adultes multipotentes, fortement utilisées en clinique mais ne possédant pas de marqueurs positifs strictement spécifiques. Notre étude se concentre sur la recherche des ARN longs non-codants non annotés. Ces ARNs, aussi nommés "lncRNA", constituent une classe émergente de transcrits encore peu explorée à ce jour. De plus, cette catégorie démontre une spécificité conditionnelle et tissulaire élevée. Nous avons élaboré un pipeline d’analyse RNAseq optimisé pour la reconstruction et la quantification de lncRNAs non annotés. En utilisant les données publiques de RNAseq, venant de différentes sources de MSCs et d'autres types de cellules, nous avons identifié de nouveaux lncRNA non annotés exprimés spécifiquement dans les MSCs. Nous avons développé pour ce projet Kmerator.jl, un outil qui permet de décomposer un transcrit en sous-séquences spécifiques (k-mers) afin de chercher et quantifier plus rapidement la signature de nos candidats dans un grand nombre de données RNAseq. 
Kmerator a également été utilisé dans d'autres applications pour tester la qualité des données RNA-seq disponibles en accès public. Après validation de ces nouveaux biomarqueurs de MSCs par qPCR, nous avons eu recours à plusieurs outils informatiques pour prédire leurs fonctions potentielles. Enfin, nous avons analysé des données RNAseq « single-cell » pour aborder l’hétérogénéité d’expression au sein des populations MSCs. / The development of RNA sequencing, or RNAseq, has opened the way to intensive biomarker research in many areas of biology. The complete transcriptome information contained in the output data allows a bioinformatician to go beyond current knowledge and, with advanced computational pipelines, to access countless novel signatures of interest. In this thesis, we show that these potential markers, classically used to address clinical questions in pathological conditions, can be used to refine the characterization of cell types that lack strictly specific markers. We studied mesenchymal stem cells (MSCs), a type of adult multipotent stem cell widely used in the clinic but lacking strictly specific positive markers. Our study focuses on the search for unannotated long non-coding RNAs. These RNAs, also called "lncRNAs", constitute an emerging class of transcripts that remains little explored. In addition, this category shows high condition and tissue specificity. We developed an RNAseq pipeline optimized for the reconstruction and quantification of unannotated lncRNAs. Using public RNAseq data from different sources of MSCs and other cell types, we identified new unannotated lncRNAs specifically expressed in MSCs. For this project, we developed Kmerator.jl, a bioinformatics tool that decomposes a transcript into k-mers and selects the specific sub-sequences, in order to search for and quantify the signature of our candidates more quickly in a large number of RNAseq datasets. Kmerator was also used in other applications to test the quality of publicly available RNA-seq data. 
After validation of these new biomarkers of MSCs by qPCR, we used several computer tools to predict their potential functions. Finally, we analyzed single-cell RNAseq data to address the heterogeneity of expression within MSC populations.
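The k-mer idea described in this abstract can be illustrated with a minimal Python sketch. It is not Kmerator.jl's code or interface: the function names `kmers`, `specific_kmers` and `count_signature` are hypothetical, the background is a plain list of sequences rather than a genome or transcriptome index, and counting is done by naive scanning of reads.

```python
def kmers(sequence, k):
    """Return every overlapping substring of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def specific_kmers(target, background, k):
    """Keep the k-mers of `target` that occur in no background sequence.

    `background` stands in for a reference set of other transcripts
    (an assumption made for this illustration).
    """
    seen = set()
    for seq in background:
        seen.update(kmers(seq, k))
    return [km for km in kmers(target, k) if km not in seen]


def count_signature(reads, signature):
    """Count k-mer occurrences in `reads` that belong to the signature."""
    wanted = set(signature)
    if not wanted:
        return 0
    k = len(next(iter(wanted)))
    return sum(1 for read in reads for km in kmers(read, k) if km in wanted)


if __name__ == "__main__":
    candidate = "ACGTACGGA"       # toy candidate transcript
    others = ["ACGTAC"]           # toy background transcripts
    signature = specific_kmers(candidate, others, k=4)
    print(signature)
    print(count_signature(["TTACGGAT"], signature))
```

Restricting the search to k-mers absent from the background is what makes a hit in a read attributable to the candidate transcript rather than to a sequence shared with other transcripts.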
|
8 |
The evolution, modifications and interactions of proteins and RNAs
Surappa-Narayanappa, Ananth Prakash January 2017
Proteins and RNAs are two of the most versatile macromolecules and carry out almost all functions within living organisms. In this thesis I have explored evolutionary and regulatory aspects of proteins and RNAs by studying their structures, modifications and interactions. In the first chapter of my thesis I investigate domain atrophy, a term I coined to describe large-scale deletions of core structural elements within protein domains. By looking into truncated domain boundaries across several domain families using Pfam, I was able to identify rare cases of domains that showed atrophy. Given that even point mutations can be deleterious, it is surprising that proteins can tolerate such large-scale deletions. Some of the structures of atrophied domains show novel protein-protein interaction interfaces that appear to compensate for the deletions and stabilise their folds. Protein-protein interactions are largely governed by surface and charge complementarity, while RNA-RNA interactions are governed by base-pair complementarity; the two interaction types are inherently different, and these differences might be observed in their interaction networks. Based on this hypothesis I have explored the protein-protein, RNA-protein and RNA-RNA interaction networks of yeast in the second chapter. By analysing the three networks I found no major differences in their network properties, which indicates an underlying uniformity in their interactomes despite their individual differences. In the third chapter I focus on RNA-protein interactions by investigating post-translational modifications (PTMs) in RNA-binding proteins (RBPs). By comparing occurrences of PTMs, I observe that RBPs undergo significantly more PTMs than non-RBPs. I also found that within RBPs, PTMs are more frequently targeted at regions that directly interact with RNA than at regions that do not. 
Moreover, disorder and amino acid composition were not observed to significantly influence the differential PTMs observed between RBPs and non-RBPs. The results point to a direct regulatory role of PTMs in RNA-protein interactions of RBPs. In the last chapter, I explore regulatory RNA-RNA interactions. Using differential expression data of mRNAs and lncRNAs from mouse models of hereditary hemochromatosis, I investigated competing regulatory interactions between mRNA, lncRNA and miRNA. A mutual interaction network was created from the predicted miRNA interaction sites on mRNAs and lncRNAs to identify regulatory RNAs in the disease. I also observed interesting relations between the sense-antisense mRNA-lncRNA pairs that indicate mutual regulation of expression levels through an as yet unknown mechanism.
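One way to picture a network built from predicted miRNA interaction sites is to link two transcripts whenever they share at least one predicted miRNA. The sketch below is only an illustration under that assumption; the abstract does not specify how the network was constructed, and the function name `shared_mirna_network`, the input layout and the toy identifiers are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_mirna_network(sites):
    """Link transcripts that share predicted miRNA binding sites.

    `sites` maps a transcript name (mRNA or lncRNA) to the set of miRNAs
    predicted to bind it. The result maps each transcript pair to the
    miRNAs the two have in common.
    """
    by_mirna = defaultdict(set)
    for transcript, mirnas in sites.items():
        for mirna in mirnas:
            by_mirna[mirna].add(transcript)

    edges = defaultdict(set)
    for mirna, transcripts in by_mirna.items():
        # Sorting gives each pair a single canonical orientation.
        for a, b in combinations(sorted(transcripts), 2):
            edges[(a, b)].add(mirna)
    return dict(edges)


if __name__ == "__main__":
    toy_sites = {
        "mRNA_A": {"miR-1", "miR-2"},
        "lncRNA_X": {"miR-2"},
        "mRNA_B": {"miR-3"},
    }
    print(shared_mirna_network(toy_sites))
```

Transcripts joined by an edge are candidates for competing for the same miRNA; in practice the edges would be filtered using the differential expression data before any interpretation.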
|
9 |
Régulation de l'expression du gène Igf2 : nouveaux promoteurs et implication de longs ARN non-codants / Regulation of Igf2 gene expression: novel promoters and involvement of long non-coding RNAs
Tran, Van Giang 30 June 2014
L'expression du gène Igf2, qui est soumis à l'empreinte génomique parentale chez les mammifères, est hautement régulée au cours du développement embryonnaire et de la période périnatale grâce à divers mécanismes transcriptionnels et post-transcriptionnels. Ces mécanismes mettent à contribution de longs ARN non codants produits au sein même du locus, dont le plus connu est l'ARN H19. En utilisant une approche de complémentation génétique par un transgène H19 dans des myoblastes H19 KO de souris, nous démontrons l'existence de plusieurs nouveaux promoteurs d'Igf2. L'un de ces promoteurs, qui est conservé chez l'homme, peut être activé par un ARN ectopique antisens d'H19 (lncARN 91H) en dépit d'une méthylation complète de la région de contrôle empreinte située en cis sur le même allèle. Nous montrons également que les lncARN 91H présentent une certaine spécificité tissulaire et que leur transcription peut être initiée à partir des séquences conservées CS4, CS5 et CS9 situées en aval du gène H19. Quant à l'ARN H19, qui est l'ARN non codant majeur du locus, il semble pouvoir réguler ses transcrits antisens dans les myoblastes H19 KO complémentés par le transgène H19, mais surtout il participe activement à la régulation post-transcriptionnelle du gène Igf2 chez la souris. Nous observons en effet qu'il favorise la coupure endoribonucléolytique de l'ARN Igf2 par un mécanisme qui reste à découvrir. Enfin, nous mettons en évidence l'existence d'un arrêt de l'élongation de la transcription du gène Igf2, pour lequel nous proposons un modèle de régulation faisant intervenir un autre long ARN non codant du locus : le lncARN PIHit. Au-delà des mécanismes qui restent à explorer, nos résultats renforcent l'idée que la structure tridimensionnelle de la chromatine participe à la régulation de l'expression des gènes chez les mammifères. 
/ In mammals, the expression of the Igf2 gene, which is subject to parental genomic imprinting, is tightly regulated during embryonic development and the perinatal period through several transcriptional and post-transcriptional mechanisms. These mechanisms involve long non-coding RNAs (lncRNAs) produced within the locus, the best known of which is the H19 RNA. Using a genetic complementation assay in which an H19 transgene is transfected into mouse H19 KO myoblasts, we discovered several novel Igf2 promoters. One of these promoters, which is conserved in humans, can be activated by ectopic H19 antisense RNAs (91H lncRNAs) despite complete methylation of the Imprinting-Control Region located in cis on the same allele. We also show that the 91H lncRNAs display some tissue specificity and that their transcription can be initiated from the CS4, CS5 and CS9 conserved sequences located downstream of the H19 gene. The H19 RNA, which is the major lncRNA of the locus, appears to regulate its antisense transcripts in H19 KO myoblasts complemented with the H19 transgene, but its main function seems to be the post-transcriptional regulation of Igf2 gene expression in the mouse. Indeed, we have observed that it favours the endoribonucleolytic cleavage of the Igf2 messenger RNAs through a mechanism that remains to be elucidated. Finally, we reveal the existence of a premature stop in the transcriptional elongation of the Igf2 gene, for which we propose a regulation model involving another lncRNA of the locus: the PIHit lncRNA. Beyond the mechanisms that remain to be explored, our results strengthen the idea that, in mammals, the three-dimensional organization of chromatin is involved in regulating gene expression.
|
10 |
Identificação de RNAs longos não-codificadores de proteínas regulados por micro-RNAs / Identification of long non-protein coding RNAs regulated by micro-RNAs
Amaral, Murilo Sena 18 December 2013
Estudos recentes têm revelado que a maior parte dos transcritos gerados em células humanas é composta por RNAs não-codificadores de proteínas (ncRNAs). Uma parte desses ncRNAs compreende a classe de RNAs curtos, que possuem menos que 200 nucleotídeos. Os micro-RNAs (miRNAs) fazem parte dessa classe e têm sido alvo de grande interesse, pois são preditos como possíveis reguladores de mais de 60% dos RNAs mensageiros (mRNAs) humanos. Outra classe dos ncRNAs é composta por ncRNAs longos (lncRNAs, com mais de 200 nucleotídeos), que são transcritos a partir de regiões intergênicas e intrônicas do genoma humano e possuem várias funções, muitas delas relacionadas ao controle da expressão de mRNAs. Recentemente, os lncRNAs têm sido caracterizados quanto à sua estrutura e função. No entanto, muito pouco se sabe sobre os mecanismos pelos quais os lncRNAs são regulados. Este trabalho teve como objetivo avaliar se lncRNAs são regulados por miRNAs em células humanas. Para tanto, identificamos lncRNAs ligados ao complexo de silenciamento induzido por RNA (RISC) em células da linhagem HeLa, utilizando um método aqui desenvolvido de geração de bibliotecas de cDNA direcionadas para sequenciamento em larga escala na plataforma 454/Roche. Em paralelo, sequenciamos os miRNAs ligados ao RISC nestas mesmas células. Os resultados obtidos mostram que centenas de lncRNAs de diversas classes se ligam ao RISC em células HeLa, juntamente com milhares de mRNAs e várias centenas de miRNAs. Entre os miRNAs, encontramos 37 que são preditos como alvejando os lncRNAs detectados. Estes miRNAs constituem possíveis reguladores dos lncRNAs e, portanto, nosso trabalho estabelece um mapa experimental de interações diretas entre lncRNAs e miRNAs. Dentre os lncRNAs identificados ligados ao RISC neste trabalho, destaca-se o TUG1, lincRNA sabidamente envolvido na regulação de genes relacionados à apoptose e ao ciclo celular. 
Mostramos por ensaio de super-expressão de miRNAs e qPCR que TUG1 é regulado pelo miRNA-148b, um dos miRNAs por nós detectados que possui um sítio alvo altamente conservado em mamíferos localizado na extremidade 3' de TUG1. Em conjunto, este trabalho contribui para o entendimento da regulação dos níveis de expressão de lncRNAs em células humanas e abre perspectivas para a modulação de miRNAs como estratégia de regulação dos níveis e das funções de lncRNAs. / Recent studies have revealed that the largest fraction of the transcripts generated in human cells is composed of non-protein coding RNAs (ncRNAs). A portion of these RNAs encompasses the class of short RNAs, which are less than 200 nucleotides in length. Micro-RNAs (miRNAs) are part of this class and are of great interest, as they are predicted to target over 60% of human messenger RNAs (mRNAs). Another class of ncRNAs is composed of long ncRNAs (lncRNAs, longer than 200 nucleotides), which are transcribed from intergenic and intronic regions of the human genome and have several functions, many of them related to the control of mRNA expression. Recently, the structure and function of lncRNAs have been characterized. However, very little is known about the mechanisms involved in lncRNA regulation. This work aimed to evaluate whether lncRNAs are regulated by miRNAs in human cells. For this purpose, we identified lncRNAs bound to the RNA-induced silencing complex (RISC) in HeLa cells using a method developed here for the generation of strand-specific cDNA libraries for large-scale RNA sequencing on the 454/Roche platform. In parallel, we sequenced the miRNAs bound to RISC in these cells. Our results show that hundreds of lncRNAs from diverse classes are bound to RISC in HeLa cells, along with thousands of mRNAs and several hundred miRNAs. Among the miRNAs, we identified 37 that are predicted to target the detected lncRNAs. 
These miRNAs are possible regulators of the lncRNAs, and therefore our work establishes an experimental map of direct interactions between lncRNAs and miRNAs. TUG1, a lincRNA known to be involved in the regulation of genes related to apoptosis and the cell cycle, was identified among the lncRNAs bound to RISC. We showed by miRNA over-expression and qPCR that TUG1 is regulated by miRNA-148b, one of the miRNAs detected in our sequencing data, which has a binding site at the 3' end of TUG1 that is highly conserved in mammals. Taken together, our results contribute to the understanding of the regulation of lncRNA expression levels in human cells and open perspectives for the modulation of miRNAs as a strategy to regulate the levels and functions of lncRNAs.
|