  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Identification of common and rare features in structured RNAs in the Rfam database / Identification de caractéristiques communes et rares dans les ARN structurés dans la base de données Rfam

El Korbi, Amell 08 1900
Noncoding RNAs (ncRNAs) are RNA transcripts that are not translated into proteins, yet they play key and varied functional roles in the cell, including gene regulation, transcription and translation. Among the many categories of ncRNAs that have been discovered are the well-known ribosomal RNAs (rRNA), transfer RNAs (tRNA), snoRNAs and microRNAs (miRNA). The functions of ncRNAs are tightly linked to their structures, hence the importance of developing structure prediction tools and methods to search for new ncRNAs. Technological advances have made abundant RNA sequence information available in databases such as Rfam, which provides alignments and structural information for many ncRNA families. In this work, we retrieved from Rfam the sequences of all annotated secondary structure elements, such as hairpin loops, internal loops and bulges, in all ncRNA families. A local database, RNAstem, was created to facilitate the manipulation and compilation of data on secondary structure motifs. We analyzed all hairpin (terminal) loops, internal loops and bulges, and used the compiled occurrence counts of each loop or bulge to calculate a frequency score, intended as an indicator of the abundance of a specific secondary structure motif.
While minimizing the bias caused by the over-representation of some RNA classes, such as ribosomal RNAs, the frequency score allowed us to identify the rare motifs in each RNA category as well as the common ones. Our findings on abundant motifs confirm what is already known from previous studies (e.g. abundant GNRA or UNCG tetraloops). We also identified abundant motifs that had not been studied before, such as the UUUU tetraloop. We found very large gaps between the most abundant and the rarest RNA structural features. Moreover, we found that A and U nucleotides dominate single-stranded RNA regions, whether they are bulges or loops. Finally, we explored the possibility of using these scores to improve current ncRNA prediction tools by designing a filter for new candidates that would speed up the search for new ncRNAs. We developed a scoring system, RNAscore, that evaluates an RNA based on its motif content, and we tested its applicability with many different types of controls.
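The abstract above does not give the formula behind its frequency score, so the following Python is only a minimal sketch of one way such a score could be built. It assumes structures are available in dot-bracket notation, extracts hairpin loop sequences, and averages per-family relative frequencies so that a heavily populated family does not dominate. The function names `hairpin_loops` and `abundance_scores`, the family labels, and the toy sequences are all hypothetical and are not taken from the thesis or from RNAstem.

```python
from collections import Counter, defaultdict


def hairpin_loops(seq, structure):
    """Return the unpaired sequences closed by each innermost base pair.

    Assumes `structure` is dot-bracket notation without pseudoknots.
    """
    loops = []
    last_open = None  # index of the most recent '(' not yet followed by ')'
    for i, ch in enumerate(structure):
        if ch == "(":
            last_open = i
        elif ch == ")" and last_open is not None:
            # Only dots lie between last_open and i, so this is a hairpin loop.
            loops.append(seq[last_open + 1:i])
            last_open = None
    return loops


def abundance_scores(records):
    """Average per-family loop frequencies (an assumed normalisation).

    `records` is an iterable of (family, sequence, dot_bracket) tuples.
    """
    per_family = defaultdict(Counter)
    for family, seq, structure in records:
        per_family[family].update(hairpin_loops(seq, structure))

    totals = defaultdict(float)
    families_with_loops = 0
    for counts in per_family.values():
        total = sum(counts.values())
        if total == 0:
            continue
        families_with_loops += 1
        for loop, n in counts.items():
            totals[loop] += n / total  # relative frequency within the family
    return {loop: s / families_with_loops for loop, s in totals.items()}


# Toy data: family A is redundant (three identical members).
records = [
    ("fam_A", "GGGCGAAAGCCC", "((((....))))"),
    ("fam_A", "GGGCGAAAGCCC", "((((....))))"),
    ("fam_A", "GGGCGAAAGCCC", "((((....))))"),
    ("fam_B", "GGACUUCGGUCC", "((((....))))"),
    ("fam_B", "GGACUUUUGUCC", "((((....))))"),
]
scores = abundance_scores(records)
```

With this normalisation the three redundant GAAA loops in the hypothetical family A count once at the family level rather than three times, which is the kind of bias reduction the abstract describes in general terms.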
2

Function prediction of transcription start site associated RNAs (TSSaRNAs) in Halobacterium salinarum NRC-1 / Predição de função para TSSaRNAs (transcritos associados a sitios de início de transcrição) em Halobacterium salinarum NRC-1

Adam, Yagoub Ali Ibrahim 07 February 2019
Transcription start site associated non-coding RNAs (TSSaRNAs) have been predicted across the three domains of life. However, no reliable annotation effort has yet identified their biological functions or the underlying molecular machinery. This project therefore asks what potential functions TSSaRNAs have with respect to cellular functions. To answer this question, we aimed to accurately identify TSSaRNAs in the model organism Halobacterium salinarum NRC-1 (an archaeal microorganism) incubated under standard growth conditions. We then investigated the structural stability of TSSaRNAs in terms of thermodynamic energies, and we attempted to functionally annotate TSSaRNAs based on the Rfam functional classification of non-coding RNAs. Using a statistical approach, we developed an algorithm to predict TSSaRNAs from next-generation RNA sequencing (RNA-Seq) data. For structural annotation, we assessed the structural stability of TSSaRNAs by modeling their secondary structures through minimization of the thermodynamic free energy; minimum free energy structures are assumed to be biophysically stable. We simulated TSSaRNA tertiary structures under secondary-structure constraints using the Rosetta-Common RNA tool. To investigate higher-order (quaternary) structures, we studied the hybridization between TSSaRNAs and their cognate genes as part of a possible RNA-based regulation system. In addition, based on our hypothesis that TSSaRNAs may bind to proteins to trigger their function, we investigated the interaction between TSSaRNAs and the Lsm protein, which is known as a chaperone that mediates RNA function and is involved in RNA processing.
Our functional annotation pipeline aimed to classify TSSaRNAs into their corresponding Rfam families in one of two ways: either by querying TSSaRNA sequences against the covariance models of Rfam families, or by querying Rfam sequences against covariance models built from the consensus secondary structures of the TSSaRNAs. The prediction algorithm identified a total of 224 TSSaRNAs expressed on the same strand as the mRNAs and 58 TSSaRNAs expressed antisense to the mRNAs. The identified TSSaRNAs had a median length of 25 nucleotides. Regarding structural annotation, most TSSaRNAs possessed thermodynamically stable secondary structures, and their tertiary structures were capable of forming more complex structures by binding other biomolecules. Regarding higher-order structures, most TSSaRNAs (92.2%) were capable, at least in principle, of hybridizing to their cognate genes, and 55 TSSaRNAs showed putative interactions with the Lsm protein. Furthermore, computational docking experiments showed that the TSSaRNA-Lsm complexes were associated with favorable binding energies, with a median of -542900 kcal mol⁻¹. Regarding functional annotation, the largest share of TSSaRNAs (42.05%) were considered potential cis-acting regulators, such as cis-regulatory elements and sRNAs, but there are also potential trans-acting regulators of distant molecules, such as CRISPR and antisense RNA. Moreover, the results indicated that TSSaRNAs could potentially perform more complex functions, such as catalytic functions (e.g. riboswitches), or play a role in defense against viruses (e.g. CRISPR).
In conclusion, based on the results of this study, TSSaRNAs have several potential functions, which opens the way to experimental validation.
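The abstract does not say how hybridization between a TSSaRNA and its cognate gene was assessed. Purely as an illustration of the idea, the Python sketch below slides a short RNA antiparallel along a target sequence and counts Watson-Crick and G-U wobble pairs in the best ungapped window. This is a simplification chosen for brevity, not the thesis's method: real tools score duplexes thermodynamically and allow bulges and loops. The names `PAIRS` and `best_duplex` and the example sequences are hypothetical.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def best_duplex(short_rna, target):
    """Return (paired_count, offset) for the best ungapped antiparallel match.

    Both sequences are written 5'->3'. The short RNA is reversed so that
    position k of the reversed query faces target[offset + k].
    Returns (0, -1) if no window contains any pair.
    """
    query = short_rna[::-1]
    best = (0, -1)
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        paired = sum((q, t) in PAIRS for q, t in zip(query, window))
        if paired > best[0]:
            best = (paired, offset)
    return best


# Toy example: the short RNA is the reverse complement of target[3:9].
target = "AAGGCUACGAUU"
short_rna = "CGUAGC"
result = best_duplex(short_rna, target)
```

A count of paired positions is only a crude proxy for duplex stability; it is used here because it needs no energy parameters.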
