Global ETD Search

11	Discovery of the role of protein-RNA interactions in protein multifunctionality and cellular complexity / Découverte du rôle des interactions protéine-ARN dans la multifonctionnalité des protéines et la complexité cellulaire Ribeiro, Diogo 05 December 2018 Over time, life has evolved to produce remarkably complex organisms. To cope with this complexity, organisms have evolved a plethora of regulatory mechanisms. For instance, thousands of long non-coding RNAs (lncRNAs) are transcribed by mammalian genomes, presumably expanding their regulatory capacity. An emerging concept is that lncRNAs can serve as protein scaffolds, bringing proteins into proximity, but the prevalence of this mechanism is yet to be demonstrated. In addition, for every messenger RNA encoding a protein, regulatory 3’ untranslated regions (3’UTRs) are also present. Recently, 3’UTRs were shown to form protein complexes during translation, affecting the function of the protein under synthesis. However, the extent and importance of these 3’UTR-protein complexes in cells remain to be assessed. This thesis aims to systematically discover and provide insights into two poorly understood regulatory mechanisms involving the non-coding portion of the human transcriptome. Concretely, the assembly of protein complexes promoted by lncRNAs and 3’UTRs is investigated using large-scale datasets of protein-protein and protein-RNA interactions. This made it possible to (i) predict hundreds of lncRNAs as possible scaffolding molecules for more than half of the known protein complexes, and (ii) infer more than a thousand distinct 3’UTR-protein complexes, including cases likely to post-translationally regulate moonlighting proteins, that is, proteins that perform multiple unrelated functions. These results indicate that a high proportion of lncRNAs and 3’UTRs may be employed in regulating protein function, potentially playing a role both as regulators and as components of complexity. Interaction networks Protein-RNA interactions Protein-Protein interactions LncRNA function 3’UTR function Protein multifunctionality 570
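As a rough illustration of how scaffold candidates might be screened from the two kinds of interaction data mentioned in the abstract above, the following sketch flags an lncRNA when it binds several members of the same protein complex and scores the overlap with a hypergeometric tail probability. It is not the method used in the thesis: the function names, the `min_members` threshold, the background size, and all protein, complex, and RNA identifiers are hypothetical toy values.

```python
from math import comb


def hypergeom_tail(k, M, n, N):
    """P(X >= k) when N proteins are drawn from M, of which n belong to the complex."""
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1)) / total


def scaffold_candidates(complexes, rna_targets, background_size, min_members=2):
    """List (rna, complex, shared proteins, p-value) for RNAs binding >= min_members of a complex."""
    hits = []
    for rna, bound in rna_targets.items():
        for name, members in complexes.items():
            shared = bound & members
            if len(shared) >= min_members:
                p = hypergeom_tail(len(shared), background_size, len(members), len(bound))
                hits.append((rna, name, sorted(shared), p))
    # Smallest p-value first: the least likely overlaps under random binding.
    return sorted(hits, key=lambda h: h[3])


# Toy data (hypothetical identifiers, not taken from the thesis).
complexes = {"cplxA": {"P1", "P2", "P3"}, "cplxB": {"P4", "P5"}}
rna_targets = {"lnc1": {"P1", "P2", "P9"}, "lnc2": {"P4"}}
hits = scaffold_candidates(complexes, rna_targets, background_size=20)
for rna, name, shared, p in hits:
    print(rna, name, shared, round(p, 4))
```

In a real analysis the p-values would need multiple-testing correction, and the background would be the set of proteins with measured RNA interactions rather than an arbitrary number.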
12	Exploring Protein-Nucleic Acid Interactions Using Graph And Network Approaches Sathyapriya, R 03 1900 The flow of genetic information from genes to proteins is mediated by proteins that interact with nucleic acids at several stages to transmit the information from the nucleus to the cell cytoplasm. Unlike protein-protein interactions, the principles behind protein-nucleic acid interactions are still not very well understood (Pabo and Nekludova, 2000), and efforts are still underway to arrive at the basic principles behind the specific recognition of nucleic acids by proteins (Prabakaran et al., 2006). This is mainly due to the innate complexity of nucleotide recognition by proteins: even within a given family of DNA binding proteins, different modes of binding and recognition strategies are employed to suit their function (Luscombe et al., 2000). Such difficulties have also prevented a thorough classification of DNA/RNA binding proteins based on the mode of interaction and the specificity of nucleotide recognition. The availability of a large number of structures of protein-nucleic acid complexes (albeit fewer than the number of protein structures present in the PDB) in the past few decades has provided the knowledge base for understanding the details behind their molecular mechanisms (Berman et al., 1992). Previous studies have characterized these interactions by analyzing specific non-covalent interactions such as hydrogen bonds, van der Waals, and hydrophobic interactions between a given amino acid and the nucleic acid (DNA, RNA) in a pair-wise manner, or through the analysis of interface areas of the protein-nucleic acid complexes (Nadassy et al., 1998; Jones et al., 1999).
Though these studies have deciphered the common pairing preferences of a particular amino acid with a given nucleotide of DNA or RNA, they leave little room for understanding these specificities in the context of spatial interactions at a global level in the protein-nucleic acid complexes. Representing the amino acids and the nucleotides as components of graphs, and exploring the interactions at a level higher than individual pair-wise interactions, could provide greater detail about the nature of these interactions and their specificity. This thesis reports the study of protein-nucleic acid interactions using graph and network based approaches. The parameters for characterizing protein-nucleic acid graphs have been evaluated for the first time, and these parameters have been successfully employed to capture biologically important non-covalent interactions as clusters of interacting amino acids and nucleotides from different protein-DNA and protein-RNA complexes. Graph and network based approaches are well established in the field of protein structure analysis for analyzing protein structure, stability and function (Kannan and Vishveshwara, 1999; Brinda and Vishveshwara, 2005). However, graph and network principles have not previously been used to analyze structures of protein-nucleic acid complexes, and their use is reported for the first time in this thesis. The matter embodied in the thesis is presented as ten chapters. Chapter 1 lays the foundation for the study, surveying relevant literature from the field. Chapter 2 describes in detail the methods used in constructing graphs and networks from protein-nucleic acid complexes. Initially, only protein structure graphs and networks were constructed from proteins known to interact with specific DNA or RNA, and inferences with regard to nucleic acid binding and recognition were obtained indirectly.
Subsequently, parameters were evaluated for representing both the interacting amino acids and the nucleotides as components of graphs, and a direct evaluation of protein-DNA and protein-RNA interactions as graphs was carried out. Chapters 3 and 4 discuss the graph and network approaches applied to proteins from a dataset of DNA binding proteins complexed with DNA. In chapter 3, the protein structure graphs were constructed on the basis of the non-covalent interactions existing between the side chains of amino acids. Clusters of interacting side chains from the graphs were obtained using the graph spectral method. The clusters from the protein-DNA interface were analyzed in detail for their interaction geometry and biological importance (Sathyapriya and Vishveshwara, 2004). Chapter 4 uses the same dataset of DNA binding proteins, but presents a network-based approach. The analysis of the protein structure networks from these DNA binding proteins yielded interesting observations relating the presence of highly connected nodes (or hubs) of the network to functionally important amino acids in the structure. A comparison between the hubs identified from the protein-protein and the protein-DNA interfaces, in terms of their amino acid composition and their connectivity, is also presented (Sathyapriya and Vishveshwara, 2006). Chapters 5 and 6 deal with graph and network applications to a specific protein-RNA system (aminoacyl-tRNA synthetases) to gain insights into their interface biology based on amino acid connectivity. Chapter 5 deals with a dataset of aminoacyl-tRNA synthetase (aaRS) complexes obtained with various ligands such as ATP, tRNA and L-amino acids. A graph based identification of side chain clusters from these ligand-bound aaRS structures has highlighted important features of ligand binding at the catalytic sites of the two structurally different classes of aaRS (Class I and Class II).
Side chain clusters from other regions of aaRS, such as the anticodon binding region and the ligand-activation sites, are discussed. In chapter 6, a network approach is applied to a specific aaRS system, E. coli glutaminyl-tRNA synthetase (GlnRS) complexed with its ligands, to understand the effects of binding different ligands. The structure networks of E. coli GlnRS in the ligand-free and different ligand-bound states are constructed. The ligand-free and the ligand-bound complexes are compared by analyzing their network properties and the presence of hubs to understand the effect of ligand binding. These properties have elegantly captured the effects of ligand binding on the GlnRS structure and have also provided an alternative method for comparing three-dimensional structures of proteins in different ligand-bound states (Sathyapriya and Vishveshwara, 2007). In contrast to protein structure graphs (PSG), both the interacting amino acids and nucleotides (DNA/RNA) form the components of the protein-nucleic acid graphs (PNG) from protein-nucleic acid complexes. These graphs are constructed based on the non-covalent interactions existing between the side chains of the amino acids and the nucleotides. After representing the interacting nucleotides and amino acids as graphs, clusters of the interacting components are identified. These clusters are the strongly interacting amino acids and nucleotides from the protein-nucleic acid complexes. They can be generated at different strengths of interaction between the amino acid side chain and the nucleotide (measured in terms of its atomic connectivity) and can be used for detecting clusters of non-specific as well as specific interactions of amino acids and nucleotides. Though the methodology of graph construction and cluster identification is given in chapter 2, the details of the parameters evaluated for constructing PNG are given in chapter 7.
Unlike the previous chapters, the succeeding chapters deal exclusively with results obtained from the analyses of PNG. Two examples of obtaining clusters from a PNG are given, one each for a protein-DNA and a protein-RNA complex. In the first example, a nucleosome core particle is subjected to the graph based analysis, and different clusters of amino acids with different regions of the DNA chain, such as the phosphate, the deoxyribose sugar and the base, are identified. A second example, an aminoacyl-tRNA synthetase complexed with its cognate tRNA, is used to illustrate the method with a protein-RNA complex. Further, the method of constructing and analyzing protein-nucleic acid graphs has been applied to the macromolecular machinery of the pre-translocation complex of the T. thermophilus 70S ribosome. Chapter 8 deals exclusively with the results identified from the analysis of this magnificent macromolecular ensemble. The availability of a method that can handle interactions between both the amino acids and the nucleotides of protein-nucleic acid complexes has given us the basis for evaluating these interactions at a level higher than that of pair-wise interactions. A study on the evaluation of short hydrogen bonds (SHB) in proteins, which does not fall within the main objective of the thesis, is discussed in Chapter 9. The short hydrogen bonds, defined by geometrical distance and angle parameters, are identified from a non-redundant dataset of proteins. Insights into their occurrence, amino acid composition and secondary structural preferences are discussed. The SHB are present in distinct regions of protein three-dimensional structures, such that they mediate specific geometrical constraints that are necessary for the stability of the structure (Sathyapriya and Vishveshwara, 2005). The significant conclusions of the various studies carried out are summarized in the last chapter (Chapter 10).
In conclusion, this thesis reports the analyses performed with protein-nucleic acid complexes using graph and network based methods. The parameters necessary for representing both the amino acids and the nucleotides as components of a graph are evaluated for the first time and can be used subsequently for other analyses. More importantly, the use of graph-based methods has made it possible to consider the interactions between the amino acids and the nucleotides at a global level, with respect to the topology of the protein-nucleic acid complexes. Such studies performed on a wide variety of protein-nucleic acid complexes could provide more insights into the details of protein-nucleic acid recognition mechanisms. The results of these studies can be used for the rational design of experimental mutations that ascertain the structure-function relationships in proteins and protein-nucleic acid complexes. Protein-Nucleic Acid Interactions Protein-RNA Interactions Protein Structure Graphs Protein Nucleotide Graphs (PNG) Protein-DNA Complexes Protein-RNA Complexes Aminoacyl-tRNA Synthetase Glutaminyl-tRNA Synthetase (GlnRS) Proteins - Short Hydrogen Bonds Protein Structure Networks Protein-RNA Interaction Structure Graphs Protein-RNA Graphs 70S Ribosome Biochemical Genetics
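The idea of a protein-nucleic acid graph described in the abstract above can be sketched roughly as follows: residues and nucleotides are nodes, an edge is kept when a pair-wise interaction score reaches a chosen cutoff, clusters are read off as connected groups, and hubs are nodes with many neighbours. This is only a sketch under stated assumptions. The abstract says clusters were obtained with a graph spectral method; connected components found by breadth-first search are swapped in here for simplicity. The scores, the cutoff name `i_min`, the hub degree threshold, and the residue and nucleotide labels are all hypothetical.

```python
from collections import defaultdict, deque


def build_graph(strengths, i_min):
    """Keep an undirected edge for every pair whose interaction score is >= i_min."""
    adj = defaultdict(set)
    for (a, b), score in strengths.items():
        if score >= i_min:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clusters(adj):
    """Connected components via breadth-first search (a stand-in for spectral clustering)."""
    seen, out = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in sorted(adj[node]):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        out.append(sorted(comp))
    return out


def hubs(adj, min_degree):
    """Nodes with at least min_degree neighbours."""
    return sorted(n for n, nbs in adj.items() if len(nbs) >= min_degree)


# Toy scores between amino acids and nucleotides (hypothetical labels and values).
strengths = {
    ("ARG45", "G7"): 8.0,
    ("LYS12", "G7"): 6.5,
    ("SER30", "G7"): 1.0,
    ("ASP60", "U3"): 5.0,
    ("ARG45", "LYS12"): 4.2,
}
loose = build_graph(strengths, i_min=4.0)
strict = build_graph(strengths, i_min=6.0)
print(clusters(loose), hubs(loose, 2))
print(clusters(strict), hubs(strict, 2))
```

Raising the cutoff keeps only the stronger contacts, which mirrors the abstract's point that clusters can be generated at different interaction strengths.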
13	Function prediction of transcription start site associated RNAs (TSSaRNAs) in Halobacterium salinarum NRC-1 / Predição de função para TSSaRNAs (transcritos associados a sitios de início de transcrição) em Halobacterium salinarum NRC-1 Adam, Yagoub Ali Ibrahim 07 February 2019 Transcription start site associated non-coding RNAs (TSSaRNAs) have been predicted across the three domains of life. However, there are still no reliable annotation efforts to identify their biological functions and their underlying molecular machinery. This project therefore asks what the potential functions of TSSaRNAs are with regard to their roles in cellular functions. To answer this question, we aimed to accurately identify TSSaRNAs in the model organism Halobacterium salinarum NRC-1 (an archaeal microorganism) incubated under standard growth conditions. We then aimed to investigate the structural stability of TSSaRNAs in terms of their thermodynamic energies. Moreover, we attempted to functionally annotate TSSaRNAs based on the Rfam functional classification of non-coding RNAs. Based on a statistical approach, we developed an algorithm to predict TSSaRNAs using next-generation RNA sequencing data (RNA-Seq). To perform the structural annotation of TSSaRNAs, we investigated their structural stability by modeling the secondary structures through minimization of the thermodynamic free energy. We simulated TSSaRNA tertiary structures based on the secondary-structure constraints using the Rosetta-Common RNA tool. The minimum free energy structures are assumed to be biophysically stable structures. To investigate the higher-order structures of TSSaRNAs, we studied the hybridization between TSSaRNAs and their cognate genes as part of an RNA based regulation system.
Also, based on our hypothesis that TSSaRNAs may bind to proteins to trigger their function, we investigated the interaction between TSSaRNAs and the Lsm protein, which is known as a chaperone that mediates RNA function and is involved in RNA processing. Our pipeline for the functional annotation of TSSaRNAs aimed to classify TSSaRNAs into their corresponding Rfam families using two approaches: either by querying TSSaRNA sequences against the covariance models of Rfam families, or by querying the Rfam sequences against the covariance models of the consensus secondary structures of the TSSaRNAs. The results showed that the prediction algorithm succeeded in identifying a total of 224 TSSaRNAs expressed on the same strand as the mRNAs and 58 TSSaRNAs expressed antisense to the mRNAs. The identified TSSaRNA molecules showed a median length of 25 nucleotides. Regarding the structural annotation of TSSaRNAs, the results showed that most TSSaRNAs possessed thermodynamically stable secondary structures and that their tertiary structures were capable of forming more complex structures through binding with other biomolecules. Regarding the formation of higher-order structures, we observed that most TSSaRNAs (92.2%) were capable of hybridizing to their cognate genes, and 55 TSSaRNAs indicated putative interactions with the Lsm protein. Furthermore, the computational docking experiments indicated that the TSSaRNA-Lsm complexes were associated with favorable binding energies, with a median of -542900 kcal mol⁻¹. Regarding the functional annotation of TSSaRNAs, the results showed that a large share of TSSaRNAs (42.05%) can be considered potential cis-acting regulators, such as cis-regulatory elements and sRNAs, but there are also potential trans-acting regulators of distant molecules, such as CRISPR and antisense RNA.
Moreover, the results indicated that TSSaRNAs could perform more complex functions, such as a catalytic function like that of a riboswitch, or play a role in defense against viruses, as CRISPR does. In conclusion, based on the results of this study, we can state that TSSaRNAs have several potential functions, which opens the way for experimental validation. Functional annotation Halobacterium salinarum NRC-1 Higher-order ncRNAs structures ncRNAs functions ncRNAs prediction Non-coding RNA Rfam RNA based regulation RNA docking RNA interactions RNA structures Structural annotation TSSaRNAs TSSaRNAs-LSm interactions
