Global ETD Search

341	Non-lectin type Protein-carbohydrate Interactions: A Structural Perspective Bhatt, Veer Sandeep 27 July 2011 (has links) No description available. Biochemistry Biomedical Engineering Biophysics Protein-carbohydrate interaction protein glycosylation protein structure X-ray crystallography UDP-hexose 4-epimerase Rossmann fold Hemoglobin Lipopolysaccharide Wzy Wzz WaaL Heteropolysaccharide Oxygen affinity
342	Matrix-assisted laser desorption/ionization-quadrupole ion trap-time of flight mass spectrometry sequencing resolves structures of unidentified peptides obtained by in-gel tryptic digestion of haptoglobin derivatives from human plasma proteomes. Sutton, Chris W., Glocker, M.O., Koy, C., Tanaka, K., Mikkat, S., Resch, M. 2009 July 1914 (has links) No / Two-dimensional gel electrophoresis-separated and excised haptoglobin alpha2-chain protein spots were subjected to in-gel digestion with trypsin. Previously unassigned peptide ion signals observed in mass spectrometric fingerprinting experiments were sequenced using the matrix-assisted laser desorption/ionization-quadrupole ion trap-time of flight (MALDI-QIT-TOF) mass spectrometer and showed that the haptoglobin alpha-chain derivative under study was cleaved by trypsin nonspecifically. Abundant cleavages occurred C-terminal to histidine residues at H23, H28, and H87. In addition, mild acidic hydrolysis leading to cleavage after aspartic acid residues at D13 was observed. The uninterpreted tandem mass spectrometry (MS/MS) spectrum of the peptide with ion signal at 2620.19 was submitted to a database search and yielded the identification of the corresponding peptide sequence comprising amino acids (aa) 65-87 of the haptoglobin alpha-chain protein. Also, the presence of a mixture of two tryptic peptides (mass-to-charge ratio m/z 1708.8; aa40-54 and aa99-113, respectively), caused by a tiny sequence variation between the two repeats in the haptoglobin alpha2-chain protein, was resolved by MS/MS fragmentation using the MALDI-QIT-TOF mass spectrometer. Advantageous features such as (i) easy parent ion creation, (ii) minimal sample consumption, and (iii) real collision-induced dissociation conditions were combined successfully to determine the amino acid sequences of the previously unassigned peptides. 
Hence, the novel mass spectrometric sequencing method applied here has proven effective for identification of distinct molecular protein structures. Mass spectrometric fragmentation Peptide sequencing Plasma Protein structure characterization Haptoglobin
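Note on record 342: assigning a peptide to an observed ion signal rests on comparing the measured m/z with the mass calculated from a candidate sequence. A minimal sketch of that arithmetic follows, assuming unmodified peptides and standard monoisotopic residue masses. The function names and the tolerance value are illustrative assumptions, not taken from the paper.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # added once per positive charge


def peptide_mz(sequence, charge=1):
    """Return the monoisotopic m/z of an unmodified peptide ion [M + zH]z+."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


def matches_signal(observed_mz, sequence, tolerance=0.2, charge=1):
    """True if the calculated m/z lies within `tolerance` of the observed value.

    The default tolerance is an arbitrary illustrative choice.
    """
    return abs(peptide_mz(sequence, charge) - observed_mz) <= tolerance
```

Two peptides with nearly identical sequences can give the same m/z within tolerance, which is why the abstract relies on MS/MS fragmentation rather than parent mass alone to separate the aa40-54 and aa99-113 peptides.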
343	Exploring Protein-Nucleic Acid Interactions Using Graph And Network Approaches Sathyapriya, R 03 1900 (has links) The flow of genetic information from genes to proteins is mediated through proteins which interact with the nucleic acids at several stages to successfully transmit the information from the nucleus to the cell cytoplasm. Unlike in the case of protein-protein interactions, the principles behind protein-nucleic acid interactions are still not very well understood (Pabo and Nekludova, 2000) and efforts are still underway to arrive at the basic principles behind the specific recognition of nucleic acids by proteins (Prabakaran et al., 2006). This is mainly due to the innate complexity involved in recognition of nucleotides by proteins, where, even within a given family of DNA binding proteins, different modes of binding and recognition strategies are employed to suit their function (Luscombe et al., 2000). Such difficulties have also prevented a thorough classification of DNA/RNA binding proteins based on the mode of interaction as well as the specificity of recognition of the nucleotides. The availability of a large number of structures of protein-nucleic acid complexes (albeit fewer than the number of protein structures present in the PDB) in the past few decades has provided the knowledge base for understanding the details behind their molecular mechanisms (Berman et al., 1992). Previously, studies have been carried out to characterize these interactions by analyzing specific non-covalent interactions such as hydrogen bonds, van der Waals, and hydrophobic interactions between a given amino acid and the nucleic acid (DNA, RNA) in a pair-wise manner, or through the analysis of interface areas of the protein-nucleic acid complexes (Nadassy et al., 1998; Jones et al., 1999). 
Though the studies have deciphered the common pairing preferences of a particular amino acid with a given nucleotide of DNA or RNA, there is little room for understanding these specificities in the context of spatial interactions at a global level from the protein-nucleic acid complexes. Representing the amino acids and the nucleotides as components of graphs, and exploring the nature of the interactions at a level higher than individual pair-wise interactions, could provide greater detail about the nature of these interactions and their specificity. This thesis reports the study of protein-nucleic acid interactions using graph and network based approaches. The evaluation of the parameters for characterizing protein-nucleic acid graphs has been carried out for the first time and these parameters have been successfully employed to capture biologically important non-covalent interactions as clusters of interacting amino acids and nucleotides from different protein-DNA and protein-RNA complexes. Graph and network based approaches are well established in the field of protein structure analysis for analyzing protein structure, stability and function (Kannan and Vishveshwara, 1999; Brinda and Vishveshwara, 2005). However, the use of graph and network principles for analyzing structures of protein-nucleic acid complexes has not been accomplished so far and is reported for the first time in this thesis. The matter embodied in the thesis is presented as ten chapters. Chapter 1 lays the foundation for the study, surveying relevant literature from the field. Chapter 2 describes in detail the methods used in constructing graphs and networks from protein-nucleic acid complexes. Initially, only protein structure graphs and networks were constructed from proteins known to interact with specific DNA or RNA, and inferences with regard to nucleic acid binding and recognition were indirectly obtained. 
Subsequently, parameters were evaluated for representing both the interacting amino acids and the nucleotides as components of graphs and a direct evaluation of protein-DNA and protein-RNA interactions as graphs has been carried out. Chapters 3 and 4 discuss the graph and network approaches applied to proteins from a dataset of DNA binding proteins complexed with DNA. In chapter 3, the protein structure graphs were constructed on the basis of the non-covalent interactions existing between the side chains of amino acids. Clusters of interacting side chains from the graphs were obtained using the graph spectral method. The clusters from the protein-DNA interface were analyzed in detail for the interaction geometry and biological importance (Sathyapriya and Vishveshwara, 2004). Chapter 4 uses the same dataset of DNA binding proteins, but presents a network-based approach. From the analysis of the protein structure networks of these DNA binding proteins, interesting observations emerged relating the presence of highly connected nodes (or hubs) of the network to functionally important amino acids in the structure. A comparison between the hubs identified from the protein-protein and the protein-DNA interfaces, in terms of their amino acid composition and their connectivity, is also presented (Sathyapriya and Vishveshwara, 2006). Chapters 5 and 6 deal with the graph and network applications to a specific system of protein-RNA complex (aminoacyl-tRNA synthetases) to gain insights into their interface biology based on amino acid connectivity. Chapter 5 deals with a dataset of aminoacyl-tRNA synthetase (aaRS) complexes obtained with various ligands like ATP, tRNA and L-amino acids. A graph based identification of side chain clusters from these ligand-bound aaRS structures has highlighted important features of ligand-binding at the catalytic sites of the two structurally different classes of aaRS (Class I and Class II). 
Side chain clusters from other regions of aaRS such as the anticodon binding region and the ligand-activation sites are discussed. In chapter 6, a network approach is applied to a specific aaRS system, E. coli glutaminyl-tRNA synthetase (GlnRS) complexed with its ligands, to understand the effects of binding different ligands. The structure networks of E. coli GlnRS in the ligand-free and different ligand-bound states are constructed. The ligand-free and the ligand-bound complexes are compared by analyzing their network properties and the presence of hubs to understand the effect of ligand-binding. These properties have elegantly captured the effects of ligand-binding to the GlnRS structure and have also provided an alternate method for comparing three dimensional structures of proteins in different ligand-bound states (Sathyapriya and Vishveshwara, 2007). In contrast to protein structure graphs (PSG), both the interacting amino acids and nucleotides (DNA/RNA) form the components of the protein-nucleic acid graphs (PNG) from protein-nucleic acid complexes. These graphs are constructed based on the non-covalent interactions existing between the side chains of the amino acids and nucleotides. After representing the interacting nucleotides and amino acids as graphs, clusters of the interacting components are identified. These clusters are the strongly interacting amino acids and nucleotides from the protein-nucleic acid complexes. These clusters can be generated at different strengths of interaction between the amino acid side chain and the nucleotide (measured in terms of its atomic connectivity) and can be used for detecting clusters of non-specific as well as specific interactions of amino acids and nucleotides. Though the methodology of graph construction and cluster identification is given in chapter 2, the details of the parameters evaluated for constructing PNG are given in chapter 7. 
Unlike the previous chapters, the succeeding chapters deal exclusively with results that are obtained from the analyses of PNG. Two examples of obtaining clusters from a PNG are given, one each for a protein-DNA and a protein-RNA complex. In the first example, a nucleosome core particle is subjected to the graph based analysis and different clusters of amino acids with different regions of the DNA chain such as phosphate, deoxyribose sugar and the base are identified. Another example of aminoacyl-tRNA synthetase complexed with its cognate tRNA is used to illustrate the method with a protein-RNA complex. Further, the method of constructing and analyzing protein-nucleic acid graphs has been applied to the macromolecular machinery of the pre-translocation complex of the T. thermophilus 70S ribosome. Chapter 8 deals exclusively with the results identified from the analysis of this magnificent macromolecular ensemble. The availability of a method that can handle interactions between both amino acids and the nucleotides of the protein-nucleic acid complexes has given us the basis for evaluating these interactions at a level higher than that of analyzing pair-wise interactions. A study on the evaluation of short hydrogen bonds (SHB) in proteins, which does not fall under the realm of the main objective of the thesis, is discussed in Chapter 9. The short hydrogen bonds, defined by the geometrical distance and angle parameters, are identified from a non-redundant dataset of proteins. The insights into their occurrence, amino acid composition and secondary structural preferences are discussed. The SHB are present in distinct regions of protein three-dimensional structures, such that they mediate specific geometrical constraints that are necessary for stability of the structure (Sathyapriya and Vishveshwara, 2005). The significant conclusions of various studies carried out are summarized in the last chapter (Chapter 10). 
In conclusion, this thesis reports the analyses performed with protein-nucleic acid complexes using graph and network based methods. The parameters necessary for representing both amino acids and the nucleotides as components of a graph are evaluated for the first time and can be used subsequently for other analyses. More importantly, the use of graph-based methods has resulted in considering the interaction between the amino acids and the nucleotides at a global level with respect to the topology of the protein-nucleic acid complexes. Such studies performed on a wide variety of protein-nucleic acid complexes could provide more insights into the details of protein-nucleic acid recognition mechanisms. The results of these studies can be used for rational design of experimental mutations that ascertain the structure-function relationships in proteins and protein-nucleic acid complexes. Protein-Nucleic Acid Interactions Protein-RNA Interactions Protein Structure Graphs Protein Nucleotide Graphs (PNG) Protein-DNA Complexes Protein-RNA Complexes Aminoacyl-tRNA Synthetase Glutaminyl-tRNA Synthetase (GlnRS) Proteins - Short Hydrogen Bonds Protein Structure Networks Protein-RNA Interaction Structure Graphs Protein-RNA Graphs 70S Ribosome Biochemical Genetics
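Note on record 343: the abstract describes representing amino acids and nucleotides as graph nodes, then reading off clusters and highly connected nodes (hubs). The sketch below is one simplified way to illustrate that idea, not the thesis's method. It assumes one representative coordinate per residue or nucleotide and a single distance cutoff, where the thesis uses an interaction strength based on atomic connectivity, and it uses connected components where the thesis uses the graph spectral method. All names and default values are hypothetical.

```python
from itertools import combinations
from math import dist


def build_contact_graph(coords, cutoff=6.5):
    """Build an undirected graph from {node_id: (x, y, z)}.

    Two nodes are joined when their representative points lie within
    `cutoff` (same units as the coordinates). The cutoff is an assumption.
    """
    graph = {node: set() for node in coords}
    for a, b in combinations(coords, 2):
        if dist(coords[a], coords[b]) <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def hubs(graph, min_degree=4):
    """Return nodes with at least `min_degree` neighbours, sorted by id."""
    return sorted(n for n, nbrs in graph.items() if len(nbrs) >= min_degree)


def clusters(graph, min_size=2):
    """Return connected components with at least `min_size` members."""
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            result.append(sorted(component))
    return result
```

Node identifiers can mix residue and nucleotide labels (for example "ARG12" and "DA5"), so a cluster containing both kinds of node marks a candidate protein-nucleic acid contact region.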
344	Investigations on natural silks using dynamic mechanical thermal analysis (DMTA) Guan, Juan January 2013 (has links) This thesis examines the dynamic mechanical properties of natural silk fibres, mainly from silkworm species Bombyx mori (B. mori) and spider species Nephila edulis, using dynamic mechanical thermal analysis, DMTA. The aim is not only to provide novel data on mechanical properties of silk, but also to relate these properties to the structure and morphology of silk. A systematic approach is adopted to evaluate the effect of the three principal factors of stress, temperature and hydration on the properties and structure of silk. The methods developed in this work are then used to examine commercially important aspects of the ‘quality’ of silk. I show that the dynamic storage modulus of silks increases with loading stress in the deformation through yield to failure, whereas the conventional engineering tensile modulus decreases significantly post-yield. Analyses of the effects of temperature and thermal history show a number of important effects: (1) the loss peak at -60 °C is found to be associated with the protein-water glass transition; (2) the increase in the dynamic storage modulus of native silks between +25 and 100 °C is due simply to water loss; (3) a number of discrete loss peaks from +150 to +220 °C are observed and attributed to the glass transition of different states of disordered structure with different intermolecular hydrogen bonding. Excess environmental humidity results in a lower effective glass transition temperature (Tg) for disordered silk fractions. Also, humidity-dynamic mechanical analysis on Nephila edulis spider dragline silks has shown that the glass transition induces a partial supercontraction, called Tg contraction. This new finding leads to the conclusion of two independent mechanisms for supercontraction in spider dragline silks. Study of three commercial B. 
mori cocoon silk grades and a variety of processed silks or artificial silks shows that lower grade and poorly processed silks display lower Tg values, and often have a greater loss tangent at Tg due to increased disorder. This suggests that processing contributes significantly to the differences in the structural order among natural or unnatural silks. More importantly, dynamic mechanical thermal analysis is proposed to be a potential tool for quality evaluation and control in silk production and processing. In summary, I demonstrate that DMTA is a valuable analytical tool for understanding the structure and properties of silk, and use a systematic approach to understand quantitatively the important mechanical properties of silk in terms of a generic structural framework in silk proteins. 677.39
345	UDP-sugar metabolizing pyrophosphorylases in plants : formation of precursors for essential glycosylation-reactions Decker, Daniel January 2017 (has links) UDP-sugar metabolizing pyrophosphorylases provide the primary mechanism for de novo synthesis of UDP-sugars, which can then be used for myriads of glycosyltransferase reactions, producing cell wall carbohydrates, sucrose, glycoproteins and glycolipids, as well as many other glycosylated compounds. The pyrophosphorylases can be divided into three families: UDP-Glc pyrophosphorylase (UGPase), UDP-sugar pyrophosphorylase (USPase) and UDP-N-acetylglucosamine pyrophosphorylase (UAGPase), which can be discriminated both by differences in accepted substrate range and amino acid sequences. This thesis focuses both on experimental examination (and re-examination) of some enzymatic/biochemical properties of selected members of the UGPase, USPase and UAGPase families and on the design and implementation of a strategy to study in vivo roles of these pyrophosphorylases using specific inhibitors. In the first part, substrate specificities of members of the Arabidopsis UGPase, USPase and UAGPase families were comprehensively surveyed and kinetically analyzed, with barley UGPase also further studied with regard to its pH dependency, regulation by oligomerization, etc. Whereas all the enzymes preferentially used UTP as nucleotide donor, they differed in their specificity for sugar-1-P. UGPases had high activity with D-Glc-1-P, but could also react with Frc-1-P, whereas USPase reacted with a range of sugar-1-phosphates, including D-Glc-1-P, D-Gal-1-P, D-GalA-1-P, β-L-Ara-1-P and α-D-Fuc-1-P. In contrast, UAGPase2 reacted only with D-GlcNAc-1-P, D-GalNAc-1-P and, to some extent, with D-Glc-1-P. A structure-activity relationship was established to connect enzyme activity, the examined sugar-1-phosphates and the three pyrophosphorylases. 
The UGPase/USPase/UAGPase active sites were subsequently compared in an attempt to identify amino acids which may contribute to the experimentally determined differences in substrate specificities. The second part of the thesis deals with identification and characterization of inhibitors of the pyrophosphorylases and with studies on in vivo effects of those inhibitors in Arabidopsis-based systems. A novel luminescence-based high-throughput assay system was designed, which allowed for quantitative measurement of UGPase and USPase activities, down to a pmol per min level. The assay was then used to screen a chemical library (which contained 17,500 potential inhibitors) to identify several compounds affecting UGPase and USPase. Hit-optimization on one of the compounds revealed even stronger inhibitors of UGPase and USPase which also strongly inhibited Arabidopsis pollen germination, by disturbing UDP-sugar metabolism. The inhibitors may represent useful tools to study in vivo roles of the pyrophosphorylases, as a complement to previous genetics-based studies. The thesis also includes two review papers on mechanisms of synthesis of NDP-sugars. The first review covered the characterization of USPase from both prokaryotic and eukaryotic organisms, whereas the second review was a comprehensive survey of NDP-sugar producing enzymes (not only UDP-sugar producing and not only pyrophosphorylases). All these enzymes were discussed with respect to their substrate specificities and structural features (if known) and their proposed in vivo functions. Chemical library screening Cell wall synthesis Glycosylation Nucleotide sugars Oligomerization Protein structure Reverse chemical genetics Sugar activation UDP-sugar synthesis Plant Biotechnology Växtbioteknologi Microbiology Mikrobiologi Pharmaceutical Chemistry Farmaceutisk synteskemi
346	Applicability of a computational design approach for synthetic riboswitches Domin, Gesine, Findeiß, Sven, Wachsmuth, Manja, Will, Sebastian, Stadler, Peter F., Mörl, Mario 25 January 2017 (has links) (PDF) Riboswitches have gained attention as tools for synthetic biology, since they enable researchers to reprogram cells to sense and respond to exogenous molecules. In vitro evolutionary approaches produced numerous RNA aptamers that bind such small ligands, but their conversion into functional riboswitches remains difficult. We previously developed a computational approach for the design of synthetic theophylline riboswitches based on secondary structure prediction. These riboswitches have been constructed to regulate ligand dependent transcription termination in Escherichia coli. Here, we test the usability of this design strategy by applying the approach to tetracycline and streptomycin aptamers. The resulting tetracycline riboswitches exhibit robust regulatory properties in vivo. Tandem fusions of these riboswitches with theophylline riboswitches represent logic gates responding to two different input signals. In contrast, the conversion of the streptomycin aptamer into functional riboswitches appears to be difficult. Investigations of the underlying aptamer secondary structure revealed differences between in silico prediction and structure probing. We conclude that only aptamers adopting the minimal free energy (MFE) structure are suitable targets for construction of synthetic riboswitches with design approaches based on equilibrium thermodynamics of RNA structures. Further improvements in the design strategy are required to implement aptamer structures not corresponding to the calculated MFE state. 
Genexpression Galactosidase Bindungsprotein Proteinstruktur Streptomyzin Tetrazykline Theophyllin RNA Molekül gene expression galactosidase ligands logic protein structure secondary streptomycin thermodynamics tetracycline theophylline rna escherichia coli free energy binding (molecular function) molecule ddc:572
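Note on record 346: the design approach described there depends on predicting RNA secondary structure and on the aptamer adopting the minimum free energy (MFE) fold. Real MFE prediction uses thermodynamic energy models in dedicated folding software. As a much simpler stand-in, the sketch below uses Nussinov-style base-pair maximization, which only counts pairs and ignores energies. It shows the dynamic-programming shape of such predictions and is not the authors' pipeline. The function name and the minimum loop length default are assumptions.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence (Nussinov-style).

    `min_loop` is the minimum number of unpaired bases enclosed by a pair.
    Watson-Crick and G-U wobble pairs are allowed. This counts pairs only;
    it is not a free-energy calculation.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum pairs in seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]           # case 1: position j unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

A pair-counting model like this cannot say whether a predicted fold is the thermodynamically favoured one, which is the distinction the abstract draws for the streptomycin aptamer.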
347	Structural modelling of transmembrane domains Kelm, Sebastian January 2011 (has links) Membrane proteins represent about one third of all known vertebrate proteins and over half of the current drug targets. Knowledge of their three-dimensional (3D) structure is worth millions of pounds to the pharmaceutical industry. Yet experimental structure elucidation of membrane proteins is a slow and expensive process. In the absence of experimental data, computational modelling tools can be used to close the gap between the numbers of known protein sequences and structures. However, currently available structure prediction tools were developed with globular soluble proteins in mind and perform poorly on membrane proteins. This thesis describes the development of a modelling approach able to predict accurately the structure of transmembrane domains of proteins. In this thesis we build a template-based modelling framework especially for membrane proteins, which uses membrane protein-specific information to inform the modelling process. Firstly, we develop a tool to accurately determine a given membrane protein structure's orientation within the membrane. We offer an analysis of the preferred substitution patterns within the membrane, as opposed to non-membrane environments, and how these differences influence the structures observed. This information is then used to build a set of tools that produce better sequence alignments of membrane proteins, compared to previously available methods, as well as more accurate predictions of their 3D structures. Each chapter describes one new piece of software or information and uses the tools and knowledge described in previous chapters to build up to a complete accurate model of a transmembrane domain. 572.696
348	Statistical potentials for evolutionary studies Kleinman, Claudia L. 06 1900 (has links) Les séquences protéiques naturelles sont le résultat net de l’interaction entre les mécanismes de mutation, de sélection naturelle et de dérive stochastique au cours des temps évolutifs. Les modèles probabilistes d’évolution moléculaire qui tiennent compte de ces différents facteurs ont été substantiellement améliorés au cours des dernières années. En particulier, ont été proposés des modèles incorporant explicitement la structure des protéines et les interdépendances entre sites, ainsi que les outils statistiques pour évaluer la performance de ces modèles. Toutefois, en dépit des avancées significatives dans cette direction, seules des représentations très simplifiées de la structure protéique ont été utilisées jusqu’à présent. Dans ce contexte, le sujet général de cette thèse est la modélisation de la structure tridimensionnelle des protéines, en tenant compte des limitations pratiques imposées par l’utilisation de méthodes phylogénétiques très gourmandes en temps de calcul. Dans un premier temps, une méthode statistique générale est présentée, visant à optimiser les paramètres d’un potentiel statistique (qui est une pseudo-énergie mesurant la compatibilité séquence-structure). La forme fonctionnelle du potentiel est par la suite raffinée, en augmentant le niveau de détails dans la description structurale sans alourdir les coûts computationnels. Plusieurs éléments structuraux sont explorés : interactions entre pairs de résidus, accessibilité au solvant, conformation de la chaîne principale et flexibilité. Les potentiels sont ensuite inclus dans un modèle d’évolution et leur performance est évaluée en termes d’ajustement statistique à des données réelles, et contrastée avec des modèles d’évolution standards. 
Finalement, le nouveau modèle structurellement contraint ainsi obtenu est utilisé pour mieux comprendre les relations entre niveau d’expression des gènes et sélection et conservation de leur séquence protéique. / Protein sequences are the net result of the interplay of mutation, natural selection and stochastic variation. Probabilistic models of molecular evolution accounting for these processes have been substantially improved over the last years. In particular, models that explicitly incorporate protein structure and site interdependencies have recently been developed, as well as statistical tools for assessing their performance. Despite major advances in this direction, only simple representations of protein structure have been used so far. In this context, the main theme of this dissertation has been the modeling of three-dimensional protein structure for evolutionary studies, taking into account the limitations imposed by computationally demanding phylogenetic methods. First, a general statistical framework for optimizing the parameters of a statistical potential (an energy-like scoring system for sequence-structure compatibility) is presented. The functional form of the potential is then refined, increasing the detail of structural description without inflating computational costs. Always at the residue-level, several structural elements are investigated: pairwise distance interactions, solvent accessibility, backbone conformation and flexibility of the residues. The potentials are then included into an evolutionary model and their performance is assessed in terms of model fit, compared to standard evolutionary models. Finally, this new structurally constrained phylogenetic model is used to better understand the selective forces behind the differences in conservation found in genes of very different expression levels. 
Évolution moléculaire structure des protéines Markov chain Monte Carlo maximum de vraisemblance statistique Bayesienne potentiels statistiques molecular evolution protein structure Markov chain Monte Carlo maximum likelihood Bayesian statistics statistical potentials
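Note on record 348: the abstract describes a statistical potential as an energy-like score for sequence-structure compatibility built from residue-level terms such as pairwise interactions. A minimal sketch of the pairwise term alone follows, assuming a fixed contact list and a lookup table of pair parameters. The table values, the function name and the choice to score missing pairs as zero are illustrative assumptions; the thesis optimizes its parameters statistically and includes further terms not shown here.

```python
def pairwise_pseudo_energy(sequence, contacts, pair_potential):
    """Sum contact terms for a sequence threaded onto a fixed structure.

    sequence       -- string of one-letter amino acid codes
    contacts       -- iterable of (i, j) index pairs in contact in the structure
    pair_potential -- dict mapping alphabetically sorted residue pairs,
                      e.g. ("A", "L"), to a pseudo-energy; missing pairs score 0
    """
    total = 0.0
    for i, j in contacts:
        key = tuple(sorted((sequence[i], sequence[j])))
        total += pair_potential.get(key, 0.0)
    return total
```

Because the structure enters only through the contact list, the same function can rescore a mutated sequence cheaply, which is the kind of cost constraint the abstract mentions for computationally demanding phylogenetic methods.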
349	Topological and domain Knowledge-based subgraph mining : application on protein 3D-structures / Fouille de sous-graphes basée sur la topologie et la connaissance du domaine : application sur les structures 3D de protéines Dhifli, Wajdi 11 December 2013 (has links) Cette thèse est à l'intersection de deux domaines de recherche en plein expansion, à savoir la fouille de données et la bioinformatique. Avec l'émergence des bases de graphes au cours des dernières années, de nombreux efforts ont été consacrés à la fouille des sous-graphes fréquents. Mais le nombre de sous-graphes fréquents découverts est exponentiel, cela est dû principalement à la nature combinatoire des graphes. Beaucoup de sous-graphes fréquents ne sont pas pertinents parce qu'ils sont redondants ou tout simplement inutiles pour l'utilisateur. En outre, leur nombre élevé peut nuire ou même rendre parfois irréalisable toute utilisation ultérieure. La redondance dans les sous-graphes fréquents est principalement due à la similarité structurelle et / ou sémantique, puisque la plupart des sous-graphes découverts diffèrent légèrement dans leur structures et peuvent exprimer des significations similaires ou même identiques. Dans cette thèse, nous proposons deux approches de sélection des sous-graphes représentatifs parmi les fréquents afin d'éliminer la redondance. Chacune des approches proposées s'intéresse à un type spécifique de redondance. La première approche s'adresse à la redondance sémantique où la similarité entre les sous-graphes est mesurée en fonction de la similarité entre les étiquettes de leurs noeuds, en utilisant les connaissances de domaine. La deuxième approche s'adresse à la redondance structurelle où les sous-graphes sont représentés par des descripteurs topologiques définis par l'utilisateur, et la similarité entre les sous-graphes est mesurée en fonction de la distance entre leurs descriptions topologiques respectives. 
Les principales données d'application de cette thèse sont les structures 3D des protéines. Ce choix repose sur des raisons biologiques et informatiques. D'un point de vue biologique, les protéines jouent un rôle crucial dans presque tous les processus biologiques. Ils sont responsables d'une variété de fonctions physiologiques. D'un point de vue informatique, nous nous sommes intéressés à la fouille de données complexes. Les protéines sont un exemple parfait de ces données car elles sont faites de structures complexes composées d'acides aminés interconnectés qui sont eux-mêmes composées d'atomes interconnectés. Des grandes quantités de structures protéiques sont actuellement disponibles dans les bases de données en ligne. Les structures 3D des protéines peuvent être transformées en graphes où les acides aminés représentent les noeuds du graphe et leurs connexions représentent les arêtes. Cela permet d'utiliser des techniques de fouille de graphes pour les étudier. L'importance biologique des protéines et leur complexité ont fait d'elles des données d'application appropriées pour cette thèse. / This thesis is at the intersection of two proliferating research fields, namely data mining and bioinformatics. With the emergence of graph data in the last few years, many efforts have been devoted to mining frequent subgraphs from graph databases. Yet, the number of discovered frequent subgraphs is usually exponential, mainly because of the combinatorial nature of graphs. Many frequent subgraphs are irrelevant because they are redundant or just useless for the user. Besides, their high number may hinder or even make further exploration infeasible. Redundancy in frequent subgraphs is mainly caused by structural and/or semantic similarities, since most discovered subgraphs differ slightly in structure and may convey similar or even identical meanings. 
In this thesis, we propose two approaches for selecting representative subgraphs among frequent ones in order to remove redundancy. Each of the proposed approaches addresses a specific type of redundancy. The first approach focuses on semantic redundancy where similarity between subgraphs is measured based on the similarity between their nodes' labels, using prior domain knowledge. The second approach focuses on structural redundancy where subgraphs are represented by a set of user-defined topological descriptors, and similarity between subgraphs is measured based on the distance between their corresponding topological descriptions. The main application data of this thesis are protein 3D-structures. This choice is based on biological and computational reasons. From a biological perspective, proteins play crucial roles in almost every biological process. They are responsible for a variety of physiological functions. From a computational perspective, we are interested in mining complex data. Proteins are a perfect example of such data as they are made of complex structures composed of interconnected amino acids which themselves are composed of interconnected atoms. Large amounts of protein structures are currently available in online databases, in computer analyzable formats. Protein 3D-structures can be transformed into graphs where amino acids are the graph nodes and their connections are the graph edges. This enables using graph mining techniques to study them. The biological importance of proteins, their complexity, and their availability in computer analyzable formats made them suitable application data for this thesis. Sélection de motifs Fouille de motifs Sous-graphe fréquent Sous-graphe représentant non-substitué Graphe représentant topologique Structure de protéine Feature selection Pattern mining Frequent subgraph Representative unsubstituted subgraph Topological representative subgraph Protein structure
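Note on record 349: the second approach in the abstract represents each subgraph by topological descriptors and treats subgraphs whose descriptions are close as redundant. The sketch below illustrates that idea with a naive greedy filter over descriptor vectors. It is not the thesis's selection algorithm, and the descriptor choice, the Euclidean distance, the threshold and the function name are all assumptions.

```python
from math import dist


def select_representatives(descriptors, threshold):
    """Greedily keep subgraphs whose descriptor vectors are mutually distant.

    descriptors -- dict {subgraph_id: tuple of numeric descriptors}; dict
                   order sets priority (earlier entries are preferred)
    threshold   -- a candidate is dropped as redundant if it lies within this
                   Euclidean distance of an already kept representative
    """
    representatives = []
    for subgraph_id, vector in descriptors.items():
        is_novel = all(
            dist(vector, descriptors[kept]) > threshold
            for kept in representatives
        )
        if is_novel:
            representatives.append(subgraph_id)
    return representatives
```

For example, with descriptors such as (node count, edge count, average degree), two subgraphs that differ only slightly in structure map to nearby vectors and only the first is kept. Descriptors on very different scales would need normalizing before a single threshold is meaningful.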
350	Structure and dynamics of intrinsically disordered regions of MAPK signalling proteins / Structure et dynamique des régions intrinsèquement désordonnées des MAPK Kragelj, Jaka 11 December 2014 (has links) Les voies de transduction du signal cellulaire permettent aux cellules de répondre aux signaux de l'environnement et de les traiter. Les voies de transduction de kinases MAP (MAPK) sont bien conservées dans toutes les cellules eucaryotes et sont impliquées dans la régulation de nombreux processus cellulaires importants. Les régions intrinsèquement désordonnées (RID), présentes dans de nombreuses MAPK, n'étaient pas encore structurellement caractérisées. Les RID de MAPK sont particulièrement importantes car elles contiennent des motifs de liaison qui contrôlent les interactions entre les protéines MAPK elles-mêmes et aussi entre les protéines MAPK et d'autres protéines contenant les mêmes motifs. La résonance magnétique nucléaire (RMN) en combinaison avec d'autres techniques biophysiques a été utilisée pour étudier les RID de kinase des voies de transduction du signal MAPK. La spectroscopie RMN est bien adaptée pour l'étude des protéines intrinsèquement désordonnées à l'échelle atomique. Les déplacements chimiques et couplages dipolaires résiduels peuvent être utilisés conjointement avec des méthodes de sélection d'ensemble pour étudier la structure résiduelle dans les RID. La relaxation de spin nucléaire nous renseigne sur les mouvements rapides. Des titrations par RMN et des techniques de spectroscopie d'échange peuvent être utilisées pour surveiller la cinétique d'interactions protéine-protéine. Cette étude contribuera à la compréhension du rôle des RID dans les voies de transduction du signal cellulaire. / Protein signal transduction pathways allow cells to respond to and process signals from the environment.
A group of such pathways, called mitogen-activated protein kinase (MAPK) signal transduction pathways, is well conserved in all eukaryotic cells and is involved in regulating many important cell processes. Long intrinsically disordered regions (IDRs), present in many MAPKs, have remained structurally uncharacterised. The IDRs of MAPKs are especially important as they contain docking-site motifs which control the interactions between MAPK proteins themselves and also between MAPKs and other interacting proteins containing the same motifs. Nuclear magnetic resonance (NMR) spectroscopy in combination with other biophysical techniques was used to study IDRs of MAPKs. NMR spectroscopy is well suited for studying intrinsically disordered proteins (IDPs) at atomic-level resolution. NMR observables, such as chemical shifts and residual dipolar couplings, can be used together with ensemble selection methods to study residual structure in IDRs. Nuclear spin relaxation informs us about fast motions on the picosecond-to-nanosecond timescale. NMR titrations and exchange spectroscopy techniques can be used to monitor the kinetics of protein-protein interactions. The mechanistic insight into the function of IDRs and motifs will contribute to the understanding of how signal transduction pathways work. RMN Couplages dipolaires résiduels Structure des protéines Kinases kinases activees par un mitogene Signalisation des MAP Kinases Proteines intrinsèquement RMN Couplages dipolaires residuels Protein structure Mitogen-activated protein kinase kinases MAP Kinase Signaling Intrinsically disordered proteins 570

Search results