81 |
Algorithmic Approaches For Protein-Protein Docking And Quaternary Structure Inference
Mitra, Pralay 07 1900
Molecular interactions among proteins drive cellular processes through the formation of complexes that perform the requisite biochemical functions. While some complexes are obligate (i.e., their subunits fold together during complexation), others are non-obligate and are formed through macromolecular recognition. Macromolecular recognition in proteins is highly specific, yet it can be either permanent or non-permanent in nature. Hallmarks of permanent recognition complexes include a large interaction surface and stabilization by hydrophobic interactions and other noncovalent forces. Amino acids that contribute critically to the free energy of binding at these interfaces are called “hot spot” residues. Non-permanent recognition complexes, on the other hand, usually show a small interaction interface, with limited stabilization from noncovalent forces. For both permanent and non-permanent complexes, the specificity of molecular interaction is governed by the geometric compatibility of the interaction surfaces and the noncovalent forces that anchor them. A great many studies have already been performed to understand the basis of protein macromolecular recognition [1, 2]. Based on these studies, efforts have been made to develop protein-protein docking algorithms that can predict the geometric orientation of the interacting molecules from their individual unbound states. Despite advances in docking methodologies, several significant difficulties remain [1]. Therefore, in this thesis, we start with a literature review to understand the individual merits and demerits of the existing approaches (Chapter 1) [3], and then attempt to address some of the problems by developing methods to infer protein quaternary structure from the crystalline state and to improve the structural and chemical understanding of protein-protein interactions through biological complex prediction.
Understanding the interaction geometry is the first step in a protein-protein interaction study. Yet no consistent method exists to assess the geometric compatibility of the interacting interface, because of its highly rugged nature. This suggested that new, sensitive measures and methods are needed to tackle the problem. We therefore developed two new and conceptually different measures, using Delaunay tessellation and interface slice selection, to compute the surface complementarity and atom packing at the protein-protein interface (Chapter 2) [4]. We called these Normalized Surface Complementarity (NSc) and Normalized Interface Packing (NIP). We rigorously benchmarked the measures on the non-redundant protein complexes available in the Protein Data Bank (PDB) and found that they efficiently segregate biological protein-protein contacts from non-biological ones, especially those derived from X-ray crystallography. Sensitive surface packing and complementarity recognition algorithms are usually computationally expensive and thus of limited use in high-throughput screening. Special emphasis was therefore placed on making our measures computationally efficient as well. Our final evaluation showed that NSc and NIP correlate very strongly with each other and with the interface-area-normalized values available from the Surface Complementarity program (CCP4 Suite: <http://smb.slac.stanford.edu/facilities/software/ccp4/html/sc.html>), but at a fraction of the computing cost.
After building the geometry-based surface complementarity and packing assessment methods for the rugged protein surface, we moved on to determining the stabilities of the geometrically compatible interfaces formed. To do so, we needed to survey the quaternary structures of proteins with various affinities. The emphasis on affinity arose from its strong relationship with the permanent or non-permanent lifetime of the complex. We therefore set up data mining studies on two databases, PQS (Protein Quaternary Structure database: http://pqs.ebi.ac.uk) and PISA (Protein Interfaces, Surfaces and Assemblies: www.ebi.ac.uk/pdbe/prot_int/pistart.html), which offered downloads of quaternary structure data on protein complexes derived from X-ray crystallographic methods. To our surprise, we found that these databases provided valid quaternary structures mostly for moderate- to strong-affinity complexes. This limitation could be ascertained by browsing annotations from another curated database of protein quaternary structure (PiQSi [5]: supfam.mrc-lmb.cam.ac.uk/elevy/piqsi/piqsi_home.cgi) and through literature surveys. We therefore first needed to develop a more robust method to infer quaternary structures of all affinities available from the PDB. We developed a new scheme focused on covering complexes of all affinity categories, especially the weak and very weak ones, and heteromeric quaternary structures (Chapter 3) [6]. Our scheme combined the naïve Bayes classifier and point-group symmetry under a Boolean framework to detect all categories of protein quaternary structures in the crystal lattice. We tested it on a standard benchmark of 112 recognition heteromeric complexes and obtained correct recall in 95% of cases, significantly better than the 53% achieved by PISA [7], a state-of-the-art quaternary structure detection method hosted at the European Bioinformatics Institute, Hinxton, UK.
The few cases that our scheme failed to detect correctly offered interesting insights into the intriguing nature of protein contacts in the lattice. The findings have implications for accurate inference of the quaternary states of proteins, especially weak-affinity complexes, where biological protein contacts tend to be sacrificed for energetically optimal ones that favor the formation and stabilization of the crystal lattice. We expect our method to be of use to researchers interested in protein quaternary structure and interaction.
Having developed a method that allows us to sample all categories of quaternary structures in the PDB, we turned to the next problem: accurately determining the stabilities of the geometrically compatible protein surfaces involved in interaction. Reformulating the question in terms of protein-protein docking, we asked how we could reliably infer the stability of any arbitrary interface that is formed when two protein molecules are brought sterically close. In a real protein docking exercise this question is asked innumerable times during energy-based screening of thousands of decoys geometrically sampled (through rotation and translation) from the unbound subunits. Current docking methods face problems on two counts: (i) the number of decoy interfaces whose energies must be evaluated is rather large (64320 for a 9° rotation and translation for a dimeric complex), and (ii) energy-based screening is not efficient enough, so that decoys with native-like quaternary structure are rarely selected at high ranks. We addressed both problems with interesting results.
Intricate decoy filtering approaches have been developed, which are applied during the search stage, the sampling stage, or both. Filtering usually relies on statistical information, such as 3D conservation of the interfacial residues, or similar facts; more expensive approaches screen for orientation, shape complementarity and electrostatics. We developed an interface-area-based decoy filter for the sampling stage, exploiting the assumption that native-like decoys must have the largest, or close to the largest, interface (Chapter 4) [8]. Implementing this assumption and benchmarking it on standard sets showed that in 91% of cases we could recover native-like decoys of bound and unbound binary docking targets of both strong and weak affinity. This allowed us to propose that “native-like decoys must have the largest, or close to the largest, interface” can be used as a rule to exclude non-native decoys efficiently during docking sampling. This rule can dramatically reduce the needle-in-a-haystack problem faced in a docking study by removing >95% of the decoy set available from the sampling search. We incorporated the rule as a central part of our protein docking strategy.
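The interface-area rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Decoy` record, the precomputed interface areas and the 10% `tolerance` are hypothetical and are not taken from the thesis, which does not state its cutoff here.

```python
from typing import List, NamedTuple


class Decoy(NamedTuple):
    """A docking decoy with a precomputed interface area (hypothetical record)."""
    name: str
    interface_area: float  # interface area in square angstroms


def filter_by_interface_area(decoys: List[Decoy], tolerance: float = 0.10) -> List[Decoy]:
    """Keep decoys whose interface area is the largest or close to the largest.

    `tolerance` is an illustrative cutoff (fraction below the maximum),
    not a value reported in the abstract.
    """
    if not decoys:
        return []
    largest = max(d.interface_area for d in decoys)
    threshold = (1.0 - tolerance) * largest
    return [d for d in decoys if d.interface_area >= threshold]


# Made-up areas, purely to exercise the filter.
decoys = [
    Decoy("d1", 1500.0),
    Decoy("d2", 820.0),
    Decoy("d3", 1420.0),
    Decoy("d4", 300.0),
]
kept = filter_by_interface_area(decoys, tolerance=0.10)
print([d.name for d in kept])
```

Only the decoys that survive such a filter would then be passed on to the more expensive energy-based scoring.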
While addressing the question of energy-based screening to place native-like decoys at high ranks during docking, we came across a large volume of published work. Most energy-based screening methods that avoid statistical potentials rely on some form of the Coulomb potential, the Lennard-Jones potential and solvation energy. Different flavors of these energy functions are used, with diverse preferences and weights for the individual terms. Interestingly, in all cases the energy functions were unnormalized: individual energy terms were simply added to arrive at a final score used for ranking. Because proteins are large molecules, they offer limited scope for applying semi-empirical or quantum mechanical methods to large-scale energy evaluation. We therefore developed a de novo empirical scoring function in normalized form. As already stated, we found NSc and NIP to be highly discriminatory in segregating biological from non-biological interfaces, and we incorporated them as parameters of our scoring function. Our data mining study revealed a reasonable correlation of -0.73 between the normalized solvation energy and the normalized nonbonding energy (Coulomb + van der Waals) at the interface. Using this information, we extended our scoring function by combining the geometric measures with the normalized interaction energies. Tests on 30 unbound binary protein-protein complexes showed that in 16 cases we could identify at least one decoy in the top three ranks with ≤10 Å backbone root-mean-square deviation (RMSD) from the true binding geometry. The scoring results were compared with those of other state-of-the-art methods, which returned inferior results. The salient feature of our scoring function was the exclusion of any experiment-guided restraints, evolutionary information, statistical propensities or modified interaction energy equations, which are commonly used by others.
Tests on 118 less difficult bound binary protein-protein complexes with ≤35% sequence redundancy at the interface gave first rank in 77% of cases, where the native-like decoy was chosen from among 10,000 decoys and had ≤5 Å backbone RMSD from the true geometry. The scoring function, the results and the comparison with other methods are discussed extensively in Chapter 5 [9]. The method has been implemented and made available for public use as a web server - PROBE (http://pallab.serc.iisc.ernet.in/probe). The development and use of PROBE are described in Chapter 7 [10].
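The backbone RMSD thresholds quoted above (≤10 Å and ≤5 Å) can be made concrete with a short Python sketch. It assumes the two structures are already superimposed and that backbone atoms are matched one-to-one; the function names and the toy coordinates are illustrative and do not come from the thesis or from PROBE.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def backbone_rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """RMSD (in angstroms) between matched atoms of two pre-superimposed structures."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared / len(model))


def is_native_like(model: Sequence[Point], reference: Sequence[Point],
                   cutoff: float = 10.0) -> bool:
    """Apply an RMSD cutoff such as the 10 or 5 angstrom values quoted in the text."""
    return backbone_rmsd(model, reference) <= cutoff


# Toy coordinates: the model is the reference shifted by (3, 4, 0).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0), (5.0, 4.0, 0.0)]
print(backbone_rmsd(model, reference))
```

In a real evaluation the decoy would first be superimposed on the native complex (for example on one subunit) before the deviation is measured; that step is omitted here.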
In the course of this work we generated large amounts of data that could be useful to others, especially “protein dockers”. We therefore developed dockYard (http://pallab.serc.iisc.ernet.in/dockYard) - a repository for protein-protein docking decoys (Chapter 6) [11]. dockYard offers four categories of docking decoys derived from: Bound (native dimer co-crystallized), Unbound (individual subunits as well as the target are crystallized), Variants (match the previous two categories in at least one subunit with 100% sequence identity), and Interlogs (match the previous categories in at least one subunit with ≥90% or ≥50% sequence identity). Decoys can be downloaded in full or selectively, based on search parameters. The portal also serves as a repository for modelers who may want to share their decoy sets with the community.
In conclusion, although we made several contributions in development of algorithms for improved protein-protein docking and quaternary structure inference, a lot of challenges remain (Chapter 8). The principal challenge arises by considering proteins as flexible bodies, whose conformational states may change on quaternary structure formation. In addition, solvent plays a major role in the free energy of binding, but its exact contribution is not straightforward to estimate. Undoubtedly, the cost of computation is one of the limiting factors apart from good energy functions to evaluate the docking decoys. Therefore, the next generation of algorithms must focus on improved docking studies that realistically incorporate flexibility and solvent environment in all their evaluations.
|
82 |
Analysis Of Protein Evolution And Its Implications In Remote Homology Detection And Function Recognition
Gowri, V S 10 1900
One of the major outcomes of a genome sequencing project is the availability of the amino acid sequences of all the proteins encoded in the genome of the organism concerned. However, for a substantial proportion of these proteins, no information on function is available, either from experimental studies or by inference on the basis of homology with a protein of known function. Even if the general function of a protein is known, the region of the protein corresponding to that function might be a single domain, and there may be additional regions of considerable length with no known function. In such cases the information on function is incomplete.
Lack of understanding of the repertoire of functions of the proteins encoded in a genome limits the utility of the genomic data. While many experimental approaches are available for deciphering protein functions at the genomic scale, bioinformatics approaches form a good early step in obtaining clues about them (Koonin et al, 1998). One of the common bioinformatics approaches is recognition of function by homology (Bork et al, 1994). If an evolutionary relationship can be established between two proteins, one of known function and the other of unknown function, it raises the possibility of a common function and 3-D structure for these proteins (Bork and Gibson, 1996). While this approach is effective, its utility is limited by its ability to identify related proteins when their evolutionary divergence is high, leading to amino acid sequence similarity as low as that typical of two unrelated proteins (Bork and Koonin, 1998). Use of 3-D structural information, obtained by predictive methods such as fold recognition, has offered ways of increasing the sensitivity of remote homology detection (e.g., Kelley et al, 2000; Shi et al, 2001; Gough et al, 2001).
The work embodied in this thesis has the general objective of analysis of evolution of structural features and functions of families of proteins and design of new bioinformatics approaches for recognizing distantly related proteins and their applications. After an introductory chapter, a few chapters report analysis of functional and structural features of homologous protein domains. Further chapters report development and assessment of new remote homology detection approaches and applications to the proteins encoded in two protozoan organisms. A further chapter is presented on the analysis of proteins involved in methylglyoxal detoxification pathways in kinetoplastid organisms.
Chapter 1 of the thesis presents a brief introduction, based on the information available in the literature, to protein structures, classification, methods for structure comparison, popular methods for remote homology detection and homology-based methods for function annotation.
Chapter 2 describes the steps involved in updating and improving the PALI database. In addition to the update, the domain structural families are integrated with homologous sequences from the sequence databases. Thus, every family in PALI is enriched with a substantial volume of sequence information from proteins with no known structural information.
Chapter 3 reports investigations on the inter-relationships between sequence, structure and functions of closely-related homologous enzyme domain families.
Chapter 4 describes the investigations on the unusual differences in the lengths of closely-related homologous protein domains, accommodation of additional lengths in protein 3-D structures and their functional implications.
Chapter 5 reports the development and assessment of a new approach for remote homology detection using dynamic multiple profiles of homologous protein domain families.
Chapter 6 describes the development of another remote homology detection approach, which uses multiple static profiles generated from the bona fide members of the family. A rigorous assessment of the approach and strategies for improving the detection of distant homologues using the multiple profile approach are discussed in this chapter.
Chapter 7 describes results of searches made in the database of multiple family profiles (MulPSSM database) in order to recognize the functions of hypothetical proteins encoded in two parasitic protozoa.
Chapter 8 describes the sequence and structural analyses of two glyoxalase pathway proteins from the kinetoplastid organism Leishmania donovani, which causes leishmaniasis. An alternative enzyme, which would probably substitute for the glyoxalase pathway enzymes in certain kinetoplastid organisms that lack them, is also discussed.
Chapter 9 summarises the important findings from the various analyses discussed in this thesis.
The appendix describes an analysis of the correlation between a measure of hydrophobicity of amino acid residues aligned in a multiple sequence alignment and residue depth in 3-D structures of proteins.
|
83 |
Proteomic analysis of liver membranes through an alternative shotgun methodology
Chick, Joel January 2009
Thesis (PhD)--Macquarie University, Division of Environmental & Life Sciences, Dept. of Chemistry & Biomolecular Sciences, 2009. / Bibliography: p. 200-212. / Introduction -- Shotgun proteomic analysis of rat liver membrane proteins -- A combination of immobilised pH gradients improve membrane proteomics -- Affects of tumor-induced inflammation on membrane proteins abundance in the mouse liver -- Affects of tumor-induced inflammation on biochemical pathways in the mouse liver -- General discussion -- References. / The aim of this thesis was to develop a proteomics methodology that improves the identification of membrane proteomes from mammalian liver. Shotgun proteomics is a method that allows the analysis of proteins from cells, tissues and organs and provides comprehensive characterisation of proteomes of interest. The method developed in this thesis uses separation of peptides from trypsin-digested membrane proteins by immobilised pH gradient isoelectric focusing (IPG-IEF) as the first dimension of two-dimensional shotgun proteomics. Peptide IPG-IEF was shown to be a highly reproducible, high-resolution analytical separation that yielded over 4,000 individual protein identifications from rat liver membrane samples. Furthermore, this shotgun proteomics strategy identified approximately 1,100 integral membrane proteins from the rat liver. Peptide IPG-IEF as a shotgun proteomics separation dimension, in conjunction with label-free quantification, was then applied to a biological question: does the presence of a spatially unrelated benign tumor affect the abundance of mouse liver proteins? IPG-IEF shotgun proteomics provided comprehensive coverage of the mouse liver membrane proteome, with 1,569 quantified proteins. In addition, the presence of an Engelbreth-Holm-Swarm sarcoma induced changes in the abundance of proteins in the mouse liver, including many integral membrane proteins.
Changes in the abundance of liver proteins were observed in key liver metabolic processes such as fatty acid metabolism, fatty acid transport, and xenobiotic metabolism and clearance. These results provide compelling evidence that the developed shotgun proteomics methodology allows comprehensive analysis of mammalian liver membrane proteins, and they detail some of the underlying changes in liver metabolism induced by the presence of a tumor. This model may reflect changes that could occur in the livers of cancer patients and has implications for drug treatments. / Mode of access: World Wide Web. / 609 p. ill. (some col.)
|
84 |
Off-lattice protein structure prediction using adaptive multiobjective differential evolution [Predição da estrutura de proteínas off-lattice usando evolução diferencial multiobjetivo adaptativa]
Venske, Sandra Mara Guse Scós 28 March 2014
Fundação Araucária / Protein Structure Prediction (PSP) can be considered one of the most challenging problems in Bioinformatics today. When a protein is in its native conformation, its free energy tends towards a minimum. In general, computational prediction of a protein's conformation is based on estimating two free energy values, which arise from the intramolecular and intermolecular interactions among atoms. Some recent studies indicate that these interactions are in conflict, which justifies the use of multiobjective optimization approaches to solve PSP. In this case the two energies are optimized separately, unlike the mono-objective formulation, which considers the sum of the energies. Differential Evolution (DE) is a technique based on Evolutionary Computation and represents an interesting alternative for addressing PSP. This work develops an optimizer based on the DE algorithm for the multiobjective PSP problem. Because of the large number of parameters typical of evolutionary algorithms, it also investigates adaptive parameter strategies for DE. Initially, a simple DE-based approach proposed for solving PSP is evaluated. An evolution of this method, which incorporates concepts from the MOEA/D algorithm and parameter adaptation techniques, is then tested on a set of benchmark problems for multiobjective optimization. The preliminary results obtained for PSP (for six real proteins) are promising, and those obtained for the benchmark set place the proposed approach as a candidate for the state of the art in multiobjective optimization.
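For readers unfamiliar with Differential Evolution, the following Python sketch shows the classic single-objective DE/rand/1/bin scheme on a toy function. It is not the adaptive multiobjective method of the thesis: the `sphere` objective merely stands in for an energy function, and the fixed parameters `f`, `cr`, `pop_size` and `generations` are illustrative assumptions (the thesis adapts its parameters and builds on MOEA/D).

```python
import random
from typing import Callable, List, Tuple


def sphere(x: List[float]) -> float:
    """Toy objective standing in for an energy function (not a protein model)."""
    return sum(v * v for v in x)


def differential_evolution(
    objective: Callable[[List[float]], float],
    dim: int,
    bounds: Tuple[float, float] = (-5.0, 5.0),
    pop_size: int = 20,
    f: float = 0.5,    # differential weight, fixed here rather than adapted
    cr: float = 0.9,   # crossover rate, fixed here rather than adapted
    generations: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    """Classic DE/rand/1/bin; returns best vector, best value, best-value history."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    history = [min(fitness)]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct donors, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_fit = objective(trial)
            # Greedy selection: the trial replaces the target only if not worse.
            if trial_fit <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
        history.append(min(fitness))
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best], history


best_x, best_f, history = differential_evolution(sphere, dim=3)
print(best_f)
```

A multiobjective variant would keep the two energy terms as separate objectives and compare candidates by dominance or by decomposition into scalar subproblems, instead of the single greedy comparison used here.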
|
86 |
Bases moleculares do efeito do pH na atividade catalítica de duas lisozimas digestivas de Musca domestica (Diptera) / Molecular basis of the pH effect on the catalytic activity of two digestive lysozymes from Musca domestica (Diptera)
Fabiane Chaves Cançado 16 December 2008
Lisozimas são enzimas que fazem parte do mecanismo de defesa contra bactérias, no entanto lisozimas com função digestiva também são encontradas no trato digestivo de vertebrados e no intestino médio de insetos. As lisozimas digestivas de insetos são do tipo c e assim compartilham semelhanças estruturais e mecanísticas com a lisozima da clara de ovo de galinha (HEWL). Entretanto, para desempenhar sua função digestiva, as lisozimas de insetos apresentam algumas propriedades particulares entre as quais se destaca um pH ótimo mais ácido em relação às lisozimas não-digestivas. Para elucidar as bases moleculares dessa diferença no pH ótimo, duas lisozimas digestivas (lisozima 1 AAQ20048 e lisozima 2 AAQ20047) da larva de Musca domestica (mosca Diptera Cyclorrhapha), clonadas em Pichia pastoris e purificadas, foram caracterizadas estruturalmente e cineticamente com o substrato sintético (MUQ3) e natural (cápsulas de Micrococcus lysodeikticus). Foi observado que o efeito do pH na atividade das lisozimas 1 e 2 sobre o MUQ3 é uma curva com formato de sino e pH ótimo mais ácido que o da HEWL. Essas curvas foram reflexos da diminuição simultânea dos valores de pKas do nucleófilo e do doador de prótons. Estruturas cristalográficas das lisozimas digestivas de Musca domestica foram obtidas a 1,9 Å e análise comparativa com a estrutura terciária da HEWL revelou resíduos de aminoácidos no ambiente do nucleófilo (N46) e do doador de prótons (S106 e T107) que podem estar envolvidos na modulação das constantes de ionização dos resíduos essenciais à catálise. Esses resíduos foram substituídos via mutagênese sítio-dirigida por D, V e A respectivamente e três mutantes simples (N46D, S106V e T107A) e um triplo (N46DS106V- T107A) foram produzidos e purificados. Caracterização revelou que as contribuições individuais da N46, S106 e T107 foram pequenas e próximas do limite de detecção da técnica utilizada. 
Por outro lado, o conjunto dos 3 aminoácidos foi responsável pelo pH ótimo ácido frente ao substrato sintético, elevando os valores de pKas do nucleófilo e doador de prótons para valores muito semelhantes ao da HEWL. Diferentemente, essa tripla mutação não foi suficiente para elevar o pH ótimo da lisozima 2 sobre cápsulas de Micrococcus lysodeikticus para valores próximos àqueles de HEWL, sugerindo que as bases moleculares do pH ótimo frente ao substrato natural e sintético são diferentes. Uma comparação estrutural entre lisozima 1 e HEWL sugere que os resíduos de aminoácidos carregados na superfície dessas lisozimas sejam importantes para determinação do pH ótimo. A investigação dessa hipótese foi feita substituindo 5 aminoácidos neutros e 1 ácido, via mutagênese sítio-dirigida, por resíduos básicos. A caracterização do mutante sêxtuplo revelou um aumento significativo nos valores de pH ótimo da lisozima 1, indicando que a redução da basicidade da superfície das lisozimas digestivas é determinante para seus pHs ótimos ácidos. / Lysozymes are enzymes that are part of the defence mechanism against bacteria, however lysozymes with digestive function are also found in the digestive tract of vertebrates and in the insect midgut. The digestive lysozymes from insects are c type, so they share similar structural and mechanistic characteristics with hen egg-white lysozyme (HEWL). However, to perform their digestive function, insect lysozymes present some particular properties among them a more acidic pH optimum than that of non-digestive lysozymes. To elucidate the molecular basis of this pH optimum difference, two digestive lysozymes (lysozyme 1 AAQ20048 and lysozyme 2 AAQ20047) from Musca domestica larvae (housefly Diptera Cyclorrhapha), cloned in Pichia pastoris and purified, were structurally and kinecticly characterized with synthetic (MUQ3) and natural (lyophilized cells of Micrococcus lysodeikticus) substrates. 
It was observed that the pH effect on the activity of lysozymes 1 and 2 upon MUQ3 is a bell shaped curve exhibiting a more acidic pH optimum than that of HEWL. These curves result from simultaneous decrease of pKas values of the nucleophile and proton donor. Crystallographic structures of these digestive lysozymes from Musca domestica were obtained at 1.9 Å and comparative analysis with the terciary structure of HEWL revealed amino acid residues in the catalytic nucleophile (N46) and proton donor environment (S106 and T107) that may be involved in the modulation of ionization constants of those catalytic residues. N46, S106, and T107 were replaced via site-directed mutagenesis by D, V and A respectively and three simple (N46D, S106V and T107A) and one triple (N46D-S106V-T107A) mutants were produced and purified. Their characterization revealed that the individual contributions of N46, S106 and T107 were small and close to the detection borderline of the technique utilized. On the other hand, a set of these 3 amino acids was responsible by acidic pH optimum upon synthetic substrate, increasing the pKas values of nucleophile and proton donor to similar values to that of the HEWL. Differently, this triple mutation was not enough to increase the pH optimum of lysozyme 2 upon lyophilized cells of Micrococcus lysodeikticus to values close to those of HEWL, suggesting that the molecular bases of pH optimum upon natural and synthetic substrates are different. A structural comparison between lysozyme 1 and HEWL suggests that the charged amino acid residues on the surface of these lysozymes are important for pH optimum determination. The investigation of this hypothesis was done replacing 5 neutral and 1 acidic amino acids, via site-directed mutagenesis, by basic residues. 
The characterization of this mutant revealed a significant increase in the pH optimum values of lysozyme 1, suggesting that the reduced basicity of the surface of the digestive lysozymes is an important factor in determining their acidic pH optimum.
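The link between the two catalytic pKa values and a bell-shaped pH profile can be illustrated with a minimal sketch. It assumes the textbook two-ionization model, in which the enzyme is active only when the nucleophile is deprotonated and the proton donor is still protonated; this model and the pKa values used below are illustrative assumptions, not the fitting procedure or the measurements of the thesis.

```python
def active_fraction(ph, pka_nucleophile, pka_donor):
    """Fraction of enzyme in the active ionization state at a given pH.

    Assumed model: two independent ionizations, active form = nucleophile
    deprotonated and proton donor protonated.
    """
    h = 10.0 ** (-ph)                    # proton concentration
    ka_nuc = 10.0 ** (-pka_nucleophile)  # dissociation constant of the nucleophile
    ka_don = 10.0 ** (-pka_donor)        # dissociation constant of the proton donor
    return 1.0 / (1.0 + h / ka_nuc + ka_don / h)


def ph_optimum(pka_nucleophile, pka_donor):
    """pH at which the bell-shaped curve peaks under the assumed model."""
    return 0.5 * (pka_nucleophile + pka_donor)


if __name__ == "__main__":
    # Hypothetical pKa pairs, chosen only to show the direction of the shift.
    for label, pkas in (("higher pKa pair", (3.5, 6.5)), ("lower pKa pair", (2.5, 5.5))):
        print(label, "-> pH optimum", ph_optimum(*pkas))
```

Under this model the optimum sits midway between the two pKa values, so lowering both at once moves the whole curve toward acidic pH, which is the behaviour the abstract describes for the digestive lysozymes.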
|
87 |
Estudo estrutural e funcional das proteínas PilZ e YaeQ do fitopatógeno Xanthomonas axonopodis pv citri / Structural and functional studies of PilZ and YaeQ from Xanthomonas axonopodis pv citri proteins
Cristiane Rodrigues Guzzo 25 February 2010
O trabalho aqui desenvolvido teve como objeto o estudo estrutural e funcional de várias proteínas do fitopatógeno Xanthomonas axonopodis pv citri (Xac), dentre as quais se destacam as proteínas hipotéticas conservadas YaeQ e SufE, as proteínas RpfC, RpfF e RpfG envolvidas em quorum sensing e proteínas PilZ, FimX e PilB envolvidas na biogênese do pilus tipo IV. Para o desenvolvimento deste trabalho foram utilizadas diferentes técnicas incluindo: clonagem, expressão, purificação, desnaturação térmica, cristalografia, difração de raios-X, RMN, ensaios de 2-híbrido, produção de nocautes, mutação sítio dirigida, Western- e Far- Western, entre outras. Dentre os resultados mais importantes obtidos temos a determinação estrutural das proteínas YaeQ e PilZ pela técnica MAD. Em ambos os casos, as estruturas representaram topologias inéditas. Com base nos dados estruturais, mostramos que YaeQ pertence à família PD-(D/E)XK presente em endonucleases dependentes de magnésio, e a partir de ensaios funcionais obtivemos evidências que sugerem que YaeQ está envolvida em alguma via de reparo de DNA em Xac. A estrutura tridimensional de PilZ revelou uma inesperada variedade estrutural dentro da família PilZ e mostrou de forma clara porque ortólogos não interagem com o segundo mensageiro bacteriano, c-diGMP. A cadeia principal de PilZ foi assinalada por RMN e a estrutura secundária de PilZ em solução é consistente com aquela determinada por cristalografia. Duas proteínas que interagem com PilZ foram identificadas: PilB e FimX. Como PilZ, ambos exercem papéis na biogênese do pilus tipo IV (T4P). Mostramos que PilZ interage especificamente com o domínio EAL de FimX e que resíduos conservados na região do C-terminal de PilZ estão envolvidos na interação com PilB, mas não com FimX. Ensaios de mutação sítio dirigida mostraram que a Y22 de PilZ pode estar envolvida na regulação da interação de PilZ com FimX e com PilB. Apesar de PilZ não interagir com c-diGMP seu parceiro, FimX, interage. 
PilZ consegue interagir com PilB ao mesmo tempo em que interage com FimX, formando um complexo ternário que é independente da interação de FimX com c-diGMP. Com base em todos estes resultados propusemos possíveis mecanismos de ação de PilZ e FimX no controle da biogênese do T4P. Além dos resultados acima descritos, determinamos a estrutura de SufE e mostramos que esta aumenta a atividade cisteína dessulfarase de seu parceiro, SufS, em torno de 10 vezes, como ocorre com SufE-SufS de E.coli. Clonamos, expressamos, purificamos e fizemos ensaios de cristalização de algumas proteínas envolvidas no controle de quorum sensing em Xac. Tivemos êxito na cristalização do domínio HPT (histidina fosfotransferase) da proteína chave deste sistema, RpfC / The aim of the project was to perform structural and functional studies of several Xanthomonas axonopodis pv citri (Xac) proteins, including the hypothetical proteins YaeQ and SufE; RpfC, RpfF and RpfG, involved in quorum sensing; and PilZ, FimX and PilB, which play roles in type IV pilus (T4P) biogenesis. Several experimental techniques were employed, including cloning, expression and purification of recombinant proteins, thermal denaturation, protein crystallography, X-ray diffraction, NMR, two-hybrid assays, Western and Far-Western blotting assays, site-directed mutagenesis, and the production of Xac knockout strains. The most important results include the determination of the three-dimensional crystal structures of PilZ and YaeQ using the MAD technique. In both cases, the structures revealed new protein topologies. The comparison of the YaeQ structure with others deposited in public databases revealed that YaeQ proteins represent a new variation within the PD-(D/E)XK superfamily of magnesium-dependent endonucleases. Functional assays suggest that YaeQ may be involved in DNA repair in Xac.
The PilZ three-dimensional structure revealed an unexpected structural variation within the PilZ domain superfamily and showed why PilZ orthologs are not able to bind the important bacterial second messenger, c-diGMP. We assigned the PilZ main chain by NMR and used this information to demonstrate that the PilZ secondary structure in solution is consistent with the PilZ crystal structure. We identified two proteins that interact with PilZ: PilB and FimX. Like PilZ, both PilB and FimX are involved in T4P biogenesis. PilZ binds specifically to the EAL domain of FimX, and the conserved residues located in the unstructured C-terminal region of PilZ contribute to binding with PilB but not with FimX. Site-directed mutagenesis studies showed that PilZ residue Y22 is necessary for its ability to interact with both PilB and FimX. Although PilZ does not bind c-diGMP, its partner, FimX, does. We present evidence that PilZ can bind simultaneously to FimX and PilB, forming a ternary complex that is independent of c-diGMP. These results allow us to propose possible mechanisms by which PilZ and FimX control T4P biogenesis. Other results obtained during this period include the determination of the crystal structure of the SufE protein from Xac using the molecular replacement technique. We show that SufE induces a 10-fold increase in the cysteine desulfurase activity of SufS, similar to that observed for the SufE-SufS complex from E. coli. Several proteins involved in quorum sensing and c-diGMP signaling were cloned, expressed and submitted to crystallization trials. Crystals of the HPT (histidine phosphotransferase) domain of the RpfC sensor histidine kinase were obtained.
|
88 |
Etude bioinformatique de la stabilité thermique des protéines: conception de potentiels statistiques dépendant de la température et développement d'approches prédictives / Bioinformatic study of protein thermal stability: development of temperature dependent statistical potentials and design of predictive approaches
Folch, Benjamin 16 June 2010
This doctoral thesis is part of the in silico study of the relationships that link a protein's sequence to its structure, stability and function. Its long-term objective is to enable the rational design of modified proteins that remain active under non-physiological physicochemical conditions. We focused in particular on the thermal stability of proteins, which is defined by their melting temperature Tm, above which their structure is no longer thermodynamically stable. Our work is organized in three main parts: the search for factors that favor protein thermostability within families of homologous proteins; the construction of a database of proteins whose structure and Tm have been determined experimentally, from which temperature-dependent statistical potentials are derived; and finally the development of two bioinformatics tools that aim to predict, on the one hand, the Tm of a protein from the Tm of homologous proteins and, on the other hand, the changes in a protein's thermostability (i.e. in its Tm) caused by the introduction of a point mutation.

The first part aims to identify the sequence and structure factors (e.g. the frequency of salt bridges or of cation-π interactions) responsible for the different thermal stabilities of homologous proteins within eight families (Chapter 2). The specificity of each family did not allow us to generalize the impact of these factors on protein thermal stability. However, this approach revealed the multitude of different strategies that proteins follow to reach higher thermostability.

The second part concerns the development of an original approach to evaluate the influence of temperature on the contribution of different types of interactions to the folding free energy of proteins (Chapters 3 and 4).
This approach relies on deriving statistical potentials from sets of proteins with distinct average thermostability. On the one hand, we collected as many proteins as possible whose structure and Tm have been determined experimentally; on the other hand, we developed potentials that take into account the adaptation of proteins to extreme temperatures during their evolution. This original method highlighted the temperature dependence of protein interactions such as salt bridges, cation-π interactions and certain hydrophobic packings. It also allowed us to pinpoint the importance of considering the temperature dependence not only of attractive interactions but also of repulsive ones, as well as the importance of describing thermal resistance by Tm rather than by Tenv, the environmental temperature of the organism from which the protein originates (Chapter 5).

The last part of this thesis concerns the use of energy profiles for predictive purposes. First, we developed bioinformatics software to predict the thermostability of a protein on the basis of the thermostability of homologous proteins. This tool proved promising when tested on eight families of homologous proteins. We also developed a second bioinformatics tool to predict the changes in a protein's thermostability caused by the introduction of a point mutation, inspired by software for predicting changes in the thermodynamic stability of proteins that was developed within our research team. This second prediction algorithm relies on the development of a large database of experimentally characterized mutants, a linear combination of potentials to evaluate the Tm, and a neural network to identify the coefficients of the combination.
The predictions generated by our software were compared with those obtained via the correlation that exists between thermal and thermodynamic stability, and proved more reliable.

The work described in this thesis, and in particular the development of temperature-dependent statistical potentials, constitutes a new and very promising approach to understanding and predicting protein thermostability. Furthermore, our research has produced a methodology that can be adapted to the study and prediction of other physicochemical properties of proteins, such as their solubility or their stability with respect to acidity, pressure or salinity, once sufficient experimental data become available. / Doctorate in Agronomic Sciences and Biological Engineering / info:eu-repo/semantics/nonPublished
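The idea of deriving statistical potentials from protein sets of different average thermostability can be sketched with the generic inverse-Boltzmann relation, in which an interaction seen more often than expected receives a favorable (negative) energy. This is a minimal illustration under stated assumptions: the function name, the contact categories, the counts and the temperatures are all hypothetical, and the abstract does not give the actual functional form used in the thesis.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def inverse_boltzmann(observed, expected, temperature_k):
    """Energy per interaction type: -RT * ln(observed / expected).

    `observed` and `expected` map an interaction type to its count in a
    protein set and to the count expected from a reference state.
    """
    return {
        kind: -GAS_CONSTANT * temperature_k * math.log(n_obs / expected[kind])
        for kind, n_obs in observed.items()
    }


if __name__ == "__main__":
    # Hypothetical counts for two sets of proteins with different mean Tm.
    expected = {"salt_bridge": 100.0, "cation_pi": 40.0}
    mesostable = inverse_boltzmann({"salt_bridge": 110.0, "cation_pi": 44.0}, expected, 320.0)
    thermostable = inverse_boltzmann({"salt_bridge": 150.0, "cation_pi": 42.0}, expected, 360.0)
    for kind in expected:
        print(kind, round(mesostable[kind], 3), round(thermostable[kind], 3))
```

Comparing the energies obtained from the two sets, interaction type by interaction type, is one simple way to expose a temperature dependence of the kind the abstract reports for salt bridges and cation-π interactions.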
|
89 |
Role of the amino acid sequences in domain swapping of the B1 domain of protein G by computation analysis
Sirota Leite, Fernanda 12 October 2007
Domain swapping is a widespread phenomenon which involves the association between two or more protein subunits such that intra-molecular interactions between domains in each subunit are replaced by equivalent inter-molecular interactions between the same domains in different subunits. This thesis is devoted to the analysis of the factors that drive proteins to undergo such association modes. The specific system analyzed is the monomer to swapped dimer formation of the B1 domain of the immunoglobulin G binding protein (GB1). The formation of this dimer was shown to be fostered by 4 amino acid substitutions (L5V, F30V, Y33F, A34F) (Byeon et al. 2003). In this work, computational protein design and molecular dynamics simulations, both with detailed atomic models, were used to gain insight into how these 4 mutations may promote the domain swapping reaction.

The stability of the wild type (wt) and quadruple mutant GB1 monomers was assessed using the software DESIGNER, a fully automatic procedure that selects amino acid sequences likely to stabilize a given backbone structure (Wernisch et al. 2000). Results suggest that 3 of the mutations (L5V, F30V, A34F) have a destabilizing effect. The first mutation (L5V) forms destabilizing interactions with surrounding residues, while the second (F30V) is engaged in unfavorable interactions with the protein backbone, consequently causing local strain. Although the A34F substitution itself is found to contribute favorably to the stability of the monomer, this is achieved only at the expense of forcing the wild type W43 into a highly strained conformation concomitant with the formation of unfavorable interactions with both W43 and V54.

Finally, we also provide evidence that the A34F mutation stabilizes the swapped dimer structure.
Although we were unable to perform detailed protein design calculations on the dimer, due to the lower accuracy of the model, inspection of its 3D structure reveals that the 34F side chains pack against one another in the core of the swapped structure, thereby forming extensive non-native interactions that have no counterparts in the individual monomers. Their replacement by the much smaller Ala residue is suggested to be significantly destabilizing because it would create a large internal cavity, a phenomenon well known to be destabilizing in other proteins. Our analysis hence proposes that the A34F mutation plays a dual role, that of destabilizing the GB1 monomer structure while stabilizing the swapped dimer conformation.

In addition to the above study, molecular dynamics simulations of the wild type and modeled quadruple mutant GB1 structures were carried out at room temperature and at an elevated temperature (450 K) in order to sample the conformational landscape of the protein near its native monomeric state, and to characterize the deformations that occur during early unfolding. This part of the study was aimed at investigating the influence of the amino acid sequence on the conformational properties of the GB1 monomer and the possible link between these properties and the swapping process. Analysis of the room temperature simulations indicates that the mutant GB1 monomer fluctuates more than its wild type counterpart. In addition, we find that the C-terminal beta-hairpin is pushed away from the remainder of the structure, in agreement with the fact that this hairpin is the structural element that is exchanged upon domain swapping. The simulations at 450 K reveal that the mutant protein unfolds more readily than the wt, in agreement with its decreased stability. Also, among the regions that unfold early is the alpha-helix C-terminus, where 2 out of the 4 mutations reside.
NMR experiments by our collaborators have shown this region to display increased flexibility in the monomeric state of the quadruple mutant.

Our atomic scale investigation has thus provided insights into how sequence modifications can foster domain swapping of GB1. Our findings indicate that the role of the amino acid substitutions is to decrease the stability of individual monomers while at the same time increasing the stability of the swapped dimer, through the formation of non-native interactions. Both roles cooperate to foster swapping. / Doctorate in Sciences, specialization in molecular biology / info:eu-repo/semantics/nonPublished
|
90 |
First principles and black box modelling of biological systems
Grosfils, Aline 13 September 2007
Living cells and their components play a key role within the biotechnology industry. Cell cultures and their products of interest are used for the design of vaccines as well as in the agri-food field. In order to ensure optimal working of such bioprocesses, the understanding of the complex mechanisms which rule them is fundamental. Mathematical models may be helpful to grasp the biological phenomena which intervene in a bioprocess. Moreover, they allow prediction of system behaviour and are frequently used within engineering tools to ensure, for instance, product quality and reproducibility.

Mathematical models of cell cultures may come in various shapes and be phrased with varying degrees of mathematical formalism. Typically, three main model classes are available to describe the nonlinear dynamic behaviour of such biological systems. They consist of macroscopic models which only describe the main phenomena appearing in a culture. Indeed, a high model complexity may lead to long numerical computation times incompatible with engineering tools like software sensors or controllers. The first model class is composed of the first principles or white box models. They consist of the system of mass balances for the main species (biomass, substrates, and products of interest) involved in a reaction scheme, i.e. a set of irreversible reactions which represent the main biological phenomena occurring in the considered culture. Whereas transport phenomena inside and outside the cell culture are often well known, the reaction scheme and associated kinetics are usually a priori unknown, and require special care for their modelling and identification. The second kind of commonly used models belongs to black box modelling. Black boxes consider the system to be modelled in terms of its input and output characteristics. They consist of combinations of mathematical functions which do not allow any physical interpretation.
They are usually used when no a priori information about the system is available. Finally, hybrid or grey box modelling combines the principles of white and black box models. Typically, a hybrid model uses the available prior knowledge while the reaction scheme and/or the kinetics are replaced by a black box, an Artificial Neural Network for instance.

Among these numerous models, which one has to be used to obtain the best possible representation of a bioprocess? We attempt to answer this question in the first part of this work. On the basis of two simulated bioprocesses and a real experimental one, two kinds of model are analysed. First principles models whose reaction scheme and kinetics can be determined thanks to systematic procedures are compared with hybrid model structures where neural networks are used to describe the kinetics or the whole reaction term (i.e. kinetics and reaction scheme). The most common artificial neural networks, the MultiLayer Perceptron and the Radial Basis Function network, are tested. Pure black box modelling is, however, not considered in this work. Indeed, numerous papers already compare different neural networks with hybrid models. The results of these previous studies converge to the same conclusion: hybrid models, which combine the available prior knowledge with the nonlinear mapping capabilities of neural networks, provide better results.

From this model comparison, and from the fact that a physical kinetic model structure may be viewed as a combination of basis functions, as a neural network is, it follows that kinetic model structures allowing biological interpretation should be preferred. This is why the second part of this work is dedicated to the improvement of the general kinetic model structure used in the previous study.
Indeed, in spite of its good performance (largely due to the associated systematic identification procedure), this kinetic model, which represents activation and/or inhibition effects by every culture component, suffers from a limitation: it does not explicitly address saturation by a culture component. The structure models this kind of behaviour by an inhibition which compensates a strong activation. Note that the generalization of this kinetic model is a challenging task, as physical interpretation has to be improved while a systematic identification procedure has to be maintained.

The last part of this work is devoted to another kind of biological system: proteins. Such macromolecules, which are essential parts of all living organisms and consist of combinations of only 20 different building blocks called amino acids, are currently used in industry. To make them function in non-physiological conditions, manufacturers are willing to modify their amino acid sequences. However, substituting one amino acid for another changes the thermodynamic stability, which may lead to the loss of the protein's biological function. Among several theoretical methods predicting stability changes caused by mutations, the PoPMuSiC (Prediction Of Proteins Mutations Stability Changes) program has been developed within the Genomic and Structural Bioinformatics Group of the Université Libre de Bruxelles. This software predicts, in silico, the changes in thermodynamic stability of a given protein under all possible single-site mutations, either in the whole sequence or in a region specified by the user. However, PoPMuSiC suffers from limitations and should be improved using recently developed techniques of protein stability evaluation, such as the statistical mean force potentials of Dehouck et al. (2006). Our work proposes to enhance the performance of PoPMuSiC by combining the new energy functions of Dehouck et al.
(2006) with well-known artificial neural networks, the MultiLayer Perceptron or the Radial Basis Function network. This time, we attempt to obtain physically interpretable models through an appropriate use of the neural networks. / Doctorate in Applied Sciences / info:eu-repo/semantics/nonPublished
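The first principles (white box) model class described in this abstract, a system of mass balances driven by a reaction scheme and its kinetics, can be illustrated with a minimal sketch. It assumes a single reaction (substrate converted to biomass) in a batch culture with Monod kinetics, integrated by explicit Euler steps; the kinetic law, the parameter values and the function names are illustrative assumptions and not the kinetic structure studied in the thesis.

```python
def monod_rate(s, mu_max, k_s):
    """Specific growth rate (1/h) for substrate concentration s (assumed Monod law)."""
    return mu_max * s / (k_s + s)


def simulate_batch(x0, s0, mu_max, k_s, yield_xs, dt, t_end):
    """Integrate the batch mass balances with explicit Euler steps.

    dX/dt = mu(S) * X             (biomass)
    dS/dt = -mu(S) * X / Y_xs     (substrate)
    Returns the final biomass and substrate concentrations.
    """
    x, s = x0, s0
    for _ in range(int(round(t_end / dt))):
        growth = monod_rate(s, mu_max, k_s) * x  # biomass formation rate
        x += dt * growth
        s -= dt * growth / yield_xs
    return x, s


if __name__ == "__main__":
    # Hypothetical parameters: g/L for concentrations, hours for time.
    x_end, s_end = simulate_batch(x0=0.1, s0=10.0, mu_max=0.5, k_s=0.2,
                                  yield_xs=0.5, dt=0.01, t_end=30.0)
    print("final biomass:", x_end, "final substrate:", s_end)
```

In a hybrid (grey box) variant of this sketch, the mass balance structure would stay the same while `monod_rate` is replaced by a fitted black box such as a small neural network.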
|