Global ETD Search

1	Molecular genetic analysis of the penicillin-binding protein 1B gene of Escherichia coli Edelman, A. January 1987 (has links) No description available. 572 Membrane protein structures
2	Possible dynamic roles for the electrostatic force in biological membrane systems Berry, Richard M. January 1992 (has links) No description available. 572 Protein structures
3	Storing Protein Structure in Spatial Database Yeung, Tony 12 May 2005 (has links) In recent years, the field of bioinformatics has grown at an unprecedented scale. The amount of data generated by different genome projects demands new and efficient methods of information storage and retrieval. The analysis and management of protein structure information has become one of the main focuses. It is well known that a protein's functions depend on its structure's arrangement in 3-dimensional space. Because protein structures are exceedingly large, complex, and multi-dimensional, there is a need for a data model that can store protein structures according to their spatial arrangement and topological relationships and, at the same time, provide tools to analyze the stored information. With the emergence of spatial databases, first used in the field of Geographical Information Systems, the data model for protein structure can be based on the geographic model, as the two share several strikingly similar traits. The geometry of proteins can be modeled using the spatial types provided in a spatial database. In a similar way, the geometry queries used for geographical analysis can also be used to provide information for analysis of protein structure. This thesis explores the mechanics of extracting structural information for a protein from a flat file (PDB), storing that information in a spatial data model, and analyzing it using the geometric operators provided by the spatial database. The database used is Oracle 9i. Most features are provided by the Oracle Spatial package. Queries using these ideas are demonstrated. protein structures spatial database Computer Sciences
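The extraction step described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the fixed-column layout of PDB `ATOM`/`HETATM` records and emits generic well-known-text (WKT) 3D points rather than Oracle Spatial objects. The names `Atom`, `parse_atoms`, and `to_wkt_point` are hypothetical.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Read ATOM/HETATM records using the fixed PDB column positions."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append(Atom(
                serial=int(line[6:11]),       # columns 7-11
                name=line[12:16].strip(),     # columns 13-16
                res_name=line[17:20].strip(), # columns 18-20
                chain=line[21],               # column 22
                res_seq=int(line[22:26]),     # columns 23-26
                x=float(line[30:38]),         # columns 31-38
                y=float(line[38:46]),         # columns 39-46
                z=float(line[46:54]),         # columns 47-54
            ))
    return atoms


def to_wkt_point(atom: Atom) -> str:
    """Format an atom position as a 3D WKT point (illustrative target format)."""
    return f"POINT Z ({atom.x} {atom.y} {atom.z})"
```

Each resulting point string could then be loaded into whatever spatial column type the chosen database provides; that mapping is database-specific and is left out here.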
4	Protein Function Prediction Based on Sequence and Structure Information Smaili, Fatima Z. 25 May 2016 (has links) The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation, which is essential to our understanding of how biological systems and processes operate. In this master's thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence. protein functions protein structures prediction similarity search
5	3D proteomics : analysis of proteins and protein complexes by chemical cross-linking and mass spectrometry Chen, Zhuo A. January 2011 (has links) The concept of 3D proteomics is a technique that couples chemical cross-linking with mass spectrometry and has emerged as a tool to study protein conformations and protein-protein interactions. In this thesis I present my work on improving the analytical workflow and developing applications for 3D proteomics in the structural analysis of proteins and protein complexes through four major tasks. I. As part of the technical development of an analytical workflow for 3D proteomics, a cross-linked peptide library was created by cross-linking a mixture of synthetic peptides. Analysis of this library generated a large dataset of cross-linked peptides. Characterizing the general features of cross-linked peptides using this dataset allowed me to optimize the settings for mass spectrometric analysis and to establish a charge based enrichment strategy for cross-linked peptides. In addition to this, 1185 manually validated high resolution fragmentation spectra gave an insight into general fragmentation behaviours of cross-linked peptides and facilitated the development of a cross-linked peptide search algorithm. II. The advanced 3D proteomics workflow was applied to study the architecture of the 670 kDa 15-subunit Pol II-TFIIF complex. This work established 3D proteomics as a structure analysis tool for large multi-protein complexes. The methodology was validated by comparing 3D proteomics analysis results and the X-ray crystallographic data on the 12-subunit Pol II core complex. Cross-links observed from the Pol II–TFIIF complex revealed interactions between the Pol II and TFIIF at the peptide level, which also reflected the dynamic nature of Pol II-TFIIF structure and implied possible Pol II conformational changes induced by TFIIF binding. III. 
Conformational changes of flexible protein molecules are often associated with specific functions of proteins or protein complexes. To quantitatively measure the differences between protein conformations, I developed a quantitative 3D proteomics strategy which combines isotope labelling and cross-linking with mass spectrometry and database searching. I applied this approach to detect, in solution, the conformational differences between complement component C3 and its active form C3b. The quantitative cross-link data confirmed the previous observation made by X-ray crystallography. Moreover, this analysis detected the spontaneous hydrolysis of C3 in both C3 and C3b samples. The architecture of hydrolyzed C3, C3(H2O), was proposed based on the quantified cross-links and the crystal structures of C3 and C3b, which revealed that C3(H2O) adopted the functional domain arrangement of C3b. This work demonstrated that quantitative 3D proteomics is a valuable tool for conformational analysis of proteins and protein complexes. IV. Encouraged by the achievements in the above applications with relatively large amounts of highly purified material, I explored the application of 3D proteomics on affinity purified tagged endogenous protein complexes. An on-beads process that directly connected cross-linking and an affinity purification step provided increased sensitivity through minimized sample handling. A charge-based enrichment step was carried out to improve the detection of cross-linked peptides. The occurrence of cross-links between complexes was monitored by a SILAC based control. Cross-links observed from low microgram amounts of single-step purified endogenous protein complexes provided insights into the structural organization of the S. cerevisiae Mad1-Mad2 complex and revealed a conserved coiled-coil interruption in the S. cerevisiae Ndc80 complex. 
With this endeavour I have demonstrated that 3D proteomics has become a valuable tool for studying structure of proteins and protein complexes. 572
6	Non-repetitive Structures In Proteins : Effects Of Side-chain And Solvent Interactions With The Backbone Narayanan, Eswar 04 1900 (has links) The work presented in this thesis deals with the analysis of protein crystal structures with an emphasis on the stereochemical aspects of the folded conformation of proteins. The various analyses described have been performed on a data-set of 250 high resolution and non-homologous protein structures derived from the Protein Data Bank. The overall objective of the work has been to analyse conformational features of the non-secondary structural regions in proteins and identify structural motifs present therein. The results can be useful in the three-dimensional modelling of proteins, altering the stability of proteins, design of peptide mimics and in understanding the structural rules that guide protein folding. The contents of this thesis can be broadly classified into three parts: (a) conformational preferences of amino acid residues to occur in the partially allowed regions of the Ramachandran map, (b) conformational features of structural motifs formed by side-chain/main-chain hydrogen bonds by polar residues and (c) analysis and characteristic features of isolated β-strands. Chapter 1 of the thesis gives an introduction, briefly discussing the conformation of polypeptide chains, structural features of globular proteins and applications of protein structural analysis. Chapter 2 describes the occurrence of left-handed α-helical conformation in protein structures. A data-set of 250 high resolution (< 2.0 Å) non-homologous protein crystal structures derived from the Protein Data Bank (PDB) has been analysed for occurrences of left-handed α-helical (αL) conformations. A total of 2,573 αL residues were identified from the data-set. About 59% of the observed examples of αL conformations were found to be glycyl residues and about 41% non-glycyl. Continuous long stretches of αL residues are seldom found in protein structures. 
They are most commonly found as singlets represented by 78% of the observed αL examples. The doublets, triplets and quadruplets account for a very minor fraction of the observed examples. There is only a single example of a stretch of four contiguous αL residues, from the protein thermolysin, which forms a single turn of a left-handed α-helix. A majority of the αL residues are nevertheless part of well-defined substructures in proteins. They play singular roles as part of β-turns and helix termination sites in maintaining the characteristic main-chain hydrogen bonds needed for the stability of these structures. They are also found to be effective in the termination of β-strands. The stereo-chemistry and sequence environment around such structures are discussed. The analysis of the side-chain torsion angles of αL residues indicates that the g+ rotamer is highly unfavourable due to stereo-chemical violations posed by the atoms of the side-chain with those of the backbone. The αL residues are highly conserved by residue type as well as conformation among related proteins, indicating their vital importance in protein structures. Chapter 3 provides an explanation for the unusual preference of glycyl residues to occur in the bridge regions of the Ramachandran map. The Ramachandran steric map and energy diagrams for the glycyl residue are fully symmetric. Though a plot of the (Φ,Ψ) angles of glycyl residues derived from a data-set of 250 non-homologous and high-resolution protein structures is also largely symmetric, there is a clear aberration in the symmetry. While there is a cluster of points corresponding to the right-handed α-helical region, the "equivalent" cluster is shifted to centre around the (Φ,Ψ) values of (90°, 0°) instead of being centred at the left-handed α-helical region of (60°, 40°). 
An analysis of glycyl conformations in small peptide structures and in "coil" proteins, which are largely devoid of helical and sheet regions, shows that glycyl residues prefer to adopt conformations around (±90°, 0°) instead of right and left handed α-helical regions. Using theoretical calculations, such conformations are shown to have highest solvent accessibility in a system of two-linked peptide units with glycyl residue at the central Cα atom. This is found to be consistent with the observations from 250 non-homologous protein structures where glycyl residues with conformations close to (±90°, 0°) are seen to have high solvent accessibility. Analysis of a sub-set of non-homologous structures with very high resolution (1.5 Å or better) shows that water molecules are indeed present at distances suitable for hydrogen bond interaction with glycyl residues possessing conformations close to (±90°, 0°). It is concluded that water molecules play a key role in determining and stabilising these conformations of glycyl residues and explain the aberration in the symmetry of glycyl conformations in proteins. Chapter 4 discusses an analysis of backbone mimicry performed by polar side-chains in protein structures. Backbone mimicry by the formation of closed loop C7, C10, C13 (mimics of γ-, β- and α-turns) conformations through side-chain main-chain hydrogen bonds by polar groups is found to be a frequent observation in protein structures. A data-set of 250 non-homologous and high-resolution protein structures was used to analyse these conformations for their characteristic features. Seven out of the nine polar residues (Ser, Thr, Asn, Asp, Gln, Glu and His) have hydrogen bonding groups in their side-chains which can participate in such mimicry and as many as 15% of all these polar residues engage in such conformations. The distributions of dihedral angles of these mimics indicate that only certain combinations of the involved dihedral angles aid the formation of these mimics. 
The observed examples have been categorised into various classes based on these combinations resulting in well-defined motifs. Asn and Asp residues show a very high capability to perform such backbone secondary structural mimicry. The most highly mimicked backbone structure is of the C10 conformation by the Asx residues. The mimics formed by His, Ser, Thr and Glx residues are also discussed. The role of such conformations in initiating the formation of regular secondary structures during the course of protein folding seems significant. Chapter 5 presents a description of deterministic features of side-chain main-chain hydrogen bonds as observed in protein structures. A total of 19,835 polar residues from the data set of 250 non-homologous and highly resolved protein crystal structures were used to identify side-chain main-chain (SC-MC) hydrogen bonds. The ratio of the total number of polar residues to the number of SC-MC hydrogen bonds is close to 2:1, indicating the ubiquitous nature of such hydrogen bonds. Close to 56% of the SC-MC hydrogen bonds are local, involving a side-chain acceptor/donor (‘i’) and a main-chain donor/acceptor within the window i-5 to i+5. These short-range hydrogen bonds form well defined conformational motifs characterised by specific combinations of backbone and side-chain torsion angles. Some of the salient features of such hydrogen bonds are as follows: (a) The Ser/Thr residues show the greatest preference in forming intra-helical hydrogen bonds between the atoms Oγi and Oi-4. Such hydrogen bonds form motifs of the form αRαRαRαR(g") and are most commonly observed at the middle of α-helices. (b) These residues also show great preference to form hydrogen bonds between Oγi and Oi-3, which are closely related to the previous type and, though intra-helical, these hydrogen bonds are more often found at the C-termini of helices than at the middle. 
The motif represented by αRαRαRαR(g+) is most preferred in these cases. (c) The Ser, Thr and Glu (between the side-chain and main-chain of the same residue), (d) The side-chain acceptor atoms of Asn/Asp and Ser/Thr residues show high preference to form hydrogen bonds with acceptors two residues ahead in the chain, which are characterised by the motifs β(tt’)αR and β(t)αR, respectively. These hydrogen bonded segments, referred to as Asx turns, are known to provide stability to type I and type I’ β-turns. (e) Ser/Thr residues often form a combination of SC-MC hydrogen bonds, with the side-chain donor hydrogen bonded to the carbonyl oxygen of its own peptide backbone and the side-chain acceptor hydrogen bonded to an amide hydrogen three residues ahead in the sequence. Such motifs are quite often seen at the beginning of α-helices, which are characterised by the β(g+)αRαR motif. A remarkable majority of all these hydrogen bonds are buried from the protein surface, away from the surrounding solvent. This strongly indicates the possibility of side-chains playing the role of the backbone, in the protein interiors, to satisfy the potential hydrogen bonding sites and maintain the network of hydrogen bonds which is crucial to the structure of the protein. Chapter 6 provides a detailed characterisation of isolated β-strands. The formation of β-strands in proteins is often associated with the formation of β-sheets. However, β-strands that are not part of β-sheets commonly occur in proteins. This raises questions about the structural role and stability of such isolated β-strands. Using a data set consisting of 250 proteins, 518 isolated β-strands have been identified from 187 proteins. The two important features that distinguish isolated β-strands from β-strands occurring in β-sheets are (i) the high preponderance of prolyl residues to occur in isolated β-strands and (ii) their high solvent exposure. 
It is shown that the high propensity for proline residues to occur in isolated β-strands is not due to the occurrence of polyproline type segments in the data-set. The propensities of other amino acids to occur in isolated β-strands follow the same trend as those for β-sheet forming β-strands. Isolated β-strands are often characterised by their main-chain amide and carbonyl groups being involved in hydrogen bonding with polar side-chains or water. They are often flanked by irregular loop structures, indicating that they are part of long loops. Analysis of the conservation of such strands among families of homologous protein structures indicates that a sizeable fraction of them are highly conserved. It is suggested that though the formation of isolated β-strands is driven by the intrinsic preferences of amino acid residues, they have many characteristics like loop segments but with repetitive (Φ,Ψ) values falling within the β-region of the Ramachandran map. In addition to the material described in the six chapters above, the thesis also contains the details of work carried out on an aspect slightly different from the main theme of the thesis. This pertains to the comparative analysis of the members of a family of cytokine receptors to derive information to model new members of the family. The three dimensional modelling of the leptin receptor has been used as a case study and the details are included as an appendix. The appendix describes the 3-dimensional model of the satiety factor receptor (the leptin receptor) modelled using principles of homology modelling. Recessive mutations in the mouse obese (ob) and diabetes (db) genes result in obesity and diabetes in a syndrome resembling human obesity. Data from parabiosis (cross circulation) experiments suggested that the ob gene coded for, and was responsible for the generation of, a circulating factor called leptin which regulated energy balance, and that the db gene encoded the receptor for this factor. 
While the structure of leptin has been determined, that of its cognate receptor is as yet unknown. The leptin receptor shows low but clear sequence similarity to the members of the interleukin type 6 family of receptors. The structures of the members of this family are characterised by two β-sandwich like domains connected by a short 4-residue helical linker. The 3-dimensional models for the N- and C-terminal domains of the leptin receptor were generated using the corresponding structures of the signal transducing component of gp130, the erythropoietin receptor and the prolactin receptor. Further, using the evidence that leptin binds to its receptor with a stoichiometry of 1:1, the relative orientation of the two domains was modelled based on the structure of the human growth hormone receptor, which also binds its ligand with similar stoichiometry. The complex of leptin with its receptor was also modelled based on the structure of the human growth hormone/receptor complex. The final energy minimised model of the complex elucidates the mode of interaction between leptin and its receptor. Biochemistry Protein Structures Proteins Polypeptides Backbone Mimicry Leptin Receptor Leptin Amino Acid Sequence
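The abstract above classifies residues by their backbone torsion angles (Φ,Ψ). As a minimal sketch, not taken from the thesis, a torsion angle can be computed from four atom positions as shown below. For Φ of residue i the four atoms would be C(i-1), N(i), Cα(i), C(i), and for Ψ they would be N(i), Cα(i), C(i), N(i+1). The function name `dihedral_deg` and the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the two outer bonds onto the plane perpendicular to b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

A (Φ,Ψ) pair computed this way can then be compared with the regions named in the abstract, such as the cluster near (90°, 0°) or the left-handed α-helical region near (60°, 40°).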
7	Preprocessing and Reduction for Semidefinite Programming via Facial Reduction: Theory and Practice Cheung, Yuen-Lam 05 November 2013 (has links) Semidefinite programming is a powerful modeling tool for a wide range of optimization and feasibility problems. Its prevalent use in practice relies on the fact that a (nearly) optimal solution of a semidefinite program can be obtained efficiently in both theory and practice, provided that the semidefinite program and its dual satisfy the Slater condition. This thesis focuses on the situation where the Slater condition (i.e., the existence of positive definite feasible solutions) does not hold for a given semidefinite program; the failure of the Slater condition often occurs in structured semidefinite programs derived from various applications. In this thesis, we study the use of the facial reduction technique, originally proposed as a theoretical procedure by Borwein and Wolkowicz, as a preprocessing technique for semidefinite programs. Facial reduction can be used either in an algorithmic or a theoretical sense, depending on whether the structure of the semidefinite program is known a priori. The main contribution of this thesis is threefold. First, we study the numerical issues in the implementation of the facial reduction as an algorithm on semidefinite programs, and argue that each step of the facial reduction algorithm is backward stable. Second, we illustrate the theoretical importance of the facial reduction procedure in the topic of sensitivity analysis for semidefinite programs. Finally, we illustrate the use of facial reduction technique on several classes of structured semidefinite programs, in particular the side chain positioning problem in protein folding. semidefinite programming preprocessing backward stability numerical optimization sensitivity analysis perturbation theory side chain positioning protein structures
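For orientation, a standard primal form of a semidefinite program and the Slater condition named in this abstract can be written as follows. The symbols $C$, $\mathcal{A}$, $b$ and $\hat{X}$ are generic notation assumed here and are not taken from the thesis.

```latex
\begin{aligned}
\min_{X \in \mathbb{S}^n} \quad & \langle C, X \rangle \\
\text{s.t.} \quad & \mathcal{A}(X) = b, \\
& X \succeq 0,
\end{aligned}
\qquad
\text{Slater condition: } \exists\, \hat{X} \succ 0 \ \text{with} \ \mathcal{A}(\hat{X}) = b .
```

When no such positive definite $\hat{X}$ exists, every feasible $X$ lies in a proper face of the positive semidefinite cone, and facial reduction restricts the problem to that face.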
8	Development of genetic algorithm for optimisation of predicted membrane protein structures Minaji-Moghaddam, Noushin January 2007 (has links) Due to the inherent problems with their structural elucidation in the laboratory, the computational prediction of membrane protein structure is an essential step toward understanding the function of these leading targets for drug discovery. In this work, the development of a genetic algorithm technique is described that is able to generate predictive 3D structures of membrane proteins in an ab initio fashion that possess high stability and similarity to the native structure. This is accomplished through optimisation of the distances between TM regions and the end-on rotation of each TM helix. The starting point for the genetic algorithm is from the model of general TM region arrangement predicted using the TMRelate program. From these approximate starting coordinates, the TMBuilder program is used to generate the helical backbone 3D coordinates. The amino acid side chains are constructed using the MaxSprout algorithm. The genetic algorithm is designed to represent a TM protein structure by encoding each alpha carbon atom starting position, the starting atom of the initial residue of each helix, and operates by manipulating these starting positions. To evaluate each predicted structure, the SwissPDBViewer software (incorporating the GROMOS force field software) is employed to calculate the free potential energy. For the first time, a GA has been successfully applied to the problem of predicting membrane protein structure. Comparison between newly predicted structures (tests) and the native structure (control) indicate that the developed GA approach represents an efficient and fast method for refinement of predicted TM protein structures. 
Further enhancement of the performance of the GA allows the TMGA system to generate predictive structures with comparable energetic stability and reasonable structural similarity to the native structure. 572
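The genetic algorithm loop described in this abstract can be illustrated with a toy sketch. It is not the TMGA system: it optimises only one rotation angle per helix, and `toy_energy` is a hypothetical stand-in for the force-field energy, built around made-up target rotations in `TARGET`. The operator choices (tournament selection, one-point crossover, Gaussian mutation, elitism) are common defaults assumed for illustration.

```python
import math
import random

# Hypothetical "native" rotation of each helix, in degrees (illustration only).
TARGET = [30.0, 120.0, 210.0, 300.0]


def toy_energy(angles):
    """Made-up energy: zero when every helix rotation matches TARGET."""
    return sum(1.0 - math.cos(math.radians(a - t))
               for a, t in zip(angles, TARGET))


def evolve(pop_size=30, generations=40, mutation_sd=15.0, seed=0):
    rng = random.Random(seed)
    n = len(TARGET)
    pop = [[rng.uniform(0.0, 360.0) for _ in range(n)]
           for _ in range(pop_size)]
    history = []  # best energy seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=toy_energy)
        history.append(toy_energy(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best individual over
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            mum = min(rng.sample(pop, 3), key=toy_energy)
            dad = min(rng.sample(pop, 3), key=toy_energy)
            # One-point crossover.
            cut = rng.randrange(1, n)
            child = mum[:cut] + dad[cut:]
            # Gaussian mutation of one helix rotation.
            i = rng.randrange(n)
            child[i] = (child[i] + rng.gauss(0.0, mutation_sd)) % 360.0
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=toy_energy)
    history.append(toy_energy(best))
    return best, history
```

Calling `evolve()` returns the best rotation set found and the best energy per generation. Because of elitism that energy never increases, which is the property a refinement method of this kind relies on.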
9	Resolução de estruturas de proteínas utilizando-se dados de RMN a partir de um algorítmo genético de múltiplos mínimos / Resolution of Protein Structures Using NMR Data by Means of a Genetic Algorithm of Multiple Minima Linden, Marx Gomes Van Der 15 April 2009 (has links) Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior / Proteins are biological macromolecules composed of amino acid polymers that play a wide range of biological roles and are involved in every process of living organisms. Together with X-ray diffraction, Nuclear Magnetic Resonance (NMR) spectroscopy is one of the two main experimental techniques capable of delivering atomic-level resolution of protein structures. Prediction of protein structures using experimental information from NMR experiments is a global optimization problem with restraints. GAPF is a computer program that uses a Genetic Algorithm (GA) developed for ab initio prediction -- the determination of the structure of a protein from its amino acid sequence only -- using a multiple minima approach and a fitness function derived from a classic molecular force field. The work presented here describes GAPF-NMR, an alternate version of GAPF that uses experimental restraints from NMR to support the search for the best protein structures that correspond to a given sequence. Five different versions of the algorithm have been developed, each with a variation on how the energy function is calculated during the course of the program run. The program developed was tested on a test set of 7 proteins with known structure and it was capable of achieving a correct or approximate fold for every one of these proteins. It was noted that the versions of the algorithm that progressively increase the area of the protein used in the energy function performed better than the other versions, and that the multiple minima approach was important to the achievement of good results. Results were compared to those obtained by GENFOLD -- to date, the only known alternate implementation of a GA for the problem of protein structure prediction using NMR data -- and the current version of GAPF-NMR was shown to be superior to GENFOLD for two of the three proteins that compose its test set. Protein Structures Genetic Algorithm Nuclear Magnetic Resonance Spectroscopy
10	Utilização de máquinas de suporte vetorial para predição de estruturas terciárias de proteínas / Support vector machine for tertiary structure prediction Bisognin, Gustavo 08 March 2007 (has links) The three-dimensional structure of a protein is directly related to its function. Many genetic sequencing projects have accumulated a large number of protein sequences whose primary and secondary structures are known. However, information on their three-dimensional structures is available for only a small fraction of these proteins. This fact highlights the need for automatic methods that predict the tertiary structures of proteins from their primary structures. Consequently, computational tools are used for the treatment, selection and analysis of these data. Currently, a new machine learning method called the Support Vector Machine (SVM) has surpassed traditional methods such as Artificial Neural Networks (ANN) on classification problems. In this master's thesis we use SVMs for the automatic classification of proteins. The main contribution of this work is the methodology proposed for the treatment of the problem. This methodology consists in composing the support vectors with the v Exact and Earth Sciences molecular biology computing support vector machine bioinformatics machine learning tertiary protein structures
