  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
211

Algoritmos evolutivos e modelos simplificados de proteínas para predição de estruturas terciárias / Evolutionary algorithms and simplified models for tertiary protein structure prediction

Gabriel, Paulo Henrique Ribeiro 23 March 2010
Protein Structure Prediction (PSP) is a computationally complex problem. To make it tractable, simplified models of protein structure, such as the HP Model, have been used to represent conformations, and Evolutionary Algorithms (EAs) have been used to search for suitable solutions. EAs with the HP Model have shown interesting results; however, they often do not adequately evaluate candidate solutions because they use only the usual metric of hydrophobic contacts, which hampers the performance of the search. In this work, we present a multi-objective formulation of PSP in the HP Model that evaluates conformations more robustly by combining the number of hydrophobic contacts with the distance among the hydrophobic amino acids. We employ a Multi-objective Evolutionary Algorithm based on Sub-population Tables (MEAT; AEMT in the Portuguese abstract) to optimize these two metrics. MEAT can adequately explore the search space with a relatively small number of individuals. As a consequence, the total number of objective-function evaluations is significantly reduced, yielding a method for PSP with the HP Model that is faster and more robust.
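The two metrics named in this abstract can be illustrated with a minimal sketch. It assumes a 2D square lattice, a move-string encoding (`U`, `D`, `L`, `R`) and a mean pairwise Euclidean distance as the second objective; these choices, and all function names, are illustrative assumptions and are not taken from the thesis.

```python
from itertools import combinations

# Unit moves on a 2D square lattice (assumed encoding, not from the thesis).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Convert a move string into lattice coordinates.

    Returns None if the walk visits a site twice (invalid conformation).
    """
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:
            return None
        coords.append(nxt)
    return coords


def hh_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbours but not chain neighbours."""
    count = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i > 1 and sequence[i] == "H" and sequence[j] == "H":
            (x1, y1), (x2, y2) = coords[i], coords[j]
            if abs(x1 - x2) + abs(y1 - y2) == 1:
                count += 1
    return count


def mean_hh_distance(sequence, coords):
    """Mean Euclidean distance over all pairs of H monomers (assumed objective)."""
    h_sites = [c for s, c in zip(sequence, coords) if s == "H"]
    pairs = list(combinations(h_sites, 2))
    if not pairs:
        return 0.0
    total = sum(((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for a, b in pairs)
    return total / len(pairs)


def objectives(sequence, moves):
    """Return (contacts to maximise, distance to minimise), or None if invalid."""
    coords = fold(moves)
    if coords is None:
        return None
    return hh_contacts(sequence, coords), mean_hh_distance(sequence, coords)
```

For a chain of n monomers the move string has n - 1 characters, for example `objectives("HPPH", "RUL")`. A multi-objective EA would compare conformations on both values rather than collapsing them into a single score.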
212

Accelerated many-body protein side-chain repacking using gpus: application to proteins implicated in hearing loss

Tollefson, Mallory RaNae 15 December 2017
With recent advances and cost reductions in next generation sequencing (NGS), the amount of genetic sequence data is increasing rapidly. However, before patient specific genetic information reaches its full potential to advance clinical diagnostics, the immense degree of genetic heterogeneity that contributes to human disease must be more fully understood. For example, although large numbers of genetic variations are discovered during clinical use of NGS, annotating and understanding the impact of such coding variations on protein phenotype remains a bottleneck (i.e. what is the molecular mechanism behind deafness phenotypes). Fortunately, computational methods are emerging that can be used to efficiently study protein coding variants, and thereby overcome the bottleneck brought on by rapid adoption of clinical sequencing. To study proteins via physics-based computational algorithms, high-quality 3D structural models are essential. These protein models can be obtained using a variety of numerical optimization methods that operate on physics-based potential energy functions. Accurate protein structures serve as input to downstream variation analysis algorithms. In this work, we applied a novel amino acid side-chain optimization algorithm, which operated on an advanced model of atomic interactions (i.e. the AMOEBA polarizable force field), to a set of 164 protein structural models implicated in deafness. The resulting models were evaluated with the MolProbity structure validation tool. MolProbity “scores” were originally calibrated to predict the quality of X-ray diffraction data used to generate a given protein model (i.e. a 1.0 Å or lower MolProbity score indicates a protein model from high quality data, while a score of 4.0 Å or higher reflects relatively poor data). In this work, the side-chain optimization algorithm improved mean MolProbity score from 2.65 Å (42nd percentile) to nearly atomic resolution at 1.41 Å (95th percentile). 
However, side-chain optimization with the AMOEBA many-body potential function is computationally expensive. Thus, a second contribution of this work is a parallelization scheme that uses NVIDIA graphics processing units (GPUs) to accelerate the side-chain repacking algorithm. With one GPU, our side-chain optimization algorithm achieved a 25-fold speed-up over two Intel Xeon E5-2680v4 central processing units (CPUs). We expect the GPU acceleration scheme to lessen demand on computing resources dedicated to protein structure optimization efforts and thereby dramatically expand the number of protein structures available to aid in interpretation of missense variations associated with deafness.
213

Structural Studies of Binding Proteins: Investigations of Flexibility, Specificity and Stability

Magnusson, Ulrika January 2003
Binding proteins are present both in gram-negative and gram-positive bacteria. They are the recognition components of the ABC transport systems that transport different nutrients into the cell, and are in some cases also involved in chemotaxis. In gram-negative bacteria, they are present in the periplasm between the inner and the porous outer membrane. Here, these highly specific proteins can bind to a certain ligand such as ions, sugars and amino acids. The protein-ligand complex can then interact with permeases bound to the inner membrane that transport the nutrient into the cell. Gram-positive bacteria lack an outer membrane and the binding protein must therefore be anchored to the cell membrane.

In this thesis different aspects of three members of the super-family of the periplasmic binding proteins have been studied. In the case of the allose-binding protein (ALBP) from E. coli we focused on the movement of the protein when ligand is bound and released. This protein was also compared with the ribose-binding protein (RBP) which belongs to the same structural cluster and from which both open and closed structures are available. The leucine-binding protein (LBP) from E. coli was studied with regards to the structural basis of its specificity for different ligands as well as its conformational changes. The leucine-isoleucine-valine protein has 80% sequence identity with LBP but still exhibits a different preference for ligands. The structure of the maltose-binding protein (MBP) was obtained from a gram-positive thermoacidophile, A. acidocaldarius. Here, our goal was to study acid-stability of proteins. Since little is known about this and structures of the mesophilic counterpart in E. coli are available, as well as structures from two hyperthermophiles, we had an opportunity to study differences in their structural properties that could explain their differing stabilities.
214

Sample preparation of membrane proteins suitable for solid-state MAS NMR and development of assignment strategies

Hiller, Matthias January 2009
Although the basic structure of biological membranes is provided by the lipid bilayer, most of the specific functions are carried out by membrane proteins (MPs) such as channels, ion pumps and receptors. Additionally, it is known that mutations in MPs are directly or indirectly involved in many diseases. Thus, structure determination of MPs is of major interest not only in structural biology but also in pharmacology, especially for drug development. Advances in the structural biology of MPs have been strongly supported by the success of three leading techniques: X-ray crystallography, electron microscopy and solution NMR spectroscopy. However, X-ray crystallography and electron microscopy require highly diffracting 3D or 2D crystals, respectively. Today, structure determination of non-crystalline solid protein preparations has been made possible through rapid progress in solid-state MAS NMR methodology for biological systems. Castellani et al. solved and refined the first structure of a microcrystalline protein using only solid-state MAS NMR spectroscopy. This successful application opens up perspectives to access systems that are difficult to crystallise or that form large heterogeneous complexes and insoluble aggregates, for example ligands bound to an MP receptor, protein fibrils and heterogeneous protein aggregates. Solid-state MAS NMR spectroscopy is in principle well suited to studying MPs at atomic resolution. In this thesis, different types of MP preparations were tested for their suitability to be studied by solid-state MAS NMR. Proteoliposomes, poorly diffracting 2D crystals and a PEG precipitate of the outer membrane protein G (OmpG) were prepared as a model system for large MPs. Results from this work, combined with data found in the literature, show that highly diffracting crystalline material is not a prerequisite for structural analysis of MPs by solid-state MAS NMR.
Instead, it is possible to use non-diffracting 3D crystals, MP precipitates, poorly diffracting 2D crystals and proteoliposomes. For the latter two types of preparations, the MP is reconstituted into a lipid bilayer, which allows structural investigation in a quasi-native environment. In addition, to prepare an MP sample for solid-state MAS NMR it is possible to use screening methods that are well established for 3D and 2D crystallisation of MPs. Hopefully, these findings will open a fourth method for structural investigation of MPs. The prerequisite for structural studies by NMR in general, and the most time-consuming step, is always the assignment of resonances to specific nuclei within the protein. Over the last few years, an ever-increasing number of assignments from solid-state MAS NMR of uniformly carbon- and nitrogen-labelled samples has been reported, mostly for small proteins of up to around 150 amino acids in length. However, the complexity of the spectra increases with increasing molecular weight of the protein. Thus the conventional assignment strategies developed for small proteins do not yield a sufficiently high degree of assignment for the large MP OmpG (281 amino acids). Therefore, a new assignment strategy to find starting points for large MPs was devised. The assignment procedure is based on a sample with [2,3-13C, 15N]-labelled Tyr and Phe and uniformly labelled alanine and glycine. This labelling pattern reduces the spectral overlap as well as the number of assignment possibilities. In order to extend the assignment, four other specifically labelled OmpG samples were used. The assignment procedure starts with the identification of the spin systems of each labelled amino acid using 2D 13C-13C and 3D NCACX correlation experiments. In a second step, 2D and 3D NCOCX-type experiments are used for the sequential assignment of the observed resonances to specific nuclei in the OmpG amino acid sequence.
Additionally, it was shown in this work that biosynthetically site-directed labelled samples, which are normally used to observe long-range correlations, were helpful to confirm the assignment. Another approach to finding assignment starting points in large protein systems is the use of spectroscopic filtering techniques. A filtering block that selects methyl resonances was used to find further assignment starting points for OmpG. Combining all these techniques, it was possible to assign nearly 50 % of the observed signals to the OmpG sequence. Using this information, a prediction of the secondary structure elements of OmpG was possible. Most of the calculated motifs were in good agreement with the crystal structures of OmpG. The approaches presented here should be applicable to a wide variety of MPs and MP complexes and should thus open a new avenue for the structural biology of MPs. / Biological membranes consist mainly of lipids, but their function is determined primarily by the embedded membrane proteins (e.g. channels, ion pumps and receptors). Mutations in this class of proteins can lead to various diseases, which is why investigating the three-dimensional structure of membrane proteins is of interest not only for structural biology but also for pharmacology. In recent years, a method, solid-state NMR spectroscopy, has been developed for structural studies of protein samples in the solid state. This work deals with three different types of membrane protein preparation that allow high-resolution solid-state NMR spectra to be recorded. The outer membrane protein G (OmpG) of Escherichia coli was chosen as a model system. An important prerequisite for calculating the protein structure from the NMR spectra is the assignment of the individual signals to the corresponding amino acid in the protein sequence.
In this work, a method was developed that allows starting points to be found for the sequential assignment in large membrane proteins such as OmpG (281 amino acids). Multidimensional NMR experiments with various specifically labelled samples were carried out and enabled 50 % of the NMR signals to be assigned to the OmpG protein sequence. To verify the data obtained, they were used to predict secondary structure elements. It was shown that the calculated structural motifs are in good agreement with the OmpG structures published so far. The methods applied in this work should be applicable to many other membrane proteins and thus open a new route to the structural-biological investigation of membrane proteins.
216

Determinants Of Globular Protein Stability And Temperature Sensitivity Inferred From Saturation Mutagenesis Of CcdB

Bajaj, Kanika 12 1900
The unique native structure is a basic requirement for normal functioning of most proteins. Many diseases stem from mutations in proteins that destabilize the protein structure, thereby resulting in impairment or loss of function (Sunyaev et al. 2000). Therefore, it is important from both fundamental and applied points of view to elucidate the sequence determinants of protein structure and function. With the advent of recombinant DNA techniques for modifying protein sequences, studies on the effect of amino acid replacements on protein structure and function have acquired momentum. It is well established from previous mutagenesis studies that buried residues in a protein are important determinants of protein structure or stability while surface residues are involved in protein function (Rennell et al. 1991; Terwilliger et al. 1994; Axe et al. 1998). In spite of this, there is no universally accepted definition and probe to distinguish buried residues from exposed residues. A part of this thesis aims to examine the feasibility of using scanning mutagenesis to distinguish between buried and exposed positions in the absence of three-dimensional structure and also to arrive at an experimental definition of the appropriate accessibility cut-off to distinguish between buried and exposed residues. Proline, being an unusual amino acid, is usually exploited to determine sites in a protein important for protein stability (Sauer et al. 1992). This thesis also explores the use of proline scanning mutagenesis to make inferences about protein structure and stability. Temperature sensitive mutant proteins, which result from single amino acid substitutions, are particularly useful in elucidating the determinants of protein folding and stability (Grutter et al. 1987; Sturtevant et al. 1989).
Temperature sensitive (ts) mutants are an important class of conditional mutants which are widely used to study gene function in vivo and in cell culture (Novick and Schekman 1979; Novick and Botstein 1985). They display a marked drop in the level or activity of the gene product when the gene is expressed above a certain temperature (restrictive temperature). Below this temperature (permissive temperature), the level or activity of the mutant is very similar to that of the wild type. In spite of their widespread use, little is known about the molecular mechanisms responsible for generating a ts phenotype. A part of this thesis discusses a set of sequence/structure-based strategies for the successful design and isolation of ts mutants of a globular protein, inferred from saturation mutagenesis of CcdB. The experimental system, CcdB (Controller of Cell Division or Death B protein), is a 101-residue, homodimeric protein encoded by F plasmid. The protein is an inhibitor of DNA gyrase and is a potent cytotoxin in E. coli (Bernard et al. 1993). Crystallographic structures of CcdB in the free and gyrase-bound forms (Loris et al. 1999; Dao-Thi et al. 2005) are also available. Expression of the CcdB functional protein results in cell death, thus providing a rapid and easy assay for the protein (Chakshusmathi et al. 2004). This dissertation focuses on understanding the determinants of globular protein stability and temperature sensitivity using saturation mutagenesis of E. coli CcdB. Towards this objective, we attempted to replace each of the 101 residues of CcdB with 19 other amino acids using high throughput mutagenesis tools. A total of 1430 (~75%) of all possible single site mutants of the CcdB saturation mutagenesis library could be isolated. These mutants were characterized in terms of their activity at different expression levels. The correlation between the observed mutant phenotypes with residue burial, nature of substitution and expression level was examined.
The introductory chapter (Chapter 1) describes the use of mutagenesis as a tool to understand the relationship between protein sequence, structure and function. It presents an overview of previous large scale mutagenesis studies from the literature. It also addresses the motivation behind this work and problems which we have attempted to address in these studies. Chapter 2 discusses mutagenesis based definitions and probes for residue burial in proteins as derived from alanine and charged scanning mutagenesis of CcdB. Every residue of the 101 amino acid E. coli toxin CcdB was substituted with Ala, Asp, Glu, Lys and Arg using site directed mutagenesis. The activity of each mutant in vivo was characterized as a function of CcdB transcriptional level. The mutation data suggest that an accessibility value of 5% is an appropriate cutoff for definition of buried residues. At all buried positions, introduction of Asp results in an inactive phenotype at all CcdB transcriptional levels. The average amount of destabilization upon substitution at buried positions decreases in the order Asp>Glu>Lys>Arg>Ala. Asp substitutions at buried sites in two other proteins, MBP and Thioredoxin, were also shown to be severely destabilizing. Ala and Asp scanning mutagenesis, in combination with dose dependent expression phenotypes, was shown to yield important information on protein structure and activity. These results also suggest that such scanning mutagenesis data can be used to rank order sequence alignments and their corresponding homology models, as well as to distinguish between correct and incorrect structural alignments. When incorporated into a polypeptide chain, Proline (Pro) differs from all other naturally occurring amino acids in two important respects. The φ dihedral angle of Pro is constrained to values close to –65° and Pro lacks an amide hydrogen. Chapter 3 describes a procedure to accurately predict the effects of proline introduction on protein stability.
77 of the 97 non-Pro amino acid residues in the model protein, CcdB, were individually mutated to proline and the in vivo activity of each mutant was characterized. A decision tree to classify the mutation as perturbing or non-perturbing was created by correlating stereochemical properties of mutants to activity data. The stereochemical properties, including main chain dihedral angle and main chain amide hydrogen bonds, were determined from 3D models of the mutant proteins built using MODELLER. The performance of the decision tree was assessed on 74 nsSNPs and 37 other proline substitutions from the literature. The overall accuracy of this algorithm was found to be 89% in case of CcdB, 71% in case of nsSNPs and 83% in case of other proline substitution data. Contrary to previous assertions, Proline scanning mutagenesis cannot be reliably used to make secondary structural assignments in proteins. The studies will be useful in annotating uncharacterized nsSNPs of disease-associated proteins and for protein engineering and design. Mutants of CcdB were also characterized in terms of their activity at two different temperatures (30 °C and 37 °C) to screen for temperature sensitive (ts) mutants. The isolation and structural analysis of ts mutants of CcdB is dealt with in Chapter 4. Of the total 1430 single site mutants, 12% showed a ts phenotype and were mapped onto the crystal structure of the protein. Almost all the ts mutants could be interpreted in terms of the wild type, native structure. ts mutants were found at all buried sites and all active sites (except one). ts mutants were also obtained at sites in close proximity to active site residues where polar side-chains were involved in H-bonding interaction with active site residues. Several proline substitutions also displayed a ts phenotype. The effect of expression level on ts phenotype was also studied.
78% of the mutants that showed an inactive phenotype at the lowest expression level and an active phenotype at highest expression level, resulted in a ts phenotype at an intermediate expression level. The molecular determinant responsible for the ts phenotype of buried site ts mutant is suggested to be the thermodynamic destabilization of the protein which results in a reduced steady state in vivo level of soluble, functional protein relative to wild type. The active site ts mutants probably lower the specific activity of the protein and hence the total activity relative to wild type. However these effects might be less severe at lower temperature. Specific structure/function based mutagenesis strategies are suggested to design ts mutant of a protein. These studies will simplify the design of ts mutants for any globular protein and will have applications in diverse biological systems to study gene function in vivo. Chapter 5 represents the structural and sequence correlations of a CcdB saturation mutagenesis library which was obtained by replacing each of 101 amino acid residues with 19 other amino acids. Polar substitutions i.e. Asn, Gln, Ser, Thr and His were poorly tolerated at buried sites at lower expression levels. Aromatic substitutions and Gly were also not well tolerated at buried positions at lower expression levels. Trp was poorly tolerated at residues with accessibility <15%. However, most of the surface exposed residues with accessibility >40% (except functional ones) could tolerate all kinds of substitutions. Chapter 6 deals with the thermodynamic characterization of monomeric and dimeric forms of CcdB. The stability and aggregation state of CcdB have been characterized as a function of pH and temperature. Size exclusion chromatography revealed that the protein is a dimer at pH 7.0, but a monomer at pH 4.0. CD analysis and fluorescence spectroscopy showed that the monomer is well folded, and has similar tertiary structure to the dimer. 
Hence intersubunit interactions are not required for folding of individual subunits. The oligomeric status of CcdB at pH 7.0 at physiologically relevant low concentrations of protein was characterized by labeling the protein with two different pairs of donor and acceptor fluorescent dyes (Acrylodan-Pyrene and IAF-IAEDANS) separately and carrying out fluorescence resonance energy transfer (FRET) measurements by mixing them together. CcdB exists in a dimeric state even at nanomolar concentrations, thus indicating that the dimeric form is likely to be the physiologically active form of CcdB. The stability of the dimeric form at pH 7.0 and the monomeric form at pH 4.0 was characterized by isothermal denaturant unfolding and calorimetry. The free energies of unfolding were found to be 9.2 kcal/mol (1 cal = 4.184 J) and 21 kcal/mol at 298 K for the monomer and dimer respectively. The denaturant concentration at which one-half of the protein molecules are unfolded (Cm) for the dimer is dependent on protein concentration, whereas the Cm of the monomer is independent of protein concentration, as expected. Although thermal unfolding of the protein in aqueous solution is irreversible at neutral pH, it was found that thermal unfolding is reversible in the presence of GdnCl (guanidinium chloride). Differential scanning calorimetry in the presence of low concentrations of GdnCl, in combination with isothermal denaturation melts as a function of temperature, was used to derive the stability curve for the protein. The value of ΔCp (representing the change in excess heat capacity upon protein denaturation) is 2.8 ± 0.2 kcal mol⁻¹ K⁻¹ for unfolding of dimeric CcdB, and only has a weak dependence on denaturant concentration. These studies advance the understanding of the folding of oligomeric proteins. The concluding section summarizes all the chapters in a nutshell and addresses the future directions provided by these investigations.
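The figures quoted in this abstract can be cross-checked with a short calculation. The sketch below recomputes the size of the single-site library (101 positions, 19 alternative residues each), the fraction covered by the 1430 isolated mutants, and the unfolding free energies in kJ/mol using the conversion factor given in the text. The function names are illustrative only.

```python
RESIDUES = 101        # length of CcdB, as stated in the abstract
ALTERNATIVES = 19     # other amino acids tried at each position
ISOLATED = 1430       # single-site mutants isolated, as stated in the abstract
KJ_PER_KCAL = 4.184   # from "1 cal = 4.184 J" in the abstract


def library_size(residues, alternatives):
    """Number of possible single-site substitution mutants."""
    return residues * alternatives


def coverage(isolated, total):
    """Fraction of the possible library that was actually isolated."""
    return isolated / total


def kcal_to_kj(kcal):
    """Convert an energy in kcal/mol to kJ/mol."""
    return kcal * KJ_PER_KCAL


total = library_size(RESIDUES, ALTERNATIVES)
print(total)                                        # 1919
print(round(100 * coverage(ISOLATED, total), 1))    # 74.5
print(round(kcal_to_kj(9.2), 1))                    # monomer, kJ/mol
print(round(kcal_to_kj(21), 1))                     # dimer, kJ/mol
```

The coverage of about 74.5% agrees with the "~75%" stated in the abstract.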
217

Protein structure prediction : Zinc-binding sites, one-dimensional structure and remote homology

Shu, Nanjiang January 2010
Predicting the three-dimensional (3D) structure of proteins is a central problem in biology. Computationally predicted 3D protein structures have been successfully applied in many fields of biomedicine, e.g. family assignments and drug discovery. The accurate detection of remotely homologous templates is critical for the successful prediction of the 3D structure of proteins. Also, the prediction of one-dimensional (1D) protein structures such as secondary structures and shape strings is useful for predicting the 3D structure of proteins and important for understanding the sequence-structure relationship. In addition, the prediction of the functional sites of proteins, such as metal-binding sites, can not only reveal the important function of proteins (even in the absence of the 3D structure) but also facilitate the prediction of the 3D structure. Here, three novel methods in the field of protein structure prediction are presented: PREDZINC, a method for predicting zinc-binding sites in proteins; Frag1D, a method for predicting the 1D structure of proteins; and FragMatch, a method for detecting remotely homologous proteins. These methods compete satisfactorily with the best methods previously published and contribute to the task of protein structure prediction. / At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript. / Protein structure prediction
218

Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction

Zhong, Wei 02 August 2006
Protein tertiary structure plays a very important role in determining its possible functional sites and chemical interactions with other related proteins. Experimental methods to determine protein structure are time consuming and expensive. As a result, the gap between the number of known protein sequences and known structures has widened substantially due to high throughput sequencing techniques. These limitations of experimental methods motivate the development of computational algorithms for protein structure prediction. In this work, a clustering system is used to predict local protein structure. At first, recurring sequence clusters are explored with an improved K-means clustering algorithm. Carefully constructed sequence clusters are used to predict local protein structure. After obtaining the sequence clusters and motifs, we study how sequence variation within sequence clusters may influence their structural similarity. Analysis of the relationship between sequence variation and structural similarity shows that sequence clusters with tight sequence variation have high structural similarity and sequence clusters with wide sequence variation have poor structural similarity. Based on this knowledge, the established clustering system is used to predict the tertiary structure of local sequence segments. Test results indicate that the highest quality clusters can give highly reliable prediction results and high quality clusters can give reliable prediction results. In order to improve the performance of the clustering system for local protein structure prediction, a novel computational model called Clustering Support Vector Machines (CSVMs) is proposed. In our previous work, the sequence-to-structure relationship was explored with the conventional K-means algorithm. The K-means clustering algorithm may not capture the nonlinear sequence-to-structure relationship effectively.
As a result, we consider using Support Vector Machine (SVM) to capture the nonlinear sequence-to-structure relationship. However, SVM is not favorable for huge datasets including millions of samples. Therefore, we propose a novel computational model called CSVMs. Taking advantage of both the theory of granular computing and advanced statistical learning methodology, CSVMs are built specifically for each information granule partitioned intelligently by the clustering algorithm. Compared with the clustering system introduced previously, our experimental results show that accuracy for local structure prediction has been improved noticeably when CSVMs are applied.
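The cluster-then-classify idea behind CSVMs can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the toy 2-D data, the farthest-first seeding, and the linear hinge-loss classifier trained by batch subgradient descent (a simple stand-in for a kernel SVM) are all assumptions made for illustration. The point is only that one small model is trained per granule and that a query is routed to the model of its nearest cluster center.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centers):
    # Index of the cluster center closest to point p.
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def kmeans(points, k, iters=20):
    # Assumption: deterministic farthest-first seeding, then plain Lloyd steps.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def local(p, center):
    # Coordinates relative to the granule's center, plus a constant bias feature.
    return tuple(pi - ci for pi, ci in zip(p, center)) + (1.0,)

def train_hinge(xs, ys, dim, lam=0.01, lr=0.1, steps=200):
    # Batch subgradient descent on the L2-regularized hinge loss (linear model).
    w = [0.0] * dim
    n = len(xs)
    for _ in range(steps):
        grad = [lam * wj for wj in w]
        for x, y in zip(xs, ys):
            if y * sum(wj * xj for wj, xj in zip(w, x)) < 1:  # margin violated
                grad = [g - y * xj / n for g, xj in zip(grad, x)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def fit_csvm(points, labels, k):
    # Partition the data into k granules, then train one classifier per granule.
    centers = kmeans(points, k)
    dim = len(points[0]) + 1
    weights = []
    for j, c in enumerate(centers):
        members = [(local(p, c), y) for p, y in zip(points, labels)
                   if nearest(p, centers) == j]
        weights.append(train_hinge([x for x, _ in members],
                                   [y for _, y in members], dim))
    return centers, weights

def predict(model, p):
    # Route the query to its nearest granule and apply that granule's model.
    centers, weights = model
    j = nearest(p, centers)
    score = sum(w * x for w, x in zip(weights[j], local(p, centers[j])))
    return 1 if score >= 0 else -1

# Hypothetical toy data: the label rule flips between the two granules,
# so no single linear classifier fits both, but one per granule does.
points = [(x, y) for x in (0, 1, 10, 11) for y in (-2, -1, 1, 2)]
labels = [(1 if y > 0 else -1) * (1 if x < 5 else -1) for x, y in points]
model = fit_csvm(points, labels, k=2)
print([predict(model, p) for p in [(0.2, 1.5), (10.3, 1.5)]])
```

Training per granule keeps each classifier's training set small, which is the scalability argument the abstract makes; a kernel SVM could replace `train_hinge` without changing the routing logic.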
219

The Development of Novel Protein Topology Mapping Strategies using Crosslinking, Cyanogen Bromide Cleavage, and Mass Spectrometry

Weerasekera, Rasanjala Kumari 11 January 2012
Advances in protein topology mapping methods are urgently needed to complement the wealth of interactome data that is presently being generated at a rapid pace. Chemical crosslinking followed by mass spectrometry (MS) has evolved over the last decade as an attractive method for protein topology and interface mapping, and holds great promise as a counterpart to modern interactome studies in the field of proteomics. Furthermore, stabilization of proteins and protein complexes with crosslinking offers many advantages over high-resolution structural mapping methods, including the ability to study protein topologies in vivo. The reliance on direct detection of crosslinked peptides, however, continues to pose challenges to protein topology and interface mapping with chemical crosslinking plus MS. The present body of work aimed to develop a novel generic methodology that utilizes chemical crosslinking, cyanogen bromide (CNBr) cleavage and MS for the low-resolution mapping of protein topologies and interfaces. Through such low-resolution mapping of crosslinked regions, this novel strategy overcomes limitations associated with the direct detection of crosslinked peptides. Following optimization of various steps, the present method was validated with the bacterial DNA-directed RNA polymerase core complex and was subsequently applied to probe the tetrameric assembly of yeast Skp1p-Cdc4p heterodimers. Further improvements were made through the enrichment of crosslinked CNBr-cleaved protein fragments prior to their identification via MS. Two enrichment strategies were developed which depended upon the conjugation of tags to CNBr-cleaved peptide C-termini followed by either tandem affinity purification or tandem reversed-phase HPLC purification. These strategies were successfully applied for the efficient purification of disulfide-linked peptides from peptide mixtures. 
It is expected that the potential to achieve sensitive mapping of topologies and interfaces of multi-subunit protein complexes in vivo, in combination with further enhancements to permit studies on complex protein samples, will extend the utility of this method to complement large-scale interactome studies.
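To make the notion of low-resolution mapping concrete, here is a minimal sketch of an in-silico CNBr digest. It relies on the well-known specificity of cyanogen bromide, which cleaves on the C-terminal side of methionine residues; everything else (the function names, the example sequence, the crosslinked positions) is hypothetical and not taken from the thesis. The sketch also ignores incomplete cleavage and other chemistry.

```python
def cnbr_fragments(seq):
    """Split a protein sequence after every methionine (M).

    Returns a list of (start, end, fragment) tuples with 1-based,
    inclusive residue positions.
    """
    frags, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "M" or i == len(seq) - 1:
            frags.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    return frags

def fragment_of(frags, pos):
    """Index of the CNBr fragment containing 1-based residue position pos."""
    for idx, (start, end, _) in enumerate(frags):
        if start <= pos <= end:
            return idx
    raise ValueError(f"position {pos} is outside the sequence")

# Hypothetical sequence and hypothetical crosslink between residues 3 and 13.
seq = "MKTAYMAKQRMLSG"
frags = cnbr_fragments(seq)
linked_pair = (fragment_of(frags, 3), fragment_of(frags, 13))
print(frags)
print(linked_pair)
```

In this simplified picture, a crosslink is localized to a pair of CNBr fragments rather than to individual residues or peptides, which is why the resolution is low but the readout does not depend on detecting the crosslinked peptide itself.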
