281 |
Construção e análise de mutantes fluorescentes da troponina I / Construction and analysis of fluorescent mutants of troponin I
Oliveira, Deodoro Camargo Silva Gonçalves de, 10 August 2001
Vertebrate striated muscle contraction is regulated by troponin (Tn). Tn is composed of three subunits: troponin I (TnI), troponin C (TnC) and troponin T (TnT). TnI has an inhibitory role that is neutralized by Ca2+ binding to the regulatory sites in the N-domain of TnC, and TnT positions the troponin complex on the thin filament. To follow the Ca2+-induced conformational change transmitted from TnC to TnI, the unique spectral properties of 5-hydroxytryptophan (5HW) were used; 5HW was incorporated into point mutants of TnI carrying a single tryptophan codon. Two new intrinsic spectral probes in TnI sensitive to Ca2+ binding to Tn were identified: TnI with a single 5HW at position 100 or 121. Trimeric troponin complexes reconstituted with these fluorescent TnI mutants, Tn-TnIF100HW and Tn-TnIM121HW, showed increases of 12% and 70%, respectively, in emission intensity when Ca2+ bound to TnC. In the binary complexes (TnC-TnI), TnIs with 5HW at positions 106 and 121 were also sensitive to Ca2+ binding to TnC. Fluorescence analysis of these probes showed that: 1) the regions of TnI that respond to Ca2+ binding to the regulatory N-domain of TnC are the inhibitory region of TnI (residues 96 to 116) and a neighboring region that includes position 121; 2) point mutations and incorporation of 5HW in TnI can affect both the affinity and the cooperativity of Ca2+ binding to TnC, confirming the role of TnI as a modulator of the Ca2+ affinity of TnC; 3) the surprisingly high Ca2+ affinity (Kd ~ 10^-8 M) derived from probes in the inhibitory region of TnI suggests that the sites in the N-terminal domain of TnC may be the highest-affinity Ca2+ binding sites in the troponin complex.
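Affinity and cooperativity of Ca2+ binding, as discussed in point 2, are often summarized with a Hill-type curve. The sketch below is only an illustration: the abstract does not say which binding model was fitted, the function name `hill_fraction` and the Hill coefficient of 2 are assumptions, and only the order of magnitude 10^-8 M is taken from the text.

```python
def hill_fraction(ca_molar, k_half_molar, n_hill):
    """Fraction of the maximal signal change predicted by the Hill equation.

    k_half_molar is the Ca2+ concentration giving a half-maximal response
    (it equals Kd when n_hill == 1); n_hill > 1 models positive cooperativity.
    """
    if ca_molar < 0 or k_half_molar <= 0 or n_hill <= 0:
        raise ValueError("concentrations and Hill coefficient must be positive")
    ratio = (ca_molar / k_half_molar) ** n_hill
    return ratio / (1.0 + ratio)


# Assumed values for illustration: half-saturation at 1e-8 M, Hill coefficient 2.
K_HALF = 1e-8
for ca in (1e-9, 1e-8, 1e-7):
    print(f"[Ca2+] = {ca:.0e} M -> fraction = {hill_fraction(ca, K_HALF, 2):.3f}")
```

A mutation that lowers the affinity would shift `k_half_molar` upward, while a change in cooperativity would alter `n_hill` and hence the steepness of the curve.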
|
282 |
Algoritmo evolutivo de muitos objetivos para predição ab initio de estrutura de proteínas / Multiobjective evolutionary algorithm with many tables to ab initio protein structure prediction
Christiane Regina Soares Brasil, 10 May 2012
This work focuses on the development of optimization algorithms for the purely ab initio Protein Structure Prediction (PSP) problem. Algorithms that better explore the space of potential solutions can, in general, find better solutions. Such algorithms can benefit both ab initio PSP and template-based PSP, which uses a priori knowledge. Researchers have shown that multiobjective evolutionary algorithms can contribute significantly to purely ab initio PSP. In this context, this research investigates the Multiobjective Evolutionary Algorithm based on Tables applied to purely ab initio PSP, which has shown interesting results for relatively simple proteins. One challenge for purely ab initio PSP, for example, is the prediction of structures with β-sheets. To work with such proteins, this research developed computationally efficient procedures to estimate hydrogen bond and solvation energies. In general, these energies are not considered by PSP approaches that combine optimization methods with a priori knowledge. Taking van der Waals and electrostatic energies, the two interaction energies that contribute most to defining a protein structure, together with the hydrogen bond and solvation energies, the PSP problem has four objectives. Combinatorial problems (such as PSP) with more than three objectives usually require specific methods capable of dealing with many criteria. To address this limitation, this work proposes a new method for many-objective optimization, called the Multiobjective Evolutionary Algorithm with Many Tables (MEAMT; AEMMT in Portuguese). This method performs a more adequate sampling of the space of objective functions and can therefore better map the promising regions of this space. The ability to deal with many objectives enables the MEAMT to make better use of the information provided by the solvation and hydrogen bond energies, and thus to predict structures with β-sheets and some relatively more complex proteins. From the computational point of view, the MEAMT is a new method that deals with many objectives (more than ten) while finding relevant solutions.
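The four-objective formulation can be illustrated with a plain Pareto-dominance filter. This is a generic sketch and not the MEAMT itself, whose table mechanics the abstract does not describe; the energy values and the ordering (van der Waals, electrostatic, hydrogen bond, solvation) are hypothetical, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical energies per conformation: (vdW, electrostatic, H-bond, solvation).
conformations = [
    (1.0, 2.0, 3.0, 4.0),
    (2.0, 3.0, 4.0, 5.0),  # worse than the first in every objective
    (0.0, 5.0, 3.0, 4.0),  # better vdW, worse electrostatic: a trade-off
]
print(non_dominated(conformations))
```

With more objectives, a growing share of candidates becomes mutually non-dominated, which is one common reason why problems with many objectives call for dedicated methods.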
|
283 |
On the analysis of REMD protein structure prediction simulations for reducing volume of analytical data
Macedo, Rafael Cauduro Oliveira, 30 August 2017
Proteins perform a vital role in all living beings, mediating a series of processes necessary for life. Although there are ways to determine the composition of these molecules, we still lack the knowledge needed to determine quickly and cheaply their 3D structure, which plays an important role in their function. One of the main computational methods applied to the study of proteins and their folding process, which determines their structure, is Molecular Dynamics. An enhancement of this method, known as Replica-Exchange Molecular Dynamics (REMD), is capable of producing much better results, at the expense of a significant increase in computational cost and in the volume of raw data generated. This dissertation presents a novel optimization for this method, titled Analytical Data Filtering, which aims to optimize post-simulation analysis by filtering out unsatisfactory predicted structures using different absolute quality metrics. The proposed methodology can work together with other optimization approaches and, to the best of the author's knowledge, covers an area still largely untouched by them. The SnapFi tool is then presented; it was designed specifically to filter unsatisfactory predicted structures while remaining compatible with the different optimization approaches for the REMD method. A study was then conducted on a test dataset of REMD protein structure prediction simulations to examine a series of hypotheses regarding the impact of the different REMD temperatures on the final quality of the predicted structures, the efficiency of the different absolute quality metrics, and a possible filtering configuration that takes advantage of such metrics. It was observed that the highest temperatures may be safely discarded from post-simulation analysis of REMD protein structure prediction simulations, that absolute quality metrics show a high variance in efficiency (in terms of quality) between different protein structure prediction simulations, and that filtering configurations built from such metrics carry over this inconvenient variance.
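The filtering idea can be sketched in a few lines. This is not SnapFi's interface, which the abstract does not describe: the `Snapshot` fields, the thresholds, and the convention that a higher score means better quality are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One predicted structure taken from a REMD replica (hypothetical record)."""
    replica_temperature_k: float
    quality_score: float  # assumed absolute quality metric, higher is better


def filter_snapshots(snapshots, max_temperature_k, min_quality):
    """Drop snapshots from replicas above a temperature cutoff or below a quality cutoff."""
    return [
        s for s in snapshots
        if s.replica_temperature_k <= max_temperature_k
        and s.quality_score >= min_quality
    ]


# Hypothetical data: three replicas at increasing temperature.
trajectory = [
    Snapshot(300.0, 0.82),
    Snapshot(300.0, 0.41),
    Snapshot(350.0, 0.77),
    Snapshot(450.0, 0.90),  # discarded by the temperature cutoff
]
kept = filter_snapshots(trajectory, max_temperature_k=400.0, min_quality=0.5)
print(len(kept), "of", len(trajectory), "snapshots kept")
```

Because the abstract reports that the metrics vary in efficiency from one simulation to another, a fixed `min_quality` like the one used here would probably need tuning per simulation.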
|
284 |
Efficient use of a protein structure annotation database
Rother, Kristian, 14 August 2007
In this work, a multitude of data on the structure and function of proteins is compiled and subsequently applied to the analysis of atomic packing. Structural analyses often require tailored protein datasets, selected by properties such as sequence features, protein fold, or resolution. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, Columba, an integrated database of protein structure annotation, was created. Columba integrates sixteen databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. Of the PDB structures contained in Columba, two thirds are annotated by many other databases. The remaining third is poorly annotated, partly because the corresponding structures have only recently been deposited and partly because they are not proteins. The Columba database can be searched through a web interface at www.columba-db.de, with queries specific to each source database. Users can thus quickly assemble a set of PDB entries that match the desired criteria. Rules for creating protein datasets efficiently were derived and applied to create datasets for analyzing the packing density of proteins. Packing analysis quantifies the space between atoms and can identify regions of high local mobility or errors in the structure. In a reference dataset, a large number of atom-sized cavities was found just below the protein surface. In transmembrane domains, these cavities occur particularly often in channel and transport proteins, which undergo conformational changes. Ligands and coenzymes bound to proteins were packed at least as tightly as the reference data. With these results, several contradictions in the literature could be resolved.
|
285 |
Structure-Function Studies of Enzymes from Ribose Metabolism
Andersson, C. Evalena, January 2004
In the pentose phosphate pathway, carbohydrates such as glucose and ribose are degraded with production of reductive power and energy. Another important function is to produce essential pentoses, such as ribose 5-phosphate, which later can be used in biosynthesis of nucleic acids and cofactors. This thesis presents structural and functional studies on three enzymes involved in ribose metabolism in Escherichia coli. Ribokinase is an enzyme that phosphorylates ribose in the presence of ATP and magnesium, as the first step of exogenous ribose metabolism. Two important aspects of ribokinase function, not previously known, have been elucidated. Ribokinase was shown to be activated by monovalent cations, specifically potassium. Structural analysis of the monovalent ion binding site indicates that the ion has a structural rather than catalytic role; a mode of activation involving a conformational change has been suggested. Product inhibition studies suggest that ATP is the first substrate to bind the enzyme. Independent Kd measurements with the ATP analogue AMP-PCP support this. The results presented here will have implications for several enzymes in the protein family to which ribokinase belongs, in particular the medically interesting enzyme adenosine kinase. Ribose 5-phosphate isomerases convert ribose 5-phosphate into ribulose 5-phosphate or vice versa. Structural studies on the two genetically distinct isomerases in E. coli have shown them to be fundamentally different in many aspects, including active site architecture. However, a kinetic study has demonstrated both enzymes to be efficient in terms of catalysis. Sequence searches of completed genomes show ribose 5-phosphate isomerase B to be the sole isomerase in many bacteria, although ribose 5-phosphate isomerase A is a nearly universal enzyme. All genomes contain at least one of the two enzymes. These results confirm that both enzymes must be independently capable of supporting ribose metabolism, a fact that had not previously been established.
|
286 |
Exploring the Molecular Dynamics of Proteins and Viruses
Larsson, Daniel, January 2012
Knowledge about the structure and dynamics of the important biological macromolecules (proteins, nucleic acids, lipids and sugars) helps us understand their function. Atomic-resolution structures of macromolecules are routinely captured with X-ray crystallography and other techniques. In this thesis, simulations are used to explore the dynamics of the molecules beyond the static structures. Viruses are machines constructed from macromolecules. Crystal structures of them reveal little to no information about their genomes. In simulations of empty capsids, we observed a correlation between the spatial distribution of chloride ions in the solution and the position of RNA in crystals of satellite tobacco necrosis virus (STNV) and satellite tobacco mosaic virus (STMV). In this manner, structural features of the non-symmetric RNA could also be inferred. The capsid of STNV binds calcium ions on the icosahedral symmetry axes. The release of these ions controls the activation of the virus particle upon infection. Our simulations reproduced the swelling of the capsid upon removal of the ions, and we quantified the water permeability of the capsid. The structure and dynamics of the expanded capsid suggest that disassembly is initiated at the 3-fold symmetry axis. Several experimental methods, such as mass spectrometry and diffractive imaging of single particles, require biomolecular samples to be injected into vacuum. It is therefore important to understand how proteins and molecular complexes respond to being aerosolized. In simulations we mimicked the dehydration process that occurs on going from solution into the gas phase. We find that two important factors for the structural stability of proteins are the temperature and the level of residual hydration. The simulations support experimental claims that membrane proteins can be protected by a lipid micelle and that a non-membrane protein could be stabilized in a reverse micelle in the gas phase. A water layer around virus particles would impede the signal in diffractive experiments, but our calculations estimate that it should be possible to determine the orientation of the particle in individual images, which is a prerequisite for three-dimensional reconstruction. / BMC B41, 25/5, 9:15
|
288 |
Structural Information and Hidden Markov Models for Biological Sequence Analysis
Tångrot, Jeanette, January 2008
Bioinformatics is a fast-developing field that uses computational methods to analyse and structure biological data. An important branch of bioinformatics is the prediction of protein structure and function, which is often based on finding relationships to already characterized proteins. It is known that two proteins with very similar sequences also share the same 3D structure. However, there are many proteins with similar structures that have no clear sequence similarity, which makes these relationships difficult to find. In this thesis, two methods for annotating protein domains are presented, one aiming at assigning the correct domain family or families to a protein sequence, and the other aiming at fold recognition. Both methods use hidden Markov models (HMMs) to find related proteins, and both exploit the fact that structure is more conserved than sequence, but in two different ways. Most of the research presented in the thesis focuses on the structure-anchored HMMs, saHMMs. For each domain family, an saHMM is constructed from a multiple structure alignment of carefully selected representative domains, the saHMM-members. These saHMM-members are collected in the so-called "midnight ASTRAL set", and are chosen so that all saHMM-members within the same family have mutual sequence identities below a threshold of about 20%. To construct the midnight ASTRAL set and the saHMMs, a pipeline of software tools was developed. The saHMMs are shown to detect the correct family relationships with very high accuracy, and to perform better than the standard tool Pfam in assigning the correct domain families to new domain sequences. We also introduce the FI-score, which is used to measure the performance of the saHMMs in order to select the optimal model for each domain family. The saHMMs are available for searching through the FISH server, and can be used for assigning family relationships to protein sequences. The other approach presented in the thesis is secondary structure HMMs (ssHMMs). These HMMs are designed to use both the sequence and the predicted secondary structure of a query protein when scoring it against the model. A rigorous benchmark shows that HMMs made from multiple sequences result in better fold recognition than those based on single sequences. Adding secondary structure information to the HMMs improves fold recognition further, both when using true and when using predicted secondary structures for the query sequence. / Bioinformatics is a field in which methods from computer science and statistics are used to analyse and structure biological data. An important area within bioinformatics tries to predict the three-dimensional structure and function of a protein from its amino acid sequence and/or its similarities to other, already characterized, proteins. It is known that two proteins with similar amino acid sequences also have similar three-dimensional structures. However, two proteins having similar structures does not necessarily mean that their sequences are similar, which can make it difficult to find structural similarities from a protein's amino acid sequence. This thesis describes two methods for finding similarities between proteins, one focused on determining which family of protein domains with known 3D structure a given sequence belongs to, while the other tries to predict a protein's fold, i.e. give a rough picture of the protein's structure. Both methods use hidden Markov models (HMMs), a statistical method that can be used, among other things, to describe protein families. With the help of an HMM one can predict whether a given protein sequence belongs to the family the model represents. Both methods also use structural information to increase the models' ability to recognize related sequences, but in different ways. Most of the work in the thesis concerns structure-anchored HMMs (saHMMs). To build the saHMMs, structure-based sequence alignments are used, which are generated from how the protein domains can be superimposed in space rather than from which amino acids make up their sequences. In each protein family only a special, representative selection of domains is used. These are chosen so that when the sequences are compared pairwise, no pair within the family has a sequence identity higher than about 20%. This selection is made to obtain as wide a spread as possible among the sequences within the family. A suite of software has been developed to select representatives for each family and then build saHMMs based on them. The saHMMs turn out to find the right family for a high proportion of the tested sequences, with almost no errors. They are also better than the widely used method Pfam at finding the right family for entirely new protein sequences. The saHMMs are available through the FISH server, which anyone can use via the Internet to find which family a protein of interest may belong to. The other method presented in the thesis is secondary structure HMMs, ssHMMs, which are built from ordinary multiple sequence alignments, but also from information on the secondary structures of the protein sequences in the family. When a protein sequence is compared with the ssHMM, a prediction of its secondary structure is used, and the computed probability that the sequence belongs to the family is based both on the amino acid sequence and on the secondary structure. A comparison shows that HMMs based on several sequences are better than those based on only one sequence at finding the right fold for a protein sequence. The HMMs become better still if the secondary structure is also taken into account, both when the true secondary structure is used and when a theoretically predicted one is used. / Jeanette Hargbo.
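The selection rule for family representatives (no pair above about 20% sequence identity) can be sketched as a greedy filter. The abstract does not say how the thesis pipeline actually makes the selection, so the greedy strategy, the identity definition over gap-free aligned columns, and the toy sequences below are all assumptions.

```python
def sequence_identity(a, b):
    """Fraction of identical residues over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def select_representatives(aligned, threshold=0.20):
    """Greedily keep domains whose identity to every domain already kept is below threshold."""
    chosen = []
    for name, seq in aligned:
        if all(sequence_identity(seq, kept) < threshold for _, kept in chosen):
            chosen.append((name, seq))
    return chosen


# Toy aligned sequences (hypothetical, not from ASTRAL).
family = [
    ("d1", "AAAAAAAAAA"),
    ("d2", "AAAAAAAAAC"),  # 90% identical to d1, rejected
    ("d3", "ACDEFGHIKL"),  # 10% identical to d1, kept
]
print([name for name, _ in select_representatives(family)])
```

A greedy pass depends on input order, so a production pipeline would likely rank candidates first, for example by structure quality, before applying the cutoff.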
|
289 |
Geometric Build-up Solutions for Protein Determination via Distance Geometry
Davis, Robert Tucker, 01 August 2009
Proteins carry out an almost innumerable number of biological processes that are absolutely necessary to life, and as a result proteins and their structures are very often the objects of study in research. This thesis therefore begins with a description of protein function and structure, followed by brief discussions of the two major experimental structure determination methods. A problem that often arises in molecular modeling is the Molecular Distance Geometry Problem (MDGP), which seeks coordinates for the atoms of a protein or molecule given only a set of pairwise distances between atoms. To introduce the complexities of the MDGP, we begin at its origins in distance geometry and progress to the specific sub-problems and some of the solutions that have been developed. This is all in preparation for a discussion of what is known as the Geometric Build-up (GBU) solution. This solution has led to the development of several algorithms and continues to be modified to account for more and different complexities. The culmination of this thesis is a new algorithm, the Revised Updated Geometric Build-up, that is faster than previous GBU algorithms while maintaining the accuracy of the resulting structure.
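The basic step behind build-up approaches can be illustrated by placing one atom from its exact distances to four already-placed, non-coplanar atoms. This is a minimal sketch of that single step under idealized assumptions (exact distances, all four available); it is not the Revised Updated Geometric Build-up algorithm, and the names `place_atom` and `det3` are hypothetical. Subtracting the first distance equation from the other three cancels the quadratic term and leaves a 3x3 linear system, solved here with Cramer's rule.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def place_atom(base, dists):
    """Coordinates of a point given its distances to four non-coplanar base points."""
    (x1, y1, z1), d1 = base[0], dists[0]
    rows, rhs = [], []
    for (x, y, z), d in zip(base[1:], dists[1:]):
        # 2 (x_i - x_1) . p = |x_i|^2 - |x_1|^2 - d_i^2 + d_1^2
        rows.append([2 * (x - x1), 2 * (y - y1), 2 * (z - z1)])
        rhs.append((x * x + y * y + z * z) - (x1 * x1 + y1 * y1 + z1 * z1)
                   - d * d + d1 * d1)
    det = det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("base atoms are coplanar or nearly so")
    solution = []
    for col in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][col] = rhs[i]  # Cramer's rule: swap one column for the right-hand side
        solution.append(det3(m) / det)
    return tuple(solution)


# Hypothetical base atoms and a target used only to generate consistent distances.
base_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = (0.3, 0.4, 0.5)
distances = [math.dist(target, b) for b in base_atoms]
print(place_atom(base_atoms, distances))
```

Repeating this step atom by atom propagates rounding errors, and real data rarely supply every needed distance; complications of that kind are what the abstract refers to as additional complexities.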
|
290 |
Training of Template-Specific Weighted Energy Function for Sequence-to-Structure Alignment
Lee, En-Shiun Annie, January 2008
Threading is a protein structure prediction method that uses a library of template protein structures in two steps: first, the target sequence is matched against the template library and the best template structure is selected; second, the structure of the target sequence is modeled on the selected template structure. The slowing rate at which new folds are added to the Protein Data Bank suggests that the template structure library is approaching completeness. This thesis uses a new set of template-specific weights to improve the energy function for sequence-to-structure alignment in the template selection step of the threading process. The weights are estimated using least squares methods, with the quality of the modelling step in the threading process as the label. These new weights show an average 12.74% improvement in estimating the label. Further family analysis shows a correlation between the performance of the new weights and the number of seeds in Pfam.
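Estimating weights by least squares can be shown on a deliberately small case. The abstract does not list the energy terms or how many weights each template has, so the two-term energy function, the name `fit_two_weights`, and the numbers below are assumptions; the code solves the 2x2 normal equations directly.

```python
def fit_two_weights(features, labels):
    """Weights (w1, w2) minimizing sum((w1*a + w2*b - y)**2) over all samples."""
    saa = sum(a * a for a, _ in features)
    sab = sum(a * b for a, b in features)
    sbb = sum(b * b for _, b in features)
    say = sum(a * y for (a, _), y in zip(features, labels))
    sby = sum(b * y for (_, b), y in zip(features, labels))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("energy terms are collinear; weights are not unique")
    # Cramer's rule on the normal equations [[saa, sab], [sab, sbb]] w = [say, sby].
    return ((say * sbb - sab * sby) / det, (saa * sby - sab * say) / det)


# Hypothetical training data for one template: two energy terms per alignment,
# labelled with a model-quality value for the structure built from that alignment.
energy_terms = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
model_quality = [2.0, 3.0, 5.0]
print(fit_two_weights(energy_terms, model_quality))
```

Fitting a separate weight vector per template, as the abstract describes, would amount to repeating such a fit on each template's own training alignments.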
|