401 |
Conception de ligands protéiques artificiels par ingénierie moléculaire in silico / Design of artificial protein binders by in silico molecular engineering
Baccouche, Rym, 30 November 2012
Artificial mini-proteins able to target the catalytic sites of matrix metalloproteinases (MMPs) were designed using a functional motif grafting approach developed in the laboratory. The motif corresponds to the 4 N-terminal residues of TIMP-2, a broad-spectrum natural protein inhibitor of MMPs. Scaffolds able to reproduce the functional topology of this motif as observed in the TIMP-2/MMP-14 complex were identified by exhaustive screening of the Protein Data Bank (PDB) using the STAMPS software (Search for Three-dimensional Atom Motif in Protein Structure). Ten artificial protein binders satisfying the topological, steric and electrostatic-similarity criteria applied for selection were produced by chemical synthesis or recombinant expression, and their ability to inhibit a series of 6 MMPs was evaluated. All of the binders targeted the catalytic sites of MMPs, with affinities ranging from 450 nM to 590 μM prior to any optimization. The crystal structures of two artificial binders in complex with the catalytic domain of MMP-12 showed that the intermolecular interactions established by the functional motif in these artificial binders corresponded to those found in the TIMP-2/MMP-14 complex, albeit with some differences in their geometry. Molecular dynamics simulations of the 10 binders in complex with MMP-14 suggested that these scaffolds can partly reproduce the native intermolecular interactions, but that differences in the geometry and stability of some interactions could contribute to the lower affinity of the artificial protein binders compared with the natural one.
Nevertheless, these results show that the in silico design method used can provide sets of starting protein binders targeting a specific binding site with a good rate of success. This approach could constitute the first step of an efficient hybrid computational-experimental protein binder design approach.
|
402 |
Structural bioinformatics tools for the comparison and classification of protein interactions
Garma, L. D. (Leonardo D.), 08 August 2017
Abstract
Most proteins carry out their functions through interactions with other molecules. Thus, proteins taking part in similar interactions are likely to carry out related functions. One way to determine whether two proteins do take part in similar interactions is by quantifying the likeness of their structures. This work focuses on the development of methods for the comparison of protein-protein and protein-ligand interactions, as well as their application to structure-based classification schemes.
A method based on the MultiMer-align (or MM-align) program was developed and used to compare all known dimeric protein complexes. The results of the comparison demonstrate that the method improves on MM-align in a significant number of cases. The data were employed to classify the complexes, resulting in 1,761 different protein-protein interaction types. Through a statistical model, the number of protein-protein interaction types existing in nature was estimated at around 4,000. The model allowed a relationship to be established between the number of quaternary families (sequence-based groups of protein-protein complexes) and quaternary folds (structure-based groups).
The interactions between proteins and small organic ligands were studied using sequence-independent methodologies. A new method was introduced to test three similarity metrics. The best of these metrics was subsequently employed, together with five other existing methodologies, to conduct an all-to-all comparison of all the known protein-FAD (Flavin-Adenine Dinucleotide) complexes. The results demonstrate that the new methodology best captures the similarities between complexes in terms of protein-ligand contacts. Based on the all-to-all comparison, the protein-FAD complexes were subsequently separated into 237 groups. In the majority of cases, the classification divided the complexes according to their annotated function. Using a graph-based description of the FAD-binding sites, each group could be further characterized and uniquely described.
The study demonstrates that the newly developed methods are superior to the existing ones. The results indicate that both the known protein-protein and the protein-FAD interactions can be classified into a reduced number of types and that in general terms these classifications are consistent with the proteins' functions.
|
403 |
Structural and Mechanistic Features of Protein Assemblies with Special Reference to Spliceosome
Rakesh, Ramachandran, January 2016
Macromolecular assemblies such as the ribosome, the spliceosome and polymerases are essential for cellular functions. The current understanding of these important machineries and many other assemblies at the molecular level is poor. The lack of structural data for many macromolecular assemblies creates a further bottleneck in understanding cellular processes and the various disease manifestations. Hence, it is essential to characterize the structures and molecular architectures of these macromolecular assemblies.
Though the number of 3-D structures of individual proteins or domains in the Protein Data Bank (PDB) is growing, the number of structures deposited for macromolecular assemblies is relatively small. Hence, alongside experimental techniques for characterizing macromolecular assembly structures, computational techniques can help increase the number of available assembly structures. This thesis deals with the use of integrative approaches, in which computational methods are combined with experimental data, to model and understand the mechanistic features of macromolecular assemblies, with a special focus on a sub-complex of the spliceosome machinery.
Chapter 1 of this thesis provides an introduction to protein-protein interactions and macromolecular assemblies. Further, the modelling of macromolecular assemblies using integrative methods is discussed, followed by an introduction to the spliceosome machinery.
In chapter 2, modelling studies were performed on the proteins involved in the general amino acid control mechanism, which is triggered in yeast under amino acid starvation conditions. The proteins studied were Gcn1, a ribosome-binding protein, and the RWD-domain-containing proteins Gcn2, Yih1, Gir2 and Mtc5. Laboratory experiments have shown that, for activation of Gcn2 (an eIF2α kinase), its RWD domain has to bind to Gcn1, and that the residue Arg-2259 is important for this interaction. As the 3-D structure of the Gcn1 region containing Arg-2259 is not currently available, it was inferred using fold recognition and comparative modelling techniques. Further, in order to understand the molecular interaction between the Gcn2 RWD domain and Gcn1, a complex structure was inferred using a restrained protein-protein docking procedure. As the proteins Yih1 and Gir2 are known to bind to Gcn1 using their RWD domains, the structures of the RWD-domain-containing proteins, including Mtc5, were first inferred using an NMR structure of the Gcn2 RWD domain. The Gcn1-Gcn2 complex was then used to build a set of complexes to explain the binding of the other RWD-domain-containing proteins Yih1, Gir2 and Mtc5. The important molecular interactions were obtained by analysing the interacting residues in these complexes. Thus, the Gcn1-Gcn2 interaction at the molecular level has been proposed for the first time. Future experiments guided by the protein-protein complex models and the proposed set of mutations should provide an understanding of the critical molecular interactions involved in the general amino acid control mechanism.
Chapter 3 describes an integrative approach that was used to derive a pseudo-atomic model of the closed form of the human SF3b complex. SF3b is a multi-protein complex containing seven components: p14, SF3b49, SF3b155, SF3b145, SF3b130, SF3b14b and SF3b10. It recognizes the branch point adenosine in the pre-mRNA as part of the U2 snRNP or the U11/U12 di-snRNP in the spliceosome. Although the cryo-EM map of the human SF3b complex has been available for more than a decade, the structure and relative spatial arrangement of all components in the complex are not yet known. The integrative modelling approach used here combined structural data in the form of available X-ray and NMR structures, fold recognition and comparative modelling, and currently available experimental datasets with the available cryo-EM density map to provide a model with high structural coverage. The resulting molecular architecture of the closed form of the human SF3b complex can now provide insights into the functioning of SF3b in splicing. It might also help future efforts towards high-resolution structure determination of the entire human spliceosome machinery.
In chapter 4, the molecular architecture of the closed form of the SF3b complex obtained with the integrative modelling approach (Chapter 3) is discussed extensively. The structure-function relationships of some of the SF3b components, based on the pseudo-atomic model, are also described. In addition, the extreme flexibility associated with some of the SF3b components is examined on the basis of dynamics analysis. Further, using an existing U11/U12 di-snRNP cryo-EM map and the pseudo-atomic model of the closed-form SF3b complex, an open form of the SF3b complex was modelled and the component structures were fitted into it. It was found that the transition between the closed and open forms is primarily caused by a flap containing the HEAT repeat protein SF3b155. This protein is also known to harbour cancer-causing mutations, which have the potential to affect the closed-to-open transition as well as the structure and stability of the SF3b complex. Thus, this provides a framework for the future understanding of the closed-to-open transition in SF3b functioning within the spliceosome.
Chapter 5 builds upon the integrative modelling approach (Chapter 3) that proposed the molecular architecture of the closed form of the human SF3b complex, and upon the open form of SF3b (Chapter 4), which was derived from a flap opening of the closed form and which might help accommodate RNA and other trans-acting factors within the U11/U12 di-snRNP. In this chapter, the SF3b open form and its interaction with the RNA elements are studied. The 5' end of U12 snRNA and its interaction with pre-mRNA in the branch point duplex were modelled, guided by the open form of SF3b, which provided the necessary structural constraints; the resulting RNA model is topologically consistent with the existing biochemical data. Further, using the SF3b open form-RNA model and existing experimental knowledge, an extensive discussion is provided on how the architecture of SF3b acts as a scaffold for U12 snRNA:pre-mRNA branch point duplex formation, as well as on its potential implications for the fidelity of branch point adenosine recognition. Moreover, the reasons for defining SF3b as a “fuzzy” complex (a complex with highly flexible folded regions along with intrinsically disordered regions) are also discussed. Hence, the current work adds to the excellent developments made previously and deepens the understanding of the structure-function relationship of the human SF3b complex in the context of the spliceosome machinery.
In chapter 6, a methodology is proposed for using the evolutionary conservation of protein-protein interface residues in multiple-protein cryo-EM density-based fitting, that is, the fitting of protein components into low-resolution density maps of multi-protein assemblies. First, the methodology was tested on a dataset of simulated density maps generated at four different resolutions: 10, 15, 20 and 25 Å. When evolutionary conservation scores obtained from multiple sequence alignments were used to score the fitted complexes, the conservation scores were found to be lower than those of the crystal structures that were used to generate the simulated density maps. Further, the ability of the multiple-protein density fitting technique to align the actual protein-protein interface residues correctly was assessed using a performance metric called the F-measure, which showed a decrease in performance as the resolution became poorer. Hence, both the evolutionary conservation scores and the F-measure indicate that the decrease in conservation scores and performance is mainly due to errors associated with the fitting process.
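The F-measure mentioned above can be illustrated with a short sketch. The abstract does not give the exact definition used in the thesis, so this assumes the standard F-measure (harmonic mean of precision and recall) computed over sets of interface residues; the function name and the toy residue identifiers are hypothetical.

```python
def interface_f_measure(predicted, actual):
    """Standard F-measure over two collections of interface residues.

    Assumption: residues are hashable identifiers such as (chain, number).
    This is the textbook definition, not necessarily the thesis's variant.
    """
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)  # fraction of fitted contacts that are real
    recall = true_pos / len(actual)        # fraction of real contacts recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: interface residues in a crystal structure
# and in a model fitted into a low-resolution map.
native = {("A", 10), ("A", 11), ("B", 42), ("B", 43)}
fitted = {("A", 10), ("A", 12), ("B", 42), ("B", 43)}
score = interface_f_measure(fitted, native)
```

A score of 1.0 would mean the fitted model reproduces exactly the native interface residues; lower values reflect missed or spurious interface residues.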
Subsequently, a refinement methodology involving the use of conservation scores was designed. It improved the accuracy of the fitted models, and the same improvement was observed in an experimental cryo-EM density test case, the RyR1-FKBP12 complex. Hence, the conservation information acts as an effective filter for identifying incorrectly fitted structures and improves the accuracy with which protein structures are fitted into density maps. Thus, information on conserved surface residues can be incorporated into current density fitting tools to reduce ambiguity and improve the accuracy of macromolecular assembly structures determined using cryo-EM.
In the concluding chapter 7, the lessons learned about the structural and mechanistic features of protein assemblies from the use of computational techniques and the integration of experimental datasets are discussed. In chapter 2, a binary macromolecular complex, the Gcn1-Gcn2 complex, was modelled using computational structure prediction strategies to understand the molecular basis of its interaction. Because of the potential inaccuracies of purely computational modelling, chapters 3 to 5 used integrative approaches, primarily guided by the cryo-EM map, to decipher the molecular architecture of the human SF3b complex in the closed and open forms as well as its contribution to branch point adenosine recognition. Based on the experience gained in modelling assemblies using cryo-EM data in the previous chapters, a new method was proposed that uses evolutionary conservation information to improve the accuracy of cryo-EM density-based fitting. Hence, these studies have provided strategies for modelling macromolecular assemblies as well as a deeper understanding of their mechanistic features.
|
404 |
Analysis Of Structural And Functional Types Of Protein-Protein Interactions
Nambudiry Rekha, * 02 1900
No description available.
|
405 |
Protein Structure Networks : Implications To Protein Stability And Protein-Protein Interactions
Brinda, K V, 08 1900
No description available.
|
406 |
Probing Macromolecular Reactions At Reduced Dimensionality : Mapping Of Sequence Specific And Non-Specific Protein-Ligand Interactions
Ganguly, Abantika, 03 1900
During the past decade, the effects of macromolecular crowding on reaction pathways have gained prominence. The emphasis is on moving beyond ideal-solution studies and making conceptual modifications that treat non-ideality as a variable in our calculations. In recent years it has been shown that molecular crowding exerts significant effects on all in vivo processes, from DNA conformational changes and protein folding to DNA-protein interactions, enzyme pathways and signalling pathways. Both thermodynamic and kinetic parameters vary by orders of magnitude between uncrowded buffer systems and the crowded cellular milieu. Ignoring these differences will restrict our knowledge of biology to a “model system” of limited practical relevance. The recent expansion of the genome database has stimulated the study of numerous previously unknown proteins. This has whetted our appetite to model the cellular determinants in a more comprehensive manner. Intracellular extract would have been the ideal solution in which to re-create the cellular environment. However, studies conducted in such an extract are contaminated by interference from other biologically active molecules, and relevant statistical data cannot be extracted from them. Recent methodological advances for mimicking cellular crowding include the use of inert macromolecules to reduce the volume available to the target molecules and the use of immobilization techniques to increase the surface density of molecules in a small volumetric region. The use of crowding agents often results in non-specific interactions and side reactions, such as aggregation of the target molecules with the crowding agents themselves. Immobilization of one of the interacting partners reduces the probability of aggregation and precipitation of bio-macromolecules by restricting their degrees of freedom.
Covalent linkage of molecules to a solid support is used extensively in research to create a homogeneous surface of bound molecules that can be interrogated for their reactivity. However, when it comes to biomolecules, direct immobilization on a solid support or the use of organic linkers often results in denaturation. The use of bio-affinity immobilization techniques can help overcome this problem. Since only mild conditions are needed to regenerate such a surface, it finds universal applicability as bio-memory chips. This thesis focuses on our attempts to design a physiologically viable immobilization technique for following protein-protein and protein-DNA interactions. The work explores the mechanism of biological interactions related to the transcription process in E. coli.
Chapter 1 presents a literature survey of the importance and effects of molecular crowding on biological reactions. It gives a brief history of the efforts made so far by experimentalists to mimic macromolecular crowding and of the methods applied. The chapter aims to give an all-round perspective on the pros and cons of different immobilization techniques as a means of achieving a high surface density of molecules, and on the advances made so far.
Chapter 2 deals with the detailed technicality and applicability of the Langmuir-Blodgett method. It discusses the rationale behind our developing this technique as an alternate means of bio-affinity immobilization, under physiologically compatible conditions. It then goes on to describe our efforts to follow the sequence-specific and sequential assembly process of a functional RNA polymerase enzyme with one immobilized partner and also explore the role of omega subunit of RNAP in the reconstitution pathway. This chapter uses the assembly process of a multi-subunit enzyme to evaluate the efficiency of the LB system as a universal two-dimensional scaffold to follow sequence-specific protein-ligand interaction.
Chapter 3 discusses the application of the LB technique to quantitatively evaluate the kinetics and thermodynamics of the promoter-RNA polymerase interaction under conditions of reduced dimensionality. Here, we follow the interaction of the T7A1 phage promoter with Escherichia coli RNA polymerase using our Langmuir-Blodgett technique. The changes in the mechanistic pathway and the trapping of kinetic intermediates that result from the imposed restriction in the degrees of freedom of the system are discussed in detail. The sensitivity of this detection method is compared with that of conventional immobilization methods such as SPR. This chapter firmly establishes the universal application of the LB technique as a means of emulating molecular crowding and as a sensitive assay for studying the effects of such crowding on vital biological reaction pathways.
Chapter 4 describes the mechanistic pathway for the physical binding of MsDps1 protein with long dsDNA in order to physically protect DNA during oxidative stress. The chapter describes in detail the mechanism of physical sequestering of non-specific DNA strands and compaction of the genome under conditions where a kinetic bottleneck has been applied. The data obtained is compared with results obtained in the previous chapter for the sequence-specific DNA-protein interaction in order to understand the difference in recognition process between regulatory and structural proteins binding to DNA.
Chapter 5 deals with the evaluation of the σ-competition model in E. coli for three different sigma factors (all belonging to the σ-70 family). Here again, we have evaluated the kinetic and thermodynamic parameters governing the binding of core RNAP to its different sigma factors (σ70, σ32 and σ38) and performed a comparative study of the binding of each sigma factor to its core using two different non-homogeneous immobilization techniques. The data have been analyzed globally to resolve the discrepancies associated with establishing the relative affinity of the different sigma factors for the same core RNA polymerase under physiological conditions.
Chapter 6 summarizes the work presented in this thesis. In the Appendix section we have followed the unzipping of promoter DNA sequence using Optical Tweezers in an attempt to follow the temporal fluctuations occurring in biological reactions in real time and at a single molecule level.
|
407 |
Functionally Interacting Proteins : Analyses And Prediction
Mohanty, Smita, 11 1900
Functional interaction of proteins is a broad term encompassing many different types of associations that are observed amongst proteins. It includes direct non-covalent interactions where the interacting proteins physically associate using an interface. There are also many protein-protein interactions where the proteins concerned are not involved in direct physical interactions but affect each other’s functions. The central focus of this thesis is to understand the various aspects of functionally interacting proteins. Chapter 1 of this thesis provides an introduction to functional interactions between proteins and discusses the key developments available in the literature. This chapter discusses the different types of functional associations commonly observed between proteins. Various approaches developed over time to elucidate such interactions are also discussed. This chapter highlights how functional interactions between proteins have been helpful in understanding different cellular processes such as the organization of metabolic pathways. The chapter emphasizes the importance of functional interactions between proteins, providing a motivation for the development of methods with enhanced accuracy and sensitivity for the prediction of functional interactions. In this thesis, domain families which are found to co-exist in multidomain proteins have been used to understand and subsequently predict functional associations amongst proteins. Domains in proteins typically serve as modules associated with specific functions. There exist proteins with a single domain which describes the entire function of the protein, while there also exist proteins containing multiple domains, where the various domains in unison describe the complete function of the multidomain protein. Therefore, by virtue of “guilt by association”, domain families found together in multidomain proteins are functionally linked.
This forms the basic premise for understanding functional association amongst proteins and is explained in great detail in the Introduction chapter. Using domain families which co-occur in multidomain proteins as the basis for functional association has many merits. First, as stated before, constituent domain families act as effective descriptors of the function(s) of proteins. For example, members of the SH3 domain family mediate protein-protein interactions by binding to regions with polyproline conformation, irrespective of the multidomain protein in which they occur. Thus, studies of domain families co-existing in multidomain proteins act as an accurate resource of functional associations between proteins. Also, assignment of domains to a protein relies on homology detection, which has achieved a high level of reliability, thus resulting in reasonably accurate prediction of functions. Such approaches enable exhaustive coverage of many diverse proteins, including many multidomain proteins, leading to the detection of large numbers of functional associations between domains of multidomain proteins. Given the advantages attributed to functionally linked domain families in furthering the understanding of functional associations, it is imperative to exhaustively enumerate all possible pairs of functionally linked domain families in multidomain proteins and study their various properties. This aspect is covered in the second chapter of the thesis.
In the second chapter, an analysis of domain families which co-occur in multidomain proteins, termed 'tethered domain families', is reported. For this analysis, a large dataset of multidomain proteins was considered from a diverse set of fully sequenced genomes from many eukaryotic and prokaryotic organisms. In every multidomain protein, all possible unique pairs of domain families have been considered, and they are assumed to be under the same functional/evolutionary constraint. Thus, from the entire dataset of multidomain proteins, all possible pairs of tethered domain families are obtained. For a given domain family, the number of other uniquely tethered families is referred to as the tethering number of that domain family. Therefore, the tethering number of a domain family is an indicator of the diverse functional contexts in which a particular domain family is involved. Further analysis was carried out to understand various other attributes of domain families and their relation to tethering number. The results are summarized in the following points:
1) Distribution of tethering numbers of domain families in the entire dataset is found to be highly heterogeneous. Nearly 88% of domain families (10783 out of 12249 domain families) have tethering number of 10 or less and only 78 domain families show more than 100 unique associations. Further analysis reveals bias in functions of families showing high and low tethering numbers. The domain families with high tethering numbers are involved in processes such as signaling and protein-protein interactions. The domain families with low tethering numbers are often found to be involved in metabolic processes.
2) Differences are also observed in the type of organisms containing the domain families and their tethering numbers. Typically, domain families with high tethering numbers are ubiquitously found across almost all the kingdoms of life. In contrast, most of the domain families exclusively found in one kingdom have low tethering numbers. Furthermore, for the ubiquitously occurring domain families with high tethering numbers, the number of associations made and the type of associations are not strictly conserved across the kingdoms. Thus, the tethering preferences of such domain families vary across the kingdoms depending on their function. For instance, the protein kinase domain family, which is a key regulator of signaling processes in eukaryotes, has a high tethering number in eukaryotes (270) and a lower tethering number in prokaryotes (96).
3) Tethering number of domain families is found to be correlated with the number of members (population) comprising a family. A Pearson correlation coefficient of 0.78 at a p-value ≤0.001 is obtained for the correlation between tethering number of domain families and their population.
4) Tethering numbers of domain families are also found to be well correlated with sequence and functional diversity within families. Thus, domain families with high tethering numbers comprise members showing diversity in both sequence and function.
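The tethering number and its correlation with family population, as described in the points above, can be sketched as follows. This is a minimal illustration only: it assumes that two families count as tethered when they co-occur in the same multidomain protein, and the family names and toy architectures are invented for the example rather than taken from the thesis.

```python
from collections import defaultdict
from math import sqrt


def tethering_numbers(proteins):
    """Map each domain family to the number of distinct other families
    it co-occurs with in at least one protein (assumed definition)."""
    partners = defaultdict(set)
    for families in proteins:
        unique = set(families)
        for fam in unique:
            partners[fam] |= unique - {fam}
    return {fam: len(others) for fam, others in partners.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical domain architectures (one list of families per protein).
toy_proteins = [
    ["Pkinase", "SH2", "SH3"],
    ["Pkinase", "PH"],
    ["SH3", "PH"],
    ["Enolase_N", "Enolase_C"],
]
toy_tethering = tethering_numbers(toy_proteins)
```

With real data, `pearson` would be applied to the per-family tethering numbers and the per-family member counts; the toy set here is too small for such a correlation to mean anything.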
Thus, the work presented in the second chapter provides a framework for understanding the tethering preferences of domain families. The use of tethered domain families to identify functional associations amongst proteins is the central theme of the third and fourth chapters of this thesis. The use of tethered domain families for the prediction of functionally interacting proteins originates from the “Rosetta stone” approach, which was proposed by Ouzounis and coworkers and by Eisenberg and coworkers in 1999. The Rosetta stone approach demonstrated the use of fused genes in predicting functional interactions. It stems from the observation that genes encoding proteins that act in the same metabolic pathway in one organism are often found fused into a single gene in another organism. Thus, enumeration of 'fused genes' in a template database could provide a good basis for prediction of functionally interacting proteins in target organisms in which the homologous genes are not fused. The method has been shown, by others, to work quite effectively in prokaryotes, especially in the identification of interactions between metabolic proteins. Chapter 3 of this thesis explores the idea of “Rosetta stones” at the level of domain families, by considering tethered domain families as analogs of fused genes. In this analysis, tethered domain families derived from multidomain proteins comprise the template dataset. If members of two domain families occurring together in a multidomain protein are found to occur independently in two different proteins in the target organism, then an interaction is predicted between these two proteins (the collection of such predicted interactions is henceforth referred to as the TEDIP database, for Tethered Domain-based Interaction Prediction). During this analysis, care is taken that none of the proteins in the template dataset belongs to the target organisms.
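The prediction rule described above can be sketched in a few lines. This is a hedged illustration, not the TEDIP implementation: the function names, the reading of "occur independently" (each family present in only one protein of the pair), and the toy family and protein identifiers are all assumptions made for the example.

```python
from itertools import combinations


def tethered_pairs(template_proteins):
    """Collect every pair of families that co-occur in a template protein."""
    pairs = set()
    for families in template_proteins:
        for a, b in combinations(sorted(set(families)), 2):
            pairs.add((a, b))
    return pairs


def predict_interactions(template_proteins, target_proteins):
    """Predict an interaction between two target proteins when one carries
    family a, the other carries family b, and (a, b) is tethered in the
    template set. target_proteins maps protein id -> list of families."""
    pairs = tethered_pairs(template_proteins)
    predictions = set()
    for (p, fams_p), (q, fams_q) in combinations(sorted(target_proteins.items()), 2):
        fp, fq = set(fams_p), set(fams_q)
        for a, b in pairs:
            # Assumed reading of "independently": each family sits in
            # exactly one of the two proteins, never both in the same one.
            forward = a in fp and b in fq and b not in fp and a not in fq
            reverse = b in fp and a in fq and a not in fp and b not in fq
            if forward or reverse:
                predictions.add((p, q))
                break
    return predictions


# Hypothetical template architectures and target proteome.
toy_template = [["FamA", "FamB"], ["FamC", "FamD"]]
toy_target = {"P1": ["FamA"], "P2": ["FamB"], "P3": ["FamD"], "P4": ["FamE"]}
toy_predictions = predict_interactions(toy_template, toy_target)
```

In this toy case only P1 and P2 are linked, because FamC has no member in the target set. The requirement that template proteins exclude the target organism would be enforced when assembling `toy_template`, which the sketch leaves to the caller.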
The entire analysis has been conducted on six model organisms, which act as the target dataset in which functional interactions between proteins are predicted. The effectiveness of tethered domain families in functional interaction prediction is compared with two other datasets: 1) all experimentally known interactions and 2) interactions predicted on the basis of homology with interacting domain families of known structure. Subsequently, an attempt has been made to answer two questions: 1) how effective is the information on tethered domain families in predicting functional linkages amongst proteins operating in pathways in eukaryotic organisms? 2) what is the false positive rate of the predictions? The above-mentioned datasets show very little overlap in their coverage of functional interactions. This is largely attributed to insufficient sampling and the inherent bias of each method. The TEDIP datasets in the six organisms yielded, on average, three-fold more functional interaction predictions in cellular pathways than the other two datasets. Nearly 90% of the predicted interactions derived from tethered domain families are between proteins in different pathways. In yeast, more than 60% of such interactions were found to overlap with a recent large-scale genetic interaction screen based on synthetic lethality performed specifically for metabolic proteins, thus establishing the effectiveness of this approach in understanding pathway crosstalk. Along with efficacy in identifying functional interactions, an assessment based on co-localization, co-expression and overall functional similarity based on Gene Ontology (GO) terms was carried out. It was found that the TEDIP predictions and experimentally found interactions show poor correspondence with co-expression and co-localization data (10% and 20% respectively for the two methods).
Additionally, it was found that functional similarity between predicted interacting proteins in the TEDIP dataset is low (5%) and is comparable to that of experimentally known interactions, which show 10% similarity in function based on a scoring function for GO term similarity. From Chapter 3, it was concluded that the use of tethered domain families is effective in exhaustive enumeration of functionally associated proteins. However, the low co-expression and functional similarity measures are a cause for concern. On the one hand, co-expression and GO functional similarity have been found to be weak predictors of functional interactions, explaining the low values obtained both for predictions in the TEDIP datasets and for experimentally known interactions. On the other hand, the poorer values shown for predictions in the TEDIP datasets suggest that further improvement in prediction accuracy is possible. Chapter 4 explores the use of machine learning in improving the accuracy of functional interaction prediction based on the TEDIP dataset.
In Chapter 4, two distinct machine learning approaches have been employed on a training dataset derived exclusively from yeast. Since the objective of the work is to improve the accuracy of prediction of functional interactions, the GO-based functional similarity measures have been used to define positive and negative datasets. Thus, in the training dataset, the positive set comprises protein pairs which show high GO similarity in function as defined in Chapter 3 (10% of this set overlaps with experimentally known interactions), while the negative set consists of protein pairs with no or insignificant similarity in their functions that additionally do not show similarity to any experimentally known interactions. Two machine learning approaches, namely support vector machine (SVM) and random forest, have been used on this training dataset. Using two distinct approaches helps to offset the weaknesses, if any, of either method. Fourteen carefully chosen features have been utilized during training to help distinguish potentially correct predicted interactions from incorrect ones. Some of the 14 features quantify the extent of similarity between the template proteins containing the fused domain families and the target protein pairs predicted to interact. The analysis also incorporates graph theory based parameters derived from a domain family based graph. In such a graph, each domain family involved in forming multidomain proteins is a node, and an edge is constructed between domain families which are found to co-exist in at least one multidomain protein. Graph theory based parameters such as clustering coefficient, degree and topological overlap have been employed.
These are useful in appropriately down-weighting domain family pairs that show a large number of associations and are therefore expected to be promiscuous in their functions. These features also help in identifying domain family pairs which are functionally related. Apart from the above-mentioned features, coevolution and phylogenetic profiling of tethered domain families are also utilized to identify functionally related domain family pairs. Utilizing all these features in training, the machine learning approach yielded an accuracy of 94% using SVM and 92% using random forest against the training data. Furthermore, the importance of using all these features has been addressed by performing principal component analysis, by training both SVM and random forest with one feature removed at a time, and by quantifying the sensitivity obtained using only one feature. All of these suggest that the features used provide non-redundant information and contribute significantly to the classification. The models so generated were finally used on all the predicted functional interactions in yeast after removal of the training dataset. The true positives observed were 56% using SVM and 63% using random forest, with around 80% of the interactions common to the two methods. Further analysis has been carried out on these interactions by first assigning a confidence score to each interaction using support vector regression, which provides a probabilistic measure for SVM classification. Based on a cutoff of 0.5, 62455 interactions in total were termed high confidence interactions. Of these, 2855 interactions were such that both proteins predicted to interact could be associated with a pathway in the KEGG database. In-depth case studies have been performed on this dataset of 2855 interactions.
Literature mining suggested that many known cross-pathway interactions, such as those between the TCA cycle and glycolysis, are captured as high confidence interactions using the TEDIP dataset. A few other case studies of high confidence interactions with supporting literature evidence are also presented in the chapter. These predictions could further aid in experimental characterization of pathway cross-talk between important metabolic and signaling pathways.
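The graph-based features mentioned for Chapter 4 (degree, clustering coefficient and topological overlap on a domain family co-occurrence graph) can be illustrated with a small sketch. The abstract does not give the exact definitions used in the thesis, so the formulas below are the commonly used ones and should be read as an assumption; the family names and architectures are invented.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(proteins):
    """Nodes are domain families; an edge joins two families that
    co-exist in at least one multidomain protein."""
    adj = defaultdict(set)
    for families in proteins:
        for a, b in combinations(set(families), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, v):
    """Number of neighbours of node v."""
    return len(adj[v])


def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbours of v that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def topological_overlap(adj, u, v):
    """Assumed form: (shared neighbours + a_uv) / (min(k_u, k_v) + 1 - a_uv),
    where a_uv is 1 if u and v are directly linked and 0 otherwise."""
    a_uv = 1 if v in adj[u] else 0
    shared = len(adj[u] & adj[v])
    return (shared + a_uv) / (min(len(adj[u]), len(adj[v])) + 1 - a_uv)


# Hypothetical architectures giving edges A-B, A-C, B-C, A-D and C-D.
toy_graph = build_graph([["A", "B", "C"], ["A", "D"], ["C", "D"]])
```

A family pair with high degree on both sides but low topological overlap would be a natural candidate for down-weighting as promiscuous, which is the use the text describes; how the thesis combines these values into features is not stated in the abstract.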
So far, the thesis has discussed analyses involving functional interactions and their prediction. In the subsequent chapters, analyses pertaining to two different types of functional interactions are discussed. Chapters 5 and 6 involve analyses of metabolic proteins in diverse pathways in the pathogenic organism Plasmodium falciparum. Chapter 5 attempts to improve the coverage of the repertoire of metabolic proteins in P. falciparum, while in Chapter 6 interactions and pathways prevalent in different stages in the life cycle of the parasite are deciphered and discussed. Apart from functionally interacting proteins in metabolic pathways, physically and transiently interacting proteins have been analyzed and discussed in Chapters 7 and 8. In Chapter 5, metabolic proteins participating in pathways in P. falciparum have been analyzed. P. falciparum is the causative agent of malaria, a disease which affects large populations in the subtropical regions. The P. falciparum genome is atypical: it is rich in adenine/thymine pairs, and large stretches of amino acid repeats are encoded in protein coding regions. Various sequence-related features of P. falciparum proteins show extensive divergence when compared with those of other organisms. All of these have made reliable function prediction by homology to other proteins with known functions daunting. Like other proteins in P. falciparum, metabolic proteins have also diverged significantly from their functional counterparts in model eukaryotes such as yeast. Metabolic pathways play an important role in the survival of the organism and hence are well suited to the identification of proteins susceptible to drugs, which could help in combating pathogenesis. Chapter 5 of the thesis aims at furthering knowledge pertaining to metabolic proteins by first quantifying the extent of divergence observed in the already characterized metabolic proteins.
This knowledge is further used in the identification of potential metabolic proteins which have not been identified as proteins involved in metabolic pathways by other annotation efforts undertaken for P. falciparum. In the first part of the chapter, the extent of divergence in the sequences of metabolic proteins in P. falciparum has been determined by comparing the P. falciparum proteins with their functional counterparts from 34 completely sequenced unicellular eukaryotic organisms. Comparison of domain architectures between the P. falciparum proteins and their functional counterparts reveals that in nearly 54% of metabolic pathways, proteins show nearly the same domain architecture as their functional counterparts. Inversion, deletion and duplication of domains are observed in the rest of the proteins. Further analysis reveals that P. falciparum proteins are longer than their functional counterparts. It was also observed that in nearly 15% of the cases, the domains are characterized by the presence of large non-conserved or Plasmodium genus-specific inserts within the domain-assigned regions. Unassigned regions are also prevalent at the N- and C-termini of P. falciparum proteins when compared with their functional counterparts. Finally, it was also observed that metabolic proteins of P. falciparum show significantly low sequence similarity to their functional counterparts. From this analysis, it can be clearly seen that metabolic proteins of P. falciparum have significantly diverged from such proteins in other organisms, thus making function prediction by homology very difficult.
There are several steps in metabolic pathways in P. falciparum which are expected to be active based on experimental analysis. However, some of the proteins expected to carry out these functions have not been identified so far. One of the reasons for this apparent incompleteness is the high divergence observed in the metabolic proteins of P. falciparum. To overcome this limitation, in the second part of the chapter, a sensitive approach based on domain family assignment (MulPSSM), developed in-house, has been used to identify proteins which are potentially involved in metabolic pathways. The approach is based on reverse PSI-BLAST, where multiple sequence profiles for each family are used to search against sequence databases. This approach has been shown to be better than or on par with other remote homology detection procedures. Using this approach, 15 P. falciparum proteins have been identified which can potentially function as metabolic proteins and had not been characterized in P. falciparum so far.
All the proteins identified by the approach show low sequence similarity to other well characterized proteins and contain significant fractions of unassigned regions, thus making function recognition non-trivial. Supporting literature and other data are provided to demonstrate the robustness of the homology-based annotation of the identified pathway proteins. Chapter 6 is an analysis of the dynamic changes occurring in the metabolic network of P. falciparum during its life cycle. In this chapter, two aspects of P. falciparum metabolic proteins have been integrated and analyzed: first, the dataset of protein-protein interactions derived from experimental studies, and second, the datasets of microarray analysis providing information on stage-specific expression of P. falciparum genes corresponding to the metabolic proteins. As a first step, protein-protein interaction information for the metabolic proteins was gathered. A total of 810 interactions have been obtained in which one or both proteins are involved in a pathway. Subsequently, these interactions were compared with 14070 interactions involving metabolic proteins from the free-living, non-pathogenic unicellular eukaryote yeast. Comparison across the two organisms shows wide discrepancy in the number of proteins involved in interactions and also in the pathways in which they participate. Out of the 810 interactions in P. falciparum, 173 are found uniquely in Plasmodium, where one or both of the proteins have no identifiable homolog in yeast. Insufficient sampling of the interactions made by proteins in P. falciparum in comparison to yeast is one of the reasons for the observed discrepancy. However, differences due to the parasitic lifestyle of P. falciparum could also be a potential reason. Further analysis of the protein-protein interactions of the metabolic proteins revealed that a large fraction of interactions are made between a metabolic protein and a non-metabolic protein.
For instance, an interaction is observed between the glycolytic protein phosphoglycerate kinase and MAP kinase. This trend is observed in both Plasmodium and yeast, where 65% and 77% of the interactions, respectively, involve proteins not directly participating in metabolic pathways. Further, interactions between proteins belonging to different pathways and, lastly, interactions between proteins in the same pathway are uncovered. All of these interactions depict the different modes by which metabolic pathways are regulated through protein-protein interactions. Another aspect explored in this analysis is the stage-specific expression of genes encoding these metabolic proteins. The analysis is especially relevant in the parasite because its entire life cycle is divided into seven distinct stages. Upon integrating the protein-protein interactions with the gene expression data, it became apparent that the trophozoite, schizont and gametocyte stages show large fractions of co-expressed genes encoding proteins involved in protein-protein interactions within metabolic pathways. The high preponderance of co-expressed genes encoding interacting protein pairs in these stages is also consistent with the metabolic requirements of Plasmodium in the various stages. The glycolytic pathway is central to energy production in the parasite and is discussed at length in this chapter. Members of this pathway interact with other glycolytic proteins (9 such interactions), with proteins involved in other pathways (30 interactions) and with proteins not involved directly in any metabolic pathway (75 interactions). Nearly 70% of the interactions made by the glycolytic proteins involve proteins encoded by genes found to be co-expressed across the various stages. Integration of gene expression data with protein-protein interaction information for metabolic pathways such as the glycolytic pathway thus highlights the complex mode of regulation underlying this pathway.
The analysis carried out in this chapter emphasizes the intricacies involved in the regulation of metabolic proteins in P. falciparum.
Chapter 7 describes an in-depth analysis carried out to understand the basis for interaction specificity between small monomeric GTPases and their regulators, the Guanine nucleotide Exchange Factors (GEFs). Monomeric GTPases bind guanine nucleotides, both GTP and GDP. However, the transition from the GDP-bound to the GTP-bound form occurs with large conformational changes and requires binding of the GEFs. The conformational changes that arise due to the nucleotide exchange are required for the GTPases to bind to their various effectors. For the analysis carried out in Chapter 7, GTPases belonging to the Ras superfamily have been considered. The superfamily is further subdivided into 5 distinct families based on their functions. The 5 families are Ras, Ran, Rab, Arf and Rho. Members belonging to each of these families are involved in a wide array of cellular processes such as signaling and cytoskeletal remodeling. Members of each of these GTPase families bind to structurally distinct GEFs, and in some cases, multiple GEFs are involved in nucleotide exchange within a family. It is therefore intriguing to understand how GTPases belonging to the same structural family maintain specificity across the highly dissimilar GEFs, and this forms the main objective of this analysis.
So far, 13 distinct complexes between GTPases and their cognate GEFs have been solved using X-ray crystallography. This set of structural complexes forms the starting point of the analysis. As a first step, pairwise structural comparison of the interfaces has been made between various pairs of complex structures. Based on these comparisons, it is apparent that most of the interfaces in the GTPase and GEF complexes comprise residue positions which are topologically not equivalent, suggesting different modes of binding across these complexes. Further analysis was carried out to probe the extent of specificity underlying these complexes. This is achieved by determining interface residues which are found to be conserved in a family-specific manner. Such residue positions have been obtained by using a statistically robust algorithm, Contrast Hierarchical Alignment and Interaction Network (CHAIN), that extracts the sequence patterns most distinguishing two sets of homologous sequences. The analysis indicated the presence of family-specific residues at the GTPase and GEF interface. Such residues could be implicated in maintaining the specific interactions between the GTPases and the GEFs. The robustness of the specificity of the interactions was further interrogated by providing an energetic basis for the specificity in the interactions mediated by the cognate GTPases and GEFs and also by understanding how crosstalk is prevented across the non-cognate complexes. For each of the 13 cognate complexes, empirical interaction energies have been estimated using FoldX. The interaction energy is compared to that of non-cognate complexes, which are obtained by swapping the interface residues of the cognate GTPase with the non-cognate GTPase residues. For most of the complexes, it was observed that the interaction energies for the cognate complexes are much lower than those of the non-cognate complexes.
Energy values across the non-cognate complexes are usually indicative of reduced stability, thereby precluding such interactions from occurring. Such large energy differences between cognate and non-cognate interactions arise from drastic substitutions at the interface patch, involving differences in the charge or other stereochemical properties of the amino acids. Both evolutionary and energy-based analyses indicate the presence and importance of a few family-specific residues in the cognate complexes and also the presence of unfavorable residues in the non-cognate complexes, thus preventing crosstalk. However, apart from changes at the interfaces, many positions outside the interface also undergo changes across the various homologs within the same family/subfamily of GTPases. Coevolutionary analysis of GTPases and GEFs from multiple eukaryotic organisms has been carried out for these complexes, and it was observed that most of the coevolving positions are not found at the interface.
Many of these residue positions are near the active site or near the interface. Identification of such coevolving positions, where residue variations in the GTPase are strongly coupled to the GEF, may provide initial clues to the possible allosteric path connecting the binding of the GEF to the large structural changes observed during GTP exchange in GTPases. Thus, the analysis provides a comprehensive framework to understand how interaction specificity has evolved between the GTPase and GEF complexes. Chapter 8 discusses another example of transient protein-protein interaction, observed between proteins implicated in signaling processes in Dictyostelium discoideum. The work reported in this chapter was carried out in collaboration with Prof. Nanjundaiah and coworkers from the Molecular Reproduction and Developmental Genetics department, Indian Institute of Science. All the experimental analyses mentioned in this chapter were carried out by Prof. Nanjundaiah and coworkers, and the author carried out all the computational analysis. Experimental analysis indicated the presence of a ribosomal protein S4 in D. discoideum which mediates interactions with CDC24 and CDC42. The protein is speculated to be a functional analog of the yeast scaffolding protein Bem1. However, the exact structural and sequence features of the protein which can accommodate its non-ribosomal function as a scaffold mediating protein-protein interactions are not clearly understood. With the aid of structural modeling, a 3-D structure was generated for the C-terminal region of D. discoideum protein S4. The modeled structure, like the template used for modeling, resembled the fold of the SH3 domain, which has been shown to be involved in protein-protein interactions. Structural and sequence analyses were carried out to evaluate the potential mode by which interactions could be mediated by this protein.
The hypothesis generated was further corroborated by experimental analysis. Thus, both experimental and computational analyses provide evidence for the functional role of the ribosomal protein S4 from Dictyostelium discoideum as a scaffold. Chapter 9 summarizes the conclusions reached in the various chapters of the thesis. The thesis embodies analyses probing various aspects of functional interactions between proteins. A framework has been provided to elucidate functional interactions using tethered domain families in multidomain proteins. Further, the role of these functional interactions has been explored in different scenarios by exhaustively analyzing metabolic proteins and their regulation in the pathogenic organism Plasmodium falciparum and by analyzing two distinct types of transient protein-protein interactions.
|
408 |
Rational Structure-Based Rescaffolding Approach to De Novo Design of Interleukin 10 (IL-10) Receptor-1 Mimetics
Ruiz-Gómez, Gloria, Hawkins, John C., Philipp, Jenny, Künze, Georg, Wodtke, Robert, Löser, Reik, Fahmy, Karim, Pisabarro, M. Teresa 06 January 2017
Tackling protein interfaces with small molecules capable of modulating protein-protein interactions remains a challenge in structure-based ligand design. Particularly arduous are cases in which the epitopes involved in molecular recognition have a non-structured and discontinuous nature. Here, the basic strategy of translating continuous binding epitopes into mimetic scaffolds cannot be applied, and other innovative approaches are therefore required. We present a structure-based rational approach involving the use of a regular expression syntax, inspired by the well-established PROSITE, to define minimal descriptors of geometric and functional constraints signifying relevant functionalities for recognition in protein interfaces of non-continuous and unstructured nature. These descriptors feed a search engine that explores the currently available three-dimensional chemical space of the Protein Data Bank (PDB) in order to identify, in a straightforward manner, regular architectures containing the desired functionalities, which could be used as templates to guide the rational design of small natural-like scaffolds mimicking the targeted recognition site. The application of this rescaffolding strategy to the discovery of natural scaffolds incorporating a selection of functionalities of interleukin-10 receptor-1 (IL-10R1) that are relevant for its interaction with interleukin-10 (IL-10) has resulted in the de novo design of a new class of potent IL-10 peptidomimetic ligands.
|
410 |
Peptides-beta/gamma mixtes : nouveaux édifices foldamères pour mimer l'hélice-alpha / Beta/gamma-Peptide manifolds designed as alpha-helix mimetics
Grison, Claire 23 November 2015
This thesis is devoted to the synthesis and the structural characterisation of beta/gamma-peptides, constructed from beta- and gamma-amino acids in alternation, designed to mimic the alpha-helix secondary structure which is present in many native proteins. The alpha-helix can be defined as a 13-helix, and a bottom-up foldamer design strategy to target a 13-helical structure was examined, whereby beta/gamma-peptides were proposed in which (1S,2S)-trans-2-aminocyclobutanecarboxylic acid (trans-ACBC) was incorporated as a conformationally restricted beta-amino acid component. The scalable synthesis of enantiomerically pure trans-ACBC using a [2+2] photocycloaddition strategy was successfully optimized. beta/gamma-Peptides incorporating trans-ACBC and GABA, the latter being the gamma-amino acid component devoid of any constraint, were then synthesised. Experimental and theoretical investigations of their solution-state folding behaviour revealed an unprecedented 9/8-ribbon foldamer structure that adopts curved shapes governed by a combined configuration-conformation code. Additional constraints on the gamma-amino acid component were then considered, and beta/gamma-peptides incorporating trans-ACBC and gamma4-amino acids were synthesised. Experimental and theoretical investigations of these beta/gamma-peptides in solution unveiled a preference for 13-helix folding behaviour, which increased commensurately with the peptide chain length; robust 13-helices were stabilised by a minimum of five intramolecular hydrogen bonds.
In the last part of this thesis, molecular modelling was used to design helical alpha/beta/gamma-peptides intended to reproduce as closely as possible the hot-spot residues of the known alpha-helical peptide sequence p53(15-31). These peptides were synthesised and their predicted helical folding was verified experimentally, along with their resistance to proteolytic enzymes. The alpha/beta/gamma-peptides were tested as inhibitors of the p53/hDM2 interaction. One peptide was found to behave as a potent inhibitor and to bind to the native peptide binding pocket of the hDM2 protein, providing a successful proof of concept of the alpha-helix mimetic design strategy.
|