1

Large-Scale Structural Analysis of Protein-ligand Interactions : Exploring New Paradigms in Anti-Tubercular Drug Discovery

Anand, Praveen (January 2015)
Biological processes are governed through specific interactions of macromolecules. The three-dimensional structural information of the macromolecules is necessary to understand the basis of molecular recognition. A large number of protein structures have been determined at high resolution using experimental techniques such as X-ray crystallography, NMR and electron microscopy, and made publicly available through the Protein Data Bank (PDB). In recent years, comprehending function by studying a large number of related proteins has proved very fruitful for understanding their biological role and gaining mechanistic insights into molecular recognition. The availability of large-scale structural data has made the task of predicting protein function from three-dimensional structure feasible. Structural bioinformatics, a branch of bioinformatics, has evolved into a separate discipline to rationalize and classify the information present in three-dimensional structures and derive meaningful biological insights. This has provided a better understanding of biological processes at a higher resolution in several cases. Most structural bioinformatics approaches so far have focused on fold-level analysis of proteins and their relationship to sequences. It has long been recognized that sequence-fold and fold-function relationships are highly complex: information on one aspect cannot be readily extrapolated to the other. To a significant extent, this can be overcome by understanding similarities in proteins by comparing their binding site structures. In this thesis, the primary focus is on analyzing the small-molecule ligand binding sites in protein structures, as most biological processes, ranging from enzyme catalysis to complex signaling cascades, are mediated through protein-ligand interactions.
Moreover, given that the precise geometry and the chemical properties of the residues at the ligand binding sites dictate the molecular recognition capabilities, focusing on these sites at the structural level is likely to yield more direct insights into protein function. The study of binding sites at the structural level poses several problems, mainly because the residues at a site may be sequentially discontinuous but spatially proximal. Further, the order of the binding site residues in the primary sequence in most cases has no significance for ligand binding. Compounding these difficulties are additional factors such as non-uniform contributions to binding from different residues and size variations in binding sites, even across closely related proteins. As a result, methods available to study ligand-binding sites in proteins, especially on a large scale, are limited, warranting exploration of new approaches. In the present work, new methods and tools have been developed to address some of these challenges in binding site analysis. First, a novel tool for site-based function annotation of protein structures, called PocketAnnotate, was developed (http://proline.biochem.iisc.ernet.in/pocketannotate/). PocketAnnotate detects the putative binding sites in a given protein structure and compares them to known binding sites in PDB to derive functional annotation in terms of ligand association. Since the tool derives functional annotation at the level of binding sites, it has an advantage over other methods that solely utilize fold or sequence information. This becomes even more important for cases where there is no detectable homology with entries in existing databases, as PocketAnnotate does not depend on evolutionary information for annotation. Second, a web-accessible tool for in silico alanine-scanning mutation of binding site residues, called ABS-Scan, has been developed (http://proline.biochem.iisc.ernet.in/abscan/).
This tool helps in assessing the contribution of the individual residues of a binding site towards ligand recognition. All residues in a binding site are systematically mutated, one at a time, to alanine, and the ability of the corresponding mutant to bind a given ligand is analyzed. The contribution of each residue towards ligand binding is calculated through a ΔΔG value derived by comparing the binding affinity of the mutant to that of the wild-type protein-ligand complex. Third, a database called Protein-Ligand Interaction Clusters (PLIC) has been developed to identify and analyze similarities across binding sites in PDB, and is provided in the form of a web-accessible database ( http://proline.biochem.iisc.ernet/ PLIC). Protein-ligand interactions are primarily explored using three different computational approaches: (i) binding site characteristics, including pocket shape, nature of residues and interaction profiles with different kinds of chemical probes; (ii) atomic contacts between protein and ligands; and (iii) binding energetics of the interactions, derived from scoring functions developed for docking. Information on variations in these features, derived from different computational tools, is also included in the database to enable characterization of the binding sites. As a case study to demonstrate the usefulness of these tools, they have been applied to decipher the complexity of S-adenosyl methionine (SAM) interactions with proteins. Around 1,213 binding sites of SAM or SAM-like compounds could be extracted from the PLIC database. The SAM or SAM-like compounds were observed to interact with ∼18 different protein-fold types. The variations in protein-ligand contacts across fold types were analyzed, and the fold-specific interaction properties and the contribution of individual residues towards SAM binding were identified. The tools developed and example analyses using them are described in Chapter 2.
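The alanine-scanning idea described for ABS-Scan can be illustrated with a minimal sketch. It is not the ABS-Scan implementation: the scoring function, the per-residue contribution table and the residue names below are hypothetical stand-ins for a real docking or energy scorer. Each residue is replaced by alanine in turn, and the change in score relative to the wild type is reported.

```python
from typing import Callable, Dict, List, Tuple

Residue = Tuple[str, int]  # (three-letter residue name, residue number)


def alanine_scan(
    site: List[Residue],
    score: Callable[[List[Residue]], float],
) -> Dict[Residue, float]:
    """Return score(mutant) - score(wild type) for each residue mutated to Ala.

    A positive value means the mutant binds less favourably, i.e. the
    original residue contributed to binding.
    """
    wild_type = score(site)
    ddg: Dict[Residue, float] = {}
    for i, (name, number) in enumerate(site):
        if name == "ALA":
            continue  # mutating alanine to alanine changes nothing
        mutant = site[:i] + [("ALA", number)] + site[i + 1:]
        ddg[(name, number)] = score(mutant) - wild_type
    return ddg


# Hypothetical per-residue contributions (arbitrary energy units, assumption).
TOY_CONTRIBUTION = {"ASP": -2.0, "TYR": -1.5, "LEU": -0.5, "ALA": -0.1}


def toy_score(site: List[Residue]) -> float:
    """Toy additive scorer; a real tool would re-score the mutated complex."""
    return sum(TOY_CONTRIBUTION.get(name, 0.0) for name, _ in site)


site = [("ASP", 57), ("TYR", 102), ("LEU", 140), ("ALA", 141)]
result = alanine_scan(site, toy_score)
most_important = max(result, key=result.get)
```

With the toy scorer, the residue whose replacement costs the most binding score is ranked as the most important contributor; any real scoring function with the same signature could be passed in instead.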
Chapter 3 describes a large-scale pocketome analysis of structural complexes in PDB, in an effort to characterize the known pocket space of protein-ligand interactions. The tools developed as described in Chapter 2 are used for this. A set of 84,846 binding sites compiled from PDB has been comprehensively analyzed with the objective of obtaining (a) a classification of binding sites, (b) sequence-fold-site relationships among proteins, (c) a minimal set of physicochemical attributes sufficient to explain ligand recognition specificity and (d) site-type specific signatures in terms of physicochemical features. A new method to describe binding sites was developed in the form of BScIds, such that the structural fold information is well captured. Binding sites and similarities among them were abstracted in the form of networks, where each node represents a binding site and an edge between two nodes represents significant similarity between the sites at the structural level. Pocketome networks were constructed from the large-scale information on protein-ligand interactions in the PLIC database. The large pocketome network was then studied to derive relationships between protein folds and the chemical entities they interact with. A classification of the binding pockets was achieved by analyzing the pocketome network using graph theoretical approaches combined with clustering methods. From the network, 10,858 clusters were identified, each indicating a site-type; the known pocket space thus comprises about 10,858 site-types. Classification of ligand associations into specific site-types helps greatly in resolving the complex relationships by yielding specific site-type ligand associations. The observed classification was further probed to understand the basis of ligand recognition by representing the pockets through feature vectors.
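The network abstraction above (nodes are binding sites, edges mark significant similarity, clusters are site-types) can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis method: Jaccard similarity over made-up feature sets stands in for the structural similarity score, the 0.5 threshold is arbitrary, and plain connected components stand in for the graph-theoretical clustering that the abstract does not specify.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two feature sets (stand-in similarity score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def site_types(sites: Dict[str, Set[str]], threshold: float) -> List[Set[str]]:
    """Group sites into connected components of the similarity network."""
    parent = {name: name for name in sites}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # An edge exists when similarity reaches the (assumed) threshold.
    for a, b in combinations(sites, 2):
        if jaccard(sites[a], sites[b]) >= threshold:
            parent[find(a)] = find(b)

    components: Dict[str, Set[str]] = {}
    for name in sites:
        components.setdefault(find(name), set()).add(name)
    return sorted(components.values(), key=sorted)


# Hypothetical sites described by made-up feature labels.
toy_sites = {
    "siteA": {"hydrophobic", "aromatic", "donor"},
    "siteB": {"hydrophobic", "aromatic", "acceptor"},
    "siteC": {"charged", "donor"},
    "siteD": {"charged", "donor", "metal"},
}
clusters = site_types(toy_sites, threshold=0.5)
```

Connected components are the coarsest possible grouping; a denser real network would normally need a community-detection or hierarchical clustering step on top, which is where the choice of method changes the number of site-types obtained.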
These features capture a wide range of physicochemical properties that can be used to derive site-type specific signatures and explore the pocket-space of protein-ligand interactions. A principal component analysis of these features reveals that binding site feature space is continuous in the entire PDB and minor changes in specific features can give rise to significant differences in ligand specificity, consequently defining their distinct functional roles. The weights were also derived for these features through the use of different information theoretic approaches to explain the multiple-specificity of protein-ligand interactions. Analysis of binding sites arising from contribution of residues from different protein fold-types revealed increasing diversity of physicochemical properties at the site, supporting the hypothesis that combination of folds could give rise to new binding sites. Given that a finer appreciation of the molecular mechanisms within the cell is possible only with the structural information, the next objective was to explore if a structural view of an entire proteome can be obtained and if a pocketome could be constructed and analyzed. With this in mind, the causative agent of tuberculosis - Mycobacterium tuberculosis (Mtb) was chosen. Mtb is also being studied in the laboratory from a systems biology perspective, which enabled exploration of how systems and the structural perspectives could be combined and applied for drug discovery. Chapters 4 to 6 describe this effort. The genome sequence of Mycobacterium tuberculosis (Mtb) H37Rv, indicates the presence of ∼4,000 protein coding genes, of which experimentally determined structures are available for ∼300 proteins. Further, advances in homology modeling methods have made it feasible to obtain structural models for many more proteins in the proteome. 
Chapter 4 describes the efforts to obtain the Mtb structural proteome, through which three-dimensional structures were derived for ∼70% of the proteins in the genome. Functional annotation of each protein was derived from fold-based functional assignments, binding-site comparisons and consequent ligand associations. PocketAnnotate, the site-based function annotation pipeline described in Chapter 2, was utilized for this purpose. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides a unique opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 unique folds. New insights into the folds that predominate in the genome, as well as the fold combinations that make up multi-domain proteins, are also obtained. 1,728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses, yielding a list of ligands that can participate in various biochemical events in the mycobacterial cell. A web-accessible database, MtbStructuralproteome, has been developed to make the data and the analyses available to the community (http://proline.physics.iisc.ernet.in/Tbstructuralannotation). The resource, being one of the first to be based on structure-based functional annotations at a genome scale, is expected to be useful for a better understanding of tuberculosis and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well. Chapter 5 describes the characterization of the Mtb pocketome.
For the structural models of the Mtb proteome described in Chapter 4, a genome-scale binding site prediction exercise was carried out using three different computational methods, from which consensus predictions were subsequently obtained. The three methods were independent and were based on geometry, inter-molecular energies with probes and sequence conservation in evolutionarily related proteins, respectively. In all, 13,858 consensus binding pockets were predicted in 2,877 proteins. The pocket space within Mtb was then explored through systematic all-pair comparisons of binding sites. The number of site-types within Mtb was found to be 6,584, as compared to the ∼400 structural folds and 1,831 unique sequence families. This reveals that the pocket space is larger than the sequence or fold space, suggesting that variations at the site level contribute significantly to the functional repertoire of the organism. By comparing the pockets with the PDB sites enclosing known ligands, around 6,906 binding sites were observed to exhibit significant similarity, across the entire pocket, to at least one known binding site in PDB. 1,213 metabolites could be mapped onto 665 enzymes covering most of the metabolic pathways. The identified ligands serve as a predicted metabolome for unit abundances of the proteins. A list of proteins containing unique pockets is also identified. The binding pockets, the similarities they share within Mtb and the ligands mapped onto them are all made available in a web-accessible database at http://proline.biochem.iisc.ernet.in/mtbpocketome/. The availability of structural information on the pocketome at a genome scale opens up several opportunities in drug discovery. It can be directly applied to understanding the mechanism of drug action and predicting the adverse effects and pharmacodynamics of a drug. Moreover, it enables exploration of new ideas in drug discovery.
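One simple way to combine three independent pocket predictors into a consensus, as in the prediction exercise described above, is a residue-level vote. The sketch below is an assumption for illustration only: the abstract does not state how the consensus was formed, and the method names, residue numbers and two-of-three rule are all hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def consensus_residues(
    predictions: Dict[str, Set[int]],
    min_votes: int = 2,
) -> Set[int]:
    """Keep residues predicted as pocket-lining by at least `min_votes` methods."""
    votes = Counter(
        residue for residues in predictions.values() for residue in residues
    )
    return {residue for residue, count in votes.items() if count >= min_votes}


# Hypothetical pocket-lining residues from three kinds of predictor.
predictions = {
    "geometry": {10, 11, 12, 45, 46},
    "probe_energy": {11, 12, 13, 45},
    "conservation": {12, 45, 46, 90},
}
majority = consensus_residues(predictions, min_votes=2)
unanimous = consensus_residues(predictions, min_votes=3)
```

Raising `min_votes` trades coverage for confidence: the unanimous set is smaller but less likely to contain residues flagged by a single method's artefacts.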
Polypharmacology is a new concept that aims at modulating multiple drug targets through a single chemical entity. Currently, there are no established approaches to either select appropriate target sets or design polypharmacological drugs. In this study, a structural-proteomics approach is explored to first characterize the pocketome and then utilize it to identify similar binding sites. The knowledge of similarity relationships between the binding sites within the genome can be used to identify possible polypharmacological drug targets. A pocket-similarity-based clustering of binding site residues resulted in the identification of binding site sets, each having a theoretical potential to interact with a common ligand. A polypharmacological index was formulated to rank targets by incorporating a measure of druggability and similarity to other pockets within the proteome. By comparing with known drug binding sites from databases such as DrugBank, the study has yielded a ready shortlist of sets of promising drug targets with polypharmacological possibilities. At the same time, it has identified possible drug candidates, either directly for repurposing or at least as significant lead clues that can be used to design new drug molecules against the entire group of proteins in each set. This analysis presents a rational approach to identify targets with polypharmacological potential, clues about lead compounds and a list of candidates for drug repurposing. This thesis demonstrates the feasibility of utilizing structural bioinformatics approaches at a genome scale. The tools developed for analyzing large-scale data on protein-ligand interactions could be applied to characterize the pocket space of protein-ligand interactions. The network theory approaches applied in this work make large-scale data tractable and enable binding-site typing.
The binding site analysis at a genome scale for Mtb is the first of its kind and has provided novel insights into the pocket space. It also provided an opportunity to rationalize polypharmacological target selection and explore drugs for repurposing in TB. In the larger context, structural modelling of a proteome, mapping the small-molecule binding space in it and understanding the determinants of small-molecule recognition together form a major step in defining a proteome at higher resolution. This in turn will serve as a valuable input to the emerging field of structural-systems biology, which seeks to understand biological models at a systems level without compromising on the resolution of the study.
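The abstract states that targets were ranked by an index combining druggability with similarity to other pockets in the proteome, but it does not give the formula. The sketch below is therefore purely hypothetical: it multiplies an assumed druggability score in [0, 1] by the fraction of other pockets judged similar, and every name and number in it is made up for illustration.

```python
from typing import Dict, List, Tuple


def polypharmacology_index(druggability: float, n_similar: int, n_total: int) -> float:
    """Hypothetical index: druggability times the fraction of other pockets
    in the proteome that are similar to this one (assumed form, not the
    formula used in the thesis)."""
    if n_total <= 1:
        return 0.0  # no other pockets to compare against
    return druggability * n_similar / (n_total - 1)


def rank_targets(
    pockets: Dict[str, Tuple[float, int]],
    n_total: int,
) -> List[Tuple[str, float]]:
    """Rank pockets, given (druggability, number of similar pockets) each."""
    scored = [
        (name, polypharmacology_index(drug, similar, n_total))
        for name, (drug, similar) in pockets.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up values: (druggability score, count of similar pockets).
toy_pockets = {
    "pocketA": (0.9, 2),
    "pocketB": (0.5, 8),
    "pocketC": (0.8, 0),
}
ranking = rank_targets(toy_pockets, n_total=11)
ranked_names = [name for name, _ in ranking]
```

In this toy form, a highly druggable pocket with no similar partners scores zero, which reflects the stated goal of finding target sets that a single compound could engage rather than the best single target.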
2

Structural and Biochemical Analysis of DNA Processing Protein A (DprA) from Helicobacter Pylori

Dwivedi, Gajendradhar R (January 2014)
H. pylori has a panmictic population structure due to high genetic diversity. The homoplasy index for H. pylori is 0.85 (where 0 represents a completely clonal organism and 1.0 indicates a freely recombining organism), which is much higher than the homoplasy index for E. coli (0.26) or the naturally competent Neisseria meningitidis (0.34). It undergoes both inter- and intra-strain transformation. Intergenomic recombination is subject to strain-specific restriction in H. pylori. Hence, a high homoplasy index means that competence predominates over restriction in H. pylori. Annotation of the genomes of H. pylori strains 26695 and J99 shows the presence of nearly two dozen R-M systems, of which 16 were postulated to be Type II for J99. H. pylori has been described as an ideal model system for understanding the equilibrium between the competing tensions of genomic integrity and diversity (42). R-M systems allow some degree of sexual isolation in a population of competent cells by acting as a barrier to transformation. The mixed colonizing population of H. pylori has a polyploid nature, where each H. pylori strain adds to the ‘ploidy’ of the colonizing population. Maintenance of the polyploid nature of a mixed colonizing population in the selective niche of the stomach needs a barrier to free gene flow. The restriction barrier maintains the polyploid nature of the H. pylori population, which is considered yet another form of genetic diversity that helps in the persistence of infection. Thus, according to the model proposed by Kang and Blaser, in which H. pylori are considered as perfect gas-like bacterial populations, transformation and restriction both add to the genetic diversity of the organism. Again, restriction barriers are not completely effective, which could be due to cellular regulation of the restriction system. Thus, a perfect balance between restriction and transformation in turn regulates the gene flow to equilibrate competition and cooperation between various H. pylori strains in a mixed population.
RecA, DprA and DprB have been shown to be involved in the presynaptic pathway for recombination substrates brought in through the Com system. Biochemical characterization of HpDprA during this study revealed its ability to bind to ssDNA and dsDNA. Binding of HpDprA to both ssDNA and dsDNA results in a large nucleoprotein complex that does not enter the native PAGE gel. However, DNA trapped in the wells could be released by the addition of excess competitor DNA, illustrating that the complexes are formed reversibly and do not represent dead-end reaction products. Transmission electron microscopy of the SpDprA interaction with ssDNA established that a large nucleoprotein complex, consisting of a network of several DNA molecules bridged by DprA, is formed and is retained in the well. A large DNA-protein complex that sits in the well has also been observed with other DNA binding proteins such as RecA. It has been observed that ssDNA binding proteins (SSB) bind non-specifically to dsDNA under low salt conditions (20 mM NaCl) in the absence of Mg²⁺. The non-specific binding of SSB to dsDNA was prevented under high salt conditions (200 mM NaCl) or in the presence of Mg²⁺. The HpDprA interaction with both ssDNA and dsDNA was stable under high salt conditions (200 mM NaCl) and in the presence of Mg²⁺, indicating that these interactions are specific. The interaction of HpDprA with dsDNA is significant since dsDNA plays an important role in natural transformation of H. pylori. The pathway of transformation by dsDNA is highly facilitated (nearly 1000-fold) as compared to ssDNA. However, dsDNA is a preferred substrate for REases, which are a barrier to horizontal gene transfer. This implies that the decision of ‘restriction’ or ‘facilitation for recombination’ of incoming DNA might be taken before the conversion of dsDNA into ssDNA.
The incoming DNA has been shown to be in double-stranded form in the periplasm and in single-stranded form in the cytoplasm. Hence, the temporal and spatial events surrounding endonuclease cleavage remain to be understood. Taken together, these results suggest a very important role for dsDNA in natural transformation in H. pylori. Hence, binding and protection of dsDNA by HpDprA is possibly of crucial importance to the success of the natural transformation process of the organism. DprA is characterized by the presence of a conserved DNA binding domain. The DNA binding domain adopts a Rossmann fold-like topology spanning most of the protein. The Rossmann fold consists of alternating alpha helices and beta strands in the topological order β-α-β-α-β. It generally binds a dinucleotide as a pair of folds, since a single Rossmann fold can bind only a mononucleotide. All homologous DprA proteins characterized to date show that, in addition to the prominent Rossmann fold domain, they contain one or more smaller domains. RpDprA contains two domains other than the Rossmann fold domain, i.e., an N-terminal SAM (sterile alpha motif) domain and a C-terminal DML-1-like domain. SpDprA contains an N-terminal SAM domain in addition to the Rossmann fold domain. While the main function of the Rossmann fold is to bind DNA, the supplementary domains are highly variable in sequence and function. For example, the SAM domain in S. pneumoniae plays a key role in shut-off of competence by directly interacting with ComE~P. HpDprA consists of an N-terminal Rossmann fold domain (NTD) and a C-terminal DML-1-like domain (CTD). Both these domains are found to be predominantly α-helical in nature. Amino acid sequence analysis of the protein suggests that the NTD is basic and the CTD is acidic in nature. The NTD is sufficient for binding ssDNA and dsDNA, while the CTD plays an important role in the formation of higher-order polymeric complexes with DNA. For HpDprA and SpDprA, the dimerization site was mapped to the Rossmann fold domain.
Gel filtration data revealed the important observation that HpDprA can exist as a monomer (the dominant species at lower concentration) as well as a dimer (the dominant species at higher concentration) in solution. However, the exchange between these two forms is very fast, resulting in a single elution peak. Since HpDprA binds to DNA in dimeric form, the dimer species will be favoured in the presence of DNA. Hence, even at lower concentrations HpDprA will be mainly a dimer in the presence of DNA. Interestingly, both domains of HpDprA, i.e., NTD and CTD, were able to form dimers but no higher oligomeric form. On the other hand, HpDprA was seen to form oligomers higher than a dimer in a glutaraldehyde cross-linking assay. The strength of the CTD dimer was much lower than that of the NTD dimer; it could therefore be proposed that there are two sites of interaction present in HpDprA: a primary interaction site (N-N interaction) and a secondary interaction site (C-C interaction). The N-N interaction is responsible for dimer formation, but further oligomerization of HpDprA necessitates the interaction of two dimers using the C-C interaction site. It was shown that NTD binds to ssDNA but forms a lower molecular weight complex. SPR analysis of the DprA-DNA and NTD-DNA interactions showed that deletion of the CTD leads to faster dissociation of the protein from DNA. Concomitantly, a reduction in binding affinity was observed for both ssDNA and dsDNA upon deletion of the CTD from the full-length protein. These results suggest that the CTD does play an important role in the interaction of full-length HpDprA with DNA. Two possible roles of the CTD were proposed by Wang et al. (2014) to explain their observation of the formation of a lower molecular weight complex in the absence of the CTD: (i) the CTD possesses a second DNA binding site, but one much weaker than the site present in the NTD; or (ii) the CTD is not involved in DNA binding but mediates nucleoprotein complex formation through protein-protein interaction.
EMSA and SPR analysis with purified CTD protein confirmed that there is no secondary DNA binding site present in the CTD. As discussed above, it was observed that the CTD can mediate interaction between two HpDprA molecules through the C-C interaction. Since this interaction is weaker, it is less likely to be responsible for dimer formation, but in a trimer or higher oligomeric form of HpDprA, the presence of the N-N interaction will facilitate and stabilize the C-C interaction. These observations together bring forward an interesting model for the HpDprA-DNA interaction. HpDprA forms a dimer through the N-N interaction (favourably in the presence of DNA), and many HpDprA dimers bind to DNA owing to their high affinity and the sequence-independent nature of binding. These dimers interact with each other through the C-C interaction, resulting in a higher molecular weight nucleoprotein complex. HpDprA-DNA complex formation is slower than NTD-DNA complex formation, but the former complex is more stable (Fig. 2). According to the model proposed above, there are two binding events (DNA-protein and protein-protein) in HpDprA-DNA complex formation, and hence it would take longer than NTD-DNA complex formation, which involves only one binding event. But the resulting higher-order HpDprA-DNA complex would be much more stable. The NTD is able to offer equally efficient protection from nucleases to ssDNA and dsDNA (Fig. 7). This shows that the NTD alone is sufficient to completely coat a single DNA molecule. AFM images confirm the difference in binding pattern between full-length HpDprA and the NTD. As can be seen in Fig. 8F, the NTD binds a DNA molecule by occupying all the available space but forms nucleoprotein filaments isolated from each other. In contrast to full-length HpDprA, which forms tightly packed, condensed, extensively cross-linked polynucleoprotein complexes, the NTD forms much thinner complexes with DNA.
In the electron micrographs of the SpDprA-DNA complex, extensive cross-filament interaction was observed, resulting in a dense molecular aggregate. Similar kinds of complexes with DNA were also observed for Bacillus subtilis DprA in atomic force microscope images. Thus, it could be proposed that HpDprA binds to a single DNA molecule (single-stranded or double-stranded) mainly as a dimer formed through the N-N interaction. Multiple such individual nucleoprotein filaments come together and interact with each other through the C-C interaction, resulting in a dense and intricate polynucleoprotein complex. HpDprA is proposed to undergo a conformational change from a closed state to an open state in the presence of ssDNA. In agreement with this, a structural transition (resulting in a reduction in the α-helicity of the protein) was observed in the presence of ssDNA. Similar structural transitions were observed for dsDNA, indicating a possibly common mode of interaction for both forms of DNA. Further, mutation of the residues shown by crystallographic data to be involved in binding ssDNA resulted in a decrease in binding affinity for dsDNA as well. The fold reduction in binding affinity for dsDNA was lower than that for ssDNA, even though the same positively charged pocket that is primarily involved in the ssDNA interaction is evidently also responsible (at least partially) for binding dsDNA. However, the residues crucial for interaction with these two forms of DNA may be different. Both DprA and R-M systems have been shown to have a presynaptic role in the natural transformation process. While DprA has a protective role, R-M systems have an inhibitory role for incoming DNA, suggesting a functional interaction between them. Results of this study show that HpDprA interacts with dsDNA, inhibits Type II restriction enzymes from acting on it and at the same time stimulates the activity of MTases, resulting in increased methylation of bound DNA.
This observation is significant from the point of view of genetic diversity, as the only way a bacterial cell discriminates between self and non-self DNA is through the pattern of methylation. Binding of HpDprA to incoming DNA inhibits access by restriction endonucleases but not by methyltransferases. As a result, the DNA will be methylated with the same pattern as that of the host cell. Hence, it no longer remains a substrate for restriction enzymes. HpDprA thus effectively alleviates the restriction barrier. However, it remains to be understood how DNA in complex with HpDprA, while not accessible to REases or other cellular nucleases, is accessible to an MTase. A possible explanation could be that HpDprA interacts with the MTase and recruits it onto DNA. It has been shown that there is an overlap between the DprA dimerization and RecA interaction interfaces, and that in the presence of RecA, the DprA-DprA homodimer is replaced with a DprA-RecA heterodimer, allowing RecA nucleation and polymerization on DNA followed by homology search and synapsis with the chromosome. A similar scenario can be envisaged for the interaction of HpDprA with the MTase. R-M systems play an important role in protecting genomic DNA from bacteriophage DNA. Hence, downregulation of the restriction barrier by HpDprA may not be desirable for the host during its entire life cycle. Therefore, it is noteworthy that the expression of HpDprA is ComK dependent and takes place only when competence is achieved. In H. pylori, DNA damage induces genetic exchange via natural competence. Direct DNA damage leads to a significant increase in intergenomic recombination. Taken together, it can be proposed that when genetic competence is induced, R-M systems are downregulated to allow increased genetic exchange, thus increasing adaptive capacity in the selective environment of the stomach. There is an evolutionary arms race between bacterial genomes and invading DNA molecules.
R-M systems and anti-restriction systems have co-evolved to maintain an evolutionary balance between prey and predator. Phages and plasmids employ anti-restriction strategies to avoid the restriction barrier by (a) DNA sequence alteration, (b) transient occlusion of restriction sites and (c) subversion of restriction-modification activities. DNA binding proteins have been shown to bind and occlude restriction sites. On the other hand, the λ Ral protein alleviates restriction by stimulating the activity of Type IA MTases. The MTase stimulation and occlusion of restriction sites by HpDprA observed here appear analogous to the anti-restriction strategies otherwise employed by bacteriophages. Thus, DprA could be a unique bacterial anti-restriction protein used by H. pylori for downregulating its own R-M systems to maintain the balance between fidelity and diversity. In conclusion, HpDprA has the unique ability to bind to dsDNA in addition to ssDNA but displays higher affinity towards ssDNA. Binding of HpDprA to DNA results in a compact complex that is inert to the activity of nucleases. A novel site of oligomerization for HpDprA was observed, which suggests a role for the C-C interaction in inter-nucleoprotein filament interaction. It would be interesting to further study the effects of CTD deletion on the transformation efficiency of H. pylori, to understand these mechanisms better. It has been well demonstrated that R-M systems offer a barrier to incoming DNA, but our understanding of the regulation of R-M systems has been poor. While other factors, such as regulation of the cellular concentration of restriction enzymes and conversion of dsDNA into ssDNA, might play crucial roles in striking the perfect balance between genome diversity and integrity, one of the factors that regulate R-M systems could be DprA.
