1

Machine Learning and Optimization Algorithms for Intra- and Intermolecular Interaction Prediction

Roche, Rahmatullah 30 July 2024
Computational prediction of intra- and intermolecular interactions, specifically intra-protein residue-residue interactions and the interaction sites between proteins and other macromolecules, is critical for understanding numerous biological processes. The existing methods fall short in estimating the quality of intra-protein interactions. Moreover, the methods for predicting intermolecular interactions fail to harness some of the latest technological advancements, such as pretrained protein and RNA language models, and struggle to effectively integrate predicted structural information, thus limiting their predictive modeling accuracy. Hence, my objectives include (1) the development of computational methods for protein structure modeling through the estimation of intra-protein interactions, (2) the development of computational methods for predicting protein-protein interaction sites leveraging the latest deep learning architectures and predicted structural information, and (3) extending the scope beyond protein-protein interactions to develop novel computational methods to predict protein-nucleic acid interactions informed by protein and RNA language models. The major benefits of achieving these objectives for the broader scientific community are the following: (1) intra-protein interaction estimation methods have the potential to enhance the accuracy of protein structure modeling, and (2) the methods for predicting protein-protein and protein-nucleic acid interactions will deepen our understanding of biomolecular interactions in the cell, even when experimentally determined molecular structures are not available. / Doctor of Philosophy / My research focuses on developing accurate computational predictive modeling methods centered on biomolecular interactions. Proteins, one of the most important biomolecules, fold into stable three-dimensional forms to perform specific tasks in the cell.
Recognizing the importance of this information in the absence of ground truth three-dimensional structures, I developed computational methods to predict the folded three-dimensional structures of proteins using intra-protein atomic interactions, and to estimate the quality of those interactions. Since proteins not only fold themselves but also interact with other proteins and biomolecules such as nucleic acids, which is crucial for many biological processes, I expanded my research from intra-protein interactions to predicting interactions between proteins and other molecules. In particular, using advanced computational techniques, I developed methods for predicting protein-protein and protein-nucleic acid interactions. The research outcomes not only outperform existing state-of-the-art computational methods by overcoming their limitations but also have potential applications in designing effective therapies and combating diseases, ultimately improving the health sector through their large-scale predictability. All the scientific tools resulting from the research are publicly available, facilitating knowledge sharing and collaboration within and beyond the scientific community.
2

An effective layered workflow of virtual screening for identification of active ligands of challenging protein targets

Folly da Silva Constantino, Laura 01 August 2017
Docking is a computer simulation method used to predict the preferred orientation of two interacting chemical species that has been successfully applied to numerous macromolecules over the years. However, non-traditional targets have inherent difficulties associated with their screening. Large interfaces, lack of obvious binding sites, and transient pockets are some examples. Additionally, most natural ligands of challenging targets are inadequate models for identifying or designing new ligands. Therefore, it is not surprising that customary techniques of structure-based virtual screening are incompatible with these non-traditional targets. We hypothesized that an integrative virtual screening campaign comprised of docking followed by refinement of best receptor–ligand complexes would effectively identify small-molecule ligands of challenging receptors. We targeted the single-stranded DNA (ssDNA) binding groove of the human RAD52, and a cryptic allosteric pocket of the Helicobacter pylori Glutamate Racemase (GR). In this project, we first determined which docking method was more appropriate for each studied non-traditional target, and then examined how good our two-step docking workflow was in finding novel active ligand scaffolds. This research developed a powerful layered virtual screening workflow for the discovery of lead compounds against challenging protein targets. Furthermore, we successfully applied a statistical analysis method, which used receiver operating characteristic (ROC) curves, to validate the selected docking protocol that would be used in the screening campaigns. Using the validated workflow, we identified a natural compound that competes with ssDNA to bind to RAD52. The performed screening campaigns also provided new insights into the studied binding pockets, as well as structure-activity relationships (SAR) and binding determinants of the ligands. 
Our achievements reinforce the power of the ROC curve analysis approach in directing the search for the most appropriate docking protocol and helping to speed up drug discovery in pharmaceutical research.
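
The ROC-based validation described in this abstract can be illustrated with a minimal sketch. It is not the thesis's protocol: the docking scores below are invented for illustration, and the code assumes that a lower (more negative) score means a better-ranked ligand. The area under the ROC curve is computed as the probability that a randomly chosen known active is ranked ahead of a randomly chosen decoy.

```python
def roc_auc(active_scores, decoy_scores):
    """Area under the ROC curve for a docking protocol.

    Computed as the fraction of (active, decoy) pairs in which the active
    is ranked ahead of the decoy; ties count as half a win.
    Assumption: lower (more negative) docking score = better rank.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical scores, made up for this example only.
actives = [-9.1, -8.4, -7.9, -6.0]
decoys = [-8.0, -6.5, -6.2, -5.8, -5.1]

auc = roc_auc(actives, decoys)
print(f"ROC AUC = {auc:.2f}")
```

An AUC near 0.5 would indicate a protocol that ranks actives no better than chance, while values approaching 1.0 indicate that it separates known actives from decoys well, which is the kind of evidence one would want before committing to a screening campaign.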
3

Exploring Protein-Nucleic Acid Interactions Using Graph And Network Approaches

Sathyapriya, R 03 1900
The flow of genetic information from genes to proteins is mediated through proteins which interact with the nucleic acids at several stages to successfully transmit the information from the nucleus to the cell cytoplasm. Unlike in the case of protein-protein interactions, the principles behind protein-nucleic acid interactions are still not very well understood (Pabo and Nekludova, 2000) and efforts are still underway to arrive at the basic principles behind the specific recognition of nucleic acids by proteins (Prabakaran et al., 2006). This is mainly due to the innate complexity involved in recognition of nucleotides by proteins, where, even within a given family of DNA binding proteins, different modes of binding and recognition strategies are employed to suit their function (Luscombe et al., 2000). Such difficulties have also prevented a thorough classification of DNA/RNA binding proteins based on the mode of interaction as well as the specificity of recognition of the nucleotides. The availability of a large number of structures of protein-nucleic acid complexes (albeit fewer than the number of protein structures present in the PDB) in the past few decades has provided the knowledge base for understanding the details behind their molecular mechanisms (Berman et al., 1992). Previously, studies have been carried out to characterize these interactions by analyzing specific non-covalent interactions such as hydrogen bonds, van der Waals, and hydrophobic interactions between a given amino acid and the nucleic acid (DNA, RNA) in a pair-wise manner, or through the analysis of interface areas of the protein-nucleic acid complexes (Nadassy et al., 1998; Jones et al., 1999). Though the studies have deciphered the common pairing preferences of a particular amino acid with a given nucleotide of DNA or RNA, there is little room for understanding these specificities in the context of spatial interactions at a global level from the protein-nucleic acid complexes.
The representation of the amino acids and the nucleotides as components of graphs, and trying to explore the nature of the interactions at a level higher than exploring the individual pair-wise interactions, could provide greater details about the nature of these interactions and their specificity. This thesis reports the study of protein-nucleic acid interactions using graph and network based approaches. The evaluation of the parameters for characterizing protein-nucleic acid graphs has been carried out for the first time and these parameters have been successfully employed to capture biologically important non-covalent interactions as clusters of interacting amino acids and nucleotides from different protein-DNA and protein-RNA complexes. Graph and network based approaches are well established in the field of protein structure analysis for analyzing protein structure, stability and function (Kannan and Vishveshwara, 1999; Brinda and Vishveshwara, 2005). However, the use of graph and network principles for analyzing structures of protein-nucleic acid complexes has so far not been accomplished and is reported for the first time in this thesis. The matter embodied in the thesis is presented as ten chapters. Chapter 1 lays the foundation for the study, surveying relevant literature from the field. Chapter 2 describes in detail the methods used in constructing graphs and networks from protein-nucleic acid complexes. Initially, only protein structure graphs and networks are constructed from proteins known to interact with specific DNA or RNA, and inferences with regard to nucleic acid binding and recognition were indirectly obtained. Subsequently, parameters were evaluated for representing both the interacting amino acids and the nucleotides as components of graphs and a direct evaluation of protein-DNA and protein-RNA interactions as graphs has been carried out.
Chapters 3 and 4 discuss the graph and network approaches applied to proteins from a dataset of DNA binding proteins complexed with DNA. In chapter 3, the protein structure graphs were constructed on the basis of the non-covalent interactions existing between the side chains of amino acids. Clusters of interacting side chains from the graphs were obtained using the graph spectral method. The clusters from the protein-DNA interface were analyzed in detail for the interaction geometry and biological importance (Sathyapriya and Vishveshwara, 2004). Chapter 4 also uses the same dataset of DNA binding proteins, but a network-based approach is presented. From the analysis of the protein structure networks from these DNA binding proteins, interesting observations emerged relating the presence of highly connected nodes (or hubs) of the network to functionally important amino acids in the structure. A comparison between the hubs identified from the protein-protein and the protein-DNA interfaces in terms of their amino acid composition and their connectivity is also presented (Sathyapriya and Vishveshwara, 2006). Chapters 5 and 6 deal with the graph and network applications to a specific system of protein-RNA complex (aminoacyl-tRNA synthetases) to gain insights into their interface biology based on amino acid connectivity. Chapter 5 deals with a dataset of aminoacyl-tRNA synthetase (aaRS) complexes obtained with various ligands like ATP, tRNA and L-amino acids. A graph based identification of side chain clusters from these ligand-bound aaRS structures has highlighted important features of ligand-binding at the catalytic sites of the two structurally different classes of aaRS (Class I and Class II). Side chain clusters from other regions of aaRS such as the anticodon binding region and the ligand-activation sites are discussed.
In chapter 6, a network approach is applied to a specific aaRS system, E. coli Glutaminyl-tRNA synthetase (GlnRS) complexed with its ligands, to understand the effects of binding different ligands. The structure networks of E. coli GlnRS in the ligand-free and different ligand-bound states are constructed. The ligand-free and the ligand-bound complexes are compared by analyzing their network properties and the presence of hubs to understand the effect of ligand-binding. These properties have elegantly captured the effects of ligand-binding to the GlnRS structure and have also provided an alternate method for comparing three dimensional structures of proteins in different ligand-bound states (Sathyapriya and Vishveshwara, 2007). In contrast to protein structure graphs (PSG), both the interacting amino acids and nucleotides (DNA/RNA) form the components of the protein-nucleic acid graphs (PNG) from protein-nucleic acid complexes. These graphs are constructed based on the non-covalent interactions existing between the side chains of the amino acids and nucleotides. After representing the interacting nucleotides and amino acids as graphs, clusters of the interacting components are identified. These clusters are the strongly interacting amino acids and nucleotides from the protein-nucleic acid complexes. These clusters can be generated at different strengths of interaction between the amino acid side chain and the nucleotide (measured in terms of its atomic connectivity) and can be used for detecting clusters of non-specific as well as specific interactions of amino acids and nucleotides. Though the methodology of graph construction and cluster identification is given in chapter 2, the details of the parameters evaluated for constructing PNG are given in chapter 7. Unlike the previous chapters, the succeeding chapters deal exclusively with results that are obtained from the analyses of PNG.
Two examples of obtaining clusters from a PNG are given, one each for a protein-DNA and a protein-RNA complex. In the first example, a nucleosome core particle is subjected to the graph based analysis and different clusters of amino acids with different regions of the DNA chain such as phosphate, deoxyribose sugar and the base are identified. Another example of aminoacyl-tRNA synthetase complexed with its cognate tRNA is used to illustrate the method with a protein-RNA complex. Further, the method of constructing and analyzing protein-nucleic acid graphs has been applied to the macromolecular machinery of the pre-translocation complex of the T. thermophilus 70S ribosome. Chapter 8 deals exclusively with the results identified from the analysis of this magnificent macromolecular ensemble. The availability of the method that can handle interactions between both amino acids and the nucleotides of the protein-nucleic acid complexes has given us the basis for evaluating these interactions at a level higher than that of analyzing pair-wise interactions. A study on the evaluation of short hydrogen bonds (SHB) in proteins, which does not fall under the realm of the main objective of the thesis, is discussed in Chapter 9. The short hydrogen bonds, defined by the geometrical distance and angle parameters, are identified from a non-redundant dataset of proteins. The insights into their occurrence, amino acid composition and secondary structural preferences are discussed. The SHB are present in distinct regions of protein three-dimensional structures, such that they mediate specific geometrical constraints that are necessary for stability of the structure (Sathyapriya and Vishveshwara, 2005). The significant conclusions of various studies carried out are summarized in the last chapter (Chapter 10). In conclusion, this thesis reports the analyses performed with protein-nucleic acid complexes using graph and network based methods.
The parameters necessary for representing both amino acids and the nucleotides as components of a graph are evaluated for the first time and can be used subsequently for other analyses. More importantly, the use of graph-based methods has resulted in considering the interaction between the amino acids and the nucleotides at a global level with respect to the topology of the protein-nucleic acid complexes. Such studies performed on a wide variety of protein-nucleic acid complexes could provide more insights into the details of protein-nucleic acid recognition mechanisms. The results of these studies can be used for rational design of experimental mutations that ascertain the structure-function relationships in proteins and protein-nucleic acid complexes.
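
The idea of treating amino acids and nucleotides as nodes of one graph, then looking for hubs and clusters, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the contact list and residue labels are hypothetical, the hub threshold `min_degree=4` is an arbitrary choice, and clusters are found as plain connected components, a simpler stand-in for the graph spectral method the abstract mentions.

```python
from collections import defaultdict


def build_graph(contacts):
    """Undirected graph: nodes are residues/nucleotides, edges are contacts."""
    graph = defaultdict(set)
    for u, v in contacts:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def find_hubs(graph, min_degree=4):
    """Nodes with at least `min_degree` neighbours (threshold is an assumption)."""
    return sorted(n for n, nbrs in graph.items() if len(nbrs) >= min_degree)


def find_clusters(graph):
    """Connected components, used here in place of spectral clustering."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical non-covalent contacts between side chains and nucleotides.
contacts = [
    ("ARG10", "DG5"), ("ARG10", "DC6"), ("ARG10", "LYS12"),
    ("ARG10", "GLU40"), ("LYS12", "DG5"), ("PHE70", "LEU88"),
]

graph = build_graph(contacts)
hubs = find_hubs(graph)
clusters = find_clusters(graph)
print("hubs:", hubs)
print("clusters:", clusters)
```

In this toy input, the first cluster mixes amino acids and nucleotides, which is the kind of interface cluster the abstract describes, while the second contains only protein side chains.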
