21

The roles of HSV-1 VP16 and ICP0 in modulating cellular innate antiviral responses

Hancock, Meaghan H. January 2010
Thesis (Ph.D.)--University of Alberta, 2010. / A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Virology, Medical Microbiology and Immunology. Title from pdf file main screen (viewed on January 10, 2010). Includes bibliographical references.
22

In Silico Perspectives on RNA Structures Modulating Viral Gene Expression and Mechanics of tRNA Transport

Gupta, Asmita January 2015
The repertoire of cellular functions mediated by ribonucleic acid (RNA) molecules has expanded considerably during the last two decades. The role played by RNA in controlling and regulating gene expression in viruses, prokaryotes and eukaryotes has been a matter of continuous investigation. This interest has arisen primarily due to the discoveries of cis-acting RNA structures like riboswitches, ribosensors and frameshift elements, which are found in either the 5’- or 3’-untranslated regions of mRNA or in the open reading frames. These structures control gene expression at the level of translation by either sequestering the Shine-Dalgarno (SD) sequence to regulate translation initiation or modulating ribosomal positions during an active translation process. Very often, these structures comprise an RNA pseudoknot, and it has been observed that these pseudoknots exist in a dynamic equilibrium with other intermediate structures. This equilibrium could be shifted by several factors including the presence of ions, metabolites, temperature and external force. RNA pseudoknots represent the most versatile and ubiquitous class of RNA structures in the cell, whose unique folding topology could be exploited in a number of ways by the cellular machinery. In this thesis, a thorough study of the programmed -1 ribosomal frameshifting (-1 PRF) process, a well-known gene regulation event employed by many RNA viruses, was carried out. -1 PRF is a translation recoding process, necessary for viruses to maintain a stoichiometric ratio of structural to enzymatic proteins. This ratio varies among different viral species. At the heart of this process lies an RNA pseudoknot accompanied by a seven-nucleotide-long sequence motif, which pauses an actively translating ribosome on mRNA and causes it to shift its reading frame.
The frameshift-inducing efficiency of a pseudoknot depends on multiple factors, for example the time scale of the ribosomal pause and RNA unfolding, subsequent refolding of the structure to native/intermediate states and/or environmental conditions. With the aim of illustrating the fundamentals of the process, multiple factors involved in -1 PRF were studied. Chapters 2-4 represent distinct aspects of the -1 PRF process, while Chapter 5 discusses a different work concerned with nucleocytoplasmic transport of tRNA carried out by the nuclear export receptor Exportin-t. Chapter 1 gives an overview of the different regulatory activities with which RNA structures and sequences are found to be associated and the evolution of these studies. It discusses the different types of structural motifs found to constitute tertiary RNA structure and secondary structure prediction and determination techniques. A brief description of ab initio RNA structure modeling and other relevant tools and methodologies used in this work has been presented. Details of techniques used in each study have been provided in relevant chapters. Chapter 2 describes how local factors like ionic conditions, hydration patterns, presence of protonated residues and single residue mutations affect the structural dynamics of an RNA pseudoknot involved in -1 PRF from a plant luteovirus. Single residue mutations in the loop regions or certain base-pair inversions in the stem regions of the pseudoknot increase the frameshift-inducing ability of the pseudoknot structure, while some others decrease this efficiency. However, it was not clear how the changes made to the wild-type (WT) RNA pseudoknot from Beet Western Yellow Mosaic virus were affecting the global structure in terms of its dynamics and other parameters. To study this, multiple all-atom molecular dynamics (MD) simulations were performed on WT and mutant structures created in silico.
The effect of presence and absence of magnesium ions on the structural geometry was also studied. The analysis was done to identify the increase/decrease in the number of hydrogen bonds formed by Watson-Crick base-pairs in stem region or non Watson-Crick pairs between stem and loop. Ionic and water densities were analyzed and the role of potential ribosome-pseudoknot interaction was elaborated. With the aim of mimicking ribosome induced unfolding of an RNA pseudoknot, steered molecular dynamics pulling experiments were performed. This work was done primarily to understand the unfolding pathway of Hairpin(H)-type pseudoknots in general and the intermediate structures formed. Chapter 3 describes the thermodynamics and mechanics associated with the mechanical pulling of -1 PRF inducing RNA pseudoknot and its mutants described in previous chapter. Analysis of the trajectories reveal relative unfolding patterns in terms of disruption of various hydrogen bonds. This study allowed us to pinpoint the kind of intermediate structures being formed during pulling and whether these intermediate structures correspond to any known secondary structures, such as simple stem-loops. This information could be used for gaining insights into the folding pathways of these structures. An RNA pseudoknot stimulates -1 PRF in conjunction with a heptanucleotide “slippery site” and an intervening spacer sequence. A comprehensive study of analyzing the sequence signatures and composition of all overlapping gene segments harboring these frameshift elements from four different RNA virus families was carried out. Chapter 4 describes the sequence composition of all overlapping gene segments in Astroviridae, Coronaviridae, Retroviridae and Luteoviridae viral families which are known to employ -1 PRF process for maintaining their protein products. Sequence analysis revealed preference for GC bases in the structure forming sequence regions. 
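The GC preference reported above for structure-forming regions can be illustrated with a minimal Python sketch. It assumes a plain GC-fraction measure computed over a whole sequence and over sliding windows; the function names and the toy sequence are hypothetical and are not taken from the thesis, which does not state the tools used for its sequence analysis.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


def sliding_gc(seq: str, window: int) -> list[float]:
    """Return the GC fraction of every window of the given length."""
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return [gc_fraction(seq[i:i + window]) for i in range(len(seq) - window + 1)]


# Toy sequence invented for illustration only (not a real viral element).
toy_rna = "AAAUUUAGCGCGGCACCGUCCGCGGAACAAAUU"
print(round(gc_fraction(toy_rna), 3))
print([round(x, 2) for x in sliding_gc(toy_rna, 10)])
```

Windows that overlap a GC-rich stem would show a higher fraction than the flanking regions, which is one simple way to visualise the kind of preference the abstract describes.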
A comparative study between multiple sequence alignment and secondary structure prediction revealed that while pseudoknots have a clear preference for specific base-pairs in their stem regions, viral families that employ a hairpin loop as the -1 PRF structure do not show this preference. Information derived from secondary structure prediction was then used for RNA ab initio modeling to generate tertiary structures. Furthermore, the structural parameters were calculated for the helices of the frameshift-inducing pseudoknots and were compared with the values calculated for a set of non -1 PRF inducing H-type pseudoknots. This study highlighted the differences between -1 PRF pseudoknots and other H-type pseudoknot structures as well as specific sequence and structural preferences of the former. Chapter 5 discusses the dynamics of a tRNA transport factor, Exportin-t (Xpot), which transports mature tRNA molecules from nucleus to cytoplasm and belongs to the Importin-β family of proteins. The global conformational dynamics of other transport receptors have been reported earlier, using coarse-grained modeling and Elastic Network Models (ENMs), but a detailed description of the dynamics at an all-atom resolution was lacking. This transport requires association of Xpot with RanGTP, a G-protein, in the nucleus and hydrolysis of RanGTP in the cytoplasm. The chain of events leading to tRNA release from Xpot after RanGTP hydrolysis was not studied previously. With these objectives, several molecular complexes containing Xpot bound to Ran or tRNA or both in the GTP and GDP ligand states, as well as free Xpot structures in nuclear and cytosolic forms, were studied. A combination of conventional and accelerated molecular dynamics simulations was used to study these molecular complexes. The study highlighted various aspects associated with tRNA release and the conformational change which occurs in Xpot in its cytosolic form.
The nuclear to cytosolic state transition in Xpot could be attributed to large fluctuations in the C-terminal region and dynamic hinge-points located between specific HEAT repeats. A secondary role of Xpot in controlling the quality of tRNA transport has been proposed based on multiple sequence and structure alignment with the Importin-β protein. The loss of critical contacts like hydrogen bonds and salt bridges at the Xpot/Ran and Xpot/tRNA interfaces was evaluated in order to study the initial effects of RanGTP hydrolysis and how it influences receptor-cargo binding. This study revealed various aspects of the tRNA transport process by Xpot that were not understood previously. The results presented in this thesis illustrate the role of RNA sequence elements and pseudoknots present in RNA viruses in modulating the -1 PRF process and how multiple environmental factors affect the -1 PRF inducing ability of the structure. From the studies of Xpot and its complexes, the effects of GTP hydrolysis leading to tRNA dissociation have been presented and the progression of conformational transition in Xpot after tRNA dissociation has been highlighted. Chapter 6 summarizes the major conclusions of this thesis work. The refolding of single stranded RNA chains subjected to a previous unfolding simulation is studied. Appendix A describes this work and initial results. Appendix B describes the effect of improved molecular dynamics force fields, containing corrections for the χ torsion angle for RNA, on the conformation of tertiary RNA structures. Part of the work presented in this thesis has been reported in the following publications. 1. Asmita Gupta and Manju Bansal. Local Structural and Environmental Factors Define the Efficiency of an RNA Pseudoknot Involved in Programmed Ribosomal Frameshift Process. J. Phys. Chem. B. 118 (41), pp 11905-11920. 2014. 2. Asmita Gupta, Senthilkumar Kailasam and Manju Bansal. Insights Into Nucleo-cytoplasmic Transport of tRNA by Exportin-t. Manuscript under review.
List of manuscripts that are being prepared from the work reported in Chapter 3 in this thesis. 1. Asmita Gupta and Manju Bansal. The role of sequence effects on altering the unfolding pathway of an RNA pseudoknot: a steered molecular dynamics study. Manuscript in preparation. 2. Asmita Gupta and Manju Bansal. Molecular basis for nucleocytoplasmic transport of tRNA by Exportin-t. Journal of Biomolecular Structure and Dynamics, May;33 Suppl 1:59-60, 2015.
23

Functional Characterization of C4 Protein of Cotton Leaf Curl Kokhran Virus - Dabawali

Guha, Debojit January 2012
1) Geminiviruses are a group of plant viruses which contain circular single stranded DNA molecules as their genomes and the capsid consists of two icosahedra fused together to form twinned or geminate particles. The largest genus in the family Geminiviridae is that of begomoviruses which are of two kinds; the monopartite begomoviruses which contain only one circular single stranded DNA molecule as their genome and the bipartite begomoviruses which contain two circular single stranded DNA molecules (designated DNA-A and DNA-B) as their genomes. In bipartite viruses, the two DNA molecules are enclosed in separate geminate capsids. 2) In bipartite begomoviruses, the DNA-A encodes the proteins essential for replication and encapsidation of the viral genome while the DNA-B encodes the proteins involved in movement. The DNA-B encodes two proteins: the BV1 or the nuclear shuttle protein (NSP) and BC1 or the cell-to-cell movement protein. Geminiviruses have DNA genomes which replicate inside the host cell nucleus. The NSP, which contains nuclear localization signal, brings the viral DNA from nucleus to the cytoplasm while the BC1 serves to take the viral genome to the cell periphery for movement to the neighbouring cell through the plasmodesmata. 3) The monopartite begomoviruses do not contain DNA-B (which, in bipartite begomoviruses, encodes the proteins involved in movement) and it has been suggested that some of the proteins encoded by DNA-A take up the movement function. Based on studies on TYLCV and CLCuV, a model has been proposed for the movement of monopartite begomoviruses according to which the coat protein (CP) of monopartite begomoviruses serves as the functional equivalent of the NSP of bipartite begomoviruses 4) The present thesis deals with the biochemical characterization of the C4 protein of the monopartite begomovirus CLCuKV-Dab. 
As stated in statement (3) above, the V2 and C4 proteins of monopartite begomoviruses have been implicated in cell-to-cell movement of the viral genome. In TYLCV, both the proteins were shown to be localized to the cell periphery and could move from one cell to another through the plasmodesmata. Further, the V2 protein of CLCuKV-Dab was shown to interact with the coat protein and bind to single stranded DNA. The biochemical properties of the C4 protein needed to be elucidated in order to strengthen the proposal of its probable involvement in movement. 5) The objectives of the present study were: i) Bioinformatic analysis of the C4 protein of CLCuKV-Dab. ii) Biochemical characterization of ATPase and pyrophosphatase activities of the C4 protein. iii) Studies on the effect of V2-C4 interaction on the enzymatic properties of C4. iv) Functional characterization of C4 in planta. 6) The FoldIndex© and PONDR analyses predicted the C4 protein of CLCuKV-Dab to be natively unfolded. Similarly, in PSIpred analysis, most of the C4 protein was predicted to be a random coil without any well-defined secondary structure. Further, the protein sequence was analyzed using the motifscan server. However, no motif for any specific function was predicted in the C4 protein. 7) The C4 gene was initially cloned into the pRSET-C vector and overexpressed as a histidine-tagged protein. The solubility of the protein was tested in various conditions, including low temperature (18 °C) after inducing the expression of the protein, buffers of various pH and different salt concentrations, but the protein remained insoluble. Subsequently, the protein was purified under denaturing conditions and attempts were made to refold the protein, but the protein precipitated during refolding. In order to get the C4 protein in soluble form, the C4 gene was subcloned into the pGEX-5X2 vector and overexpressed as a GST-tagged fusion protein (GST-C4).
Some of the GST-C4 protein was soluble and was purified using GST-bind resin. The purified fusion protein was observed as a 37 kDa band on an SDS-PAGE gel. The purified protein was accompanied by a degraded product of approximately 30 kDa. Both the intact GST-C4 protein and the degraded product were detected in western blot analysis using anti-GST antibody. 8) Because C4 has been implicated in the movement of monopartite begomoviruses and movement is an energy-requiring process, it was of interest to determine if GST-C4 possesses ATPase activity. The purified GST-C4 protein was incubated with γ-[32P]-ATP, the product of the reaction was separated by thin layer chromatography and the chromatography plate was analyzed by phosphorimager. The hydrolysis of ATP by GST-C4 and the release of inorganic phosphate were clearly observed, suggesting that GST-C4 might possess ATPase activity. 9) The reaction conditions for the ATPase activity of GST-C4 were standardized. The activity increased linearly up to 2.60 µM of the protein. The optimum temperature and pH for the ATPase activity were found to be 30 °C and 6.0, respectively. The activity was inhibited by EDTA, suggesting that it is dependent on divalent metal ions. The activity was stimulated by Mg²⁺, Mn²⁺ and Zn²⁺ but inhibited by Ca²⁺ ions. Further, in the time course experiment, it was observed that the ATPase activity increased linearly up to one hour. 10) The Km, Vmax and kcat for the ATPase activity of GST-C4 were found to be 51.72 ± 2.5 µM, 7.2 ± 0.54 nmoles/min/mg of the protein and 0.27 min⁻¹, respectively. Some of the other virally encoded ATPases have been found to exhibit a kcat similar to that found for GST-C4, but it is much lower than those of most of the prokaryotic and eukaryotic ATPases (as mentioned in Table 3.3, page 100, chapter 3). Further, the presence of the degraded product did not affect the kinetic constants, as described in chapter 3, pages 95-98.
It is possible that the enzymatic activity might increase upon interaction with some ligand. 11) In the absence of any putative ATP binding motifs, systematic deletions from the N- and C-termini were made to delineate the regions of C4 important for the ATPase activity. GST-N∆15-C4 and GST-N∆30-C4 exhibited approximately 70% reduction in the ATPase activity, while all the C-terminal deletion mutants (GST-C∆10-C4, GST-C∆20-C4 and GST-C∆30-C4) retained activity similar to the full length GST-C4 protein. This suggested that the N-terminal region of C4 may contain the residues important for the ATPase activity of GST-C4. 12) In the N-terminal region of C4, there is a sequence CSSSSR which closely resembles the sequence present at the active site of phosphotyrosine phosphatases (CXXXXXR). However, GST-C4 did not catalyze the hydrolysis of p-nitrophenyl phosphate, a substrate analogue commonly used to assay phosphotyrosine phosphatase activity. It was of interest to determine if the cysteine and arginine in this sequence are important for the ATPase activity of GST-C4. GST-R13A-C4 exhibited an approximately two fold reduction in Vmax, suggesting that R13 in C4 may be catalytically important for the ATPase activity of GST-C4. On the other hand, the C8A mutation did not affect the ATPase activity of GST-C4. 13) The GST-C4 protein was tested for its ability to hydrolyze several other phosphate containing compounds, as mentioned in chapter 2, pages 53-55. Among these compounds, GST-C4 catalyzed the hydrolysis of sodium pyrophosphate, that is, GST-C4 exhibited an inorganic pyrophosphatase activity. 14) The reaction conditions for the inorganic pyrophosphatase activity of GST-C4 were initially standardized. The pyrophosphatase activity of GST-C4 increased linearly up to 3.38 µM of the protein. The optimum temperature and pH for the pyrophosphatase activity were found to be 37 °C and 7.0, respectively.
The pyrophosphatase activity was inhibited by EDTA, suggesting that it is dependent on divalent metal ions. The activity was most efficiently stimulated by Mg²⁺, although it was also stimulated by Mn²⁺ and Zn²⁺ but inhibited by Ca²⁺ ions. Thus, the pyrophosphatase activity of GST-C4 resembles the family I inorganic pyrophosphatases in metal ion requirements. Further, the pyrophosphatase activity increased linearly up to 1 hour 30 minutes. 15) The Km, Vmax and kcat for the pyrophosphatase activity of GST-C4 were found to be 0.76 ± 0.04 mM, 141.16 ± 20 nmoles/min/mg of the protein and 5.2 min⁻¹, respectively. The kcat for the pyrophosphatase activity was approximately 20 fold higher than that for the ATPase activity (0.27 min⁻¹). 16) GST-N∆15-C4 and GST-N∆30-C4 exhibited >70% reduction in the pyrophosphatase activity, a finding similar to that for the ATPase activity. On the other hand, while GST-C∆10-C4 retained activity similar to the full length GST-C4 protein, GST-C∆20-C4 and GST-C∆30-C4 exhibited 20% and 60% reduction in the pyrophosphatase activity, respectively, as compared to the full length GST-C4 protein. This suggested that the C-terminal region of C4 may also contain the residues important for the pyrophosphatase activity of GST-C4. However, the C-terminal deletion mutants retained ATPase activity similar to the full length protein. 17) The pyrophosphatase activity of GST-C4 was stimulated more than three fold by several reducing agents. The C4 protein contains only one cysteine (at position 8 in the C4 sequence). This was the first clue that the cysteine may be important for the pyrophosphatase activity of the GST-C4 protein. Further, the pyrophosphatase activity of GST-C4 did not exhibit preference for a particular kind of reducing agent like that of the pyrophosphatase activity in Streptococcus faecalis.
18) GST-C8A-C4 exhibited more than two fold reduction in Vmax, suggesting that C8 may be catalytically important for the pyrophosphatase activity of GST-C4. On the other hand, the R13A mutation did not affect the pyrophosphatase activity of the GST-C4 protein. Thus, it is possible that during catalysis, the cysteine thiolate of C4 makes a … 19) The pyrophosphatase activity of GST-C4 was inhibited by vanadate and fluoride. Vanadate was found to be a competitive inhibitor with Ki 0.33 mM, while fluoride was a non-competitive inhibitor with Ki 2.82 mM. A comparative account of the two enzymatic activities of GST-C4 is presented in table 6.1.
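To make the kinetic constants in statements 10, 15 and 19 easier to interpret, here is a small Python sketch. It assumes the textbook Michaelis-Menten rate law and the standard models of competitive and pure non-competitive inhibition; the abstract does not say which fitting equations were used, so the function names and the choice of model are assumptions. Only the numerical constants come from the abstract.

```python
import math

# Constants reported in the abstract for the pyrophosphatase activity of GST-C4.
KM_MM = 0.76            # Km in mM
VMAX = 141.16           # Vmax in nmoles/min/mg of protein
KI_VANADATE_MM = 0.33   # competitive inhibitor
KI_FLUORIDE_MM = 2.82   # non-competitive inhibitor


def mm_rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def competitive_rate(s: float, vmax: float, km: float, i: float, ki: float) -> float:
    """Competitive inhibition raises the apparent Km by the factor (1 + [I]/Ki)."""
    return vmax * s / (km * (1 + i / ki) + s)


def noncompetitive_rate(s: float, vmax: float, km: float, i: float, ki: float) -> float:
    """Pure non-competitive inhibition lowers the apparent Vmax by (1 + [I]/Ki)."""
    return (vmax / (1 + i / ki)) * s / (km + s)


# At [S] = Km the uninhibited rate is half of Vmax.
print(mm_rate(KM_MM, VMAX, KM_MM))

# Effect of each inhibitor at a concentration equal to its Ki, with [S] = Km.
print(competitive_rate(KM_MM, VMAX, KM_MM, KI_VANADATE_MM, KI_VANADATE_MM))
print(noncompetitive_rate(KM_MM, VMAX, KM_MM, KI_FLUORIDE_MM, KI_FLUORIDE_MM))

# Ratio of the two reported kcat values (pyrophosphatase vs ATPase, both in 1/min).
kcat_ratio = 5.2 / 0.27
print(round(kcat_ratio, 1))
```

The last line gives a ratio a little above 19, which is consistent with the abstract's statement that the pyrophosphatase kcat is approximately 20 fold higher than the ATPase kcat.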
24

The Boiling Springs Lake Metavirome: Charting the Viral Sequence-Space of an Extreme Environment Microbial Ecosystem

Diemer, Geoffrey Scott 04 March 2014
Viruses are the most abundant organisms on Earth, yet their collective evolutionary history, biodiversity and functional capacity are not well understood. Viral metagenomics offers a potential means of establishing a more comprehensive view of virus diversity and evolution, as vast amounts of new sequence data become available for comparative analysis.
Metagenomic DNA from virus-sized particles (smaller than 0.2 microns in diameter) was isolated from approximately 20 liters of sediment obtained from Boiling Springs Lake (BSL) and sequenced. BSL is a large, acidic hot spring (with a pH of 2.2, and temperatures ranging from 50°C to 96°C) located in Lassen Volcanic National Park, USA. BSL supports a purely microbial ecosystem comprised largely of Archaea and Bacteria; however, the lower temperature regions permit the growth of acid- and thermo-tolerant Eukarya. This distinctive feature of the BSL microbial ecosystem ensures that virus types infecting all domains of life will be present. The metagenomic sequence data was used to characterize the types of viruses present within the microbial ecosystem, to ascertain the extent of genetic diversity and novelty comprising the BSL virus assemblage, and to explore the genomic and structural modalities of virus evolution.
Metagenomic surveys of natural virus assemblages, including the survey of BSL, have revealed that the diversity within the virosphere far exceeds what has currently been determined through the detailed study of viruses that are relevant to human health and agriculture. The number of as-yet-uncharacterized virus protein families present in the BSL assemblage was estimated by clustering analysis. Genomic context analysis of the predicted viral protein sequences in the BSL dataset indicates that most of the putative uncharacterized proteins are endemic or unique to BSL, and are largely harbored by known virus types. A comparative metagenomic analysis approach identified a set of conserved, yet uncharacterized BSL protein sequences that are commonly found in other similar and dissimilar environments.
New sequence data from metagenomic surveys of natural virus assemblages was also used to better characterize and define known virus protein families, as some of the viruses found in the BSL environment represent distant relatives of well-characterized isolates. By comparing viral genes and protein sequences from these highly divergent species, it is possible to better understand the dynamics of adaptation and evolution in the virosphere. Additionally, as structures of virus proteins continue to be experimentally determined by X-ray crystallography and cryo-electron microscopy, a merger of structural and metagenomic sequence data allows the opportunity to observe the structural dynamics underlying virus protein evolution.
Capsid (structural) proteins from two distinct strains of Microviridae, a globally ubiquitous and highly sequence-diverse virus family, were identified in, and isolated from, the BSL metagenomic DNA sample. These BSL capsid protein sequences, along with several other homologous sequences derived from metagenomic surveys and laboratory isolates, were mapped to the solved structure of a closely related capsid protein from the Spiroplasma phage-4 microvirus. Patterns of amino acid sequence conservation, unveiled by structure-based homology modeling analysis, revealed that the protein sequences within this family exhibit a remarkable level of plasticity, while remaining structurally and functionally congruent.
Lateral gene transfer is thought to have had a significant impact on the genomic evolution and adaptation of virus families. Genomic context analysis was also utilized to identify interviral gene transfer within the BSL virus assemblage. An ostensibly rare interviral gene transfer event, having transpired between single-stranded RNA and DNA virus types, was detected in the BSL metagenome. Similar genomes were subsequently detected in other ecosystems around the globe. The discovery of this new virus genome dramatically underscores the scope and importance of genetic mobility and genomic mosaicism as major forces driving the evolution of viruses.
The analyses conducted herein demonstrate the many ways in which viral metagenomic sequence data may be utilized not only to evaluate the composition of a natural virus assemblage, but also to discover new viral genes, and to better understand the dynamics of both genomic and structural evolution within the virosphere.
25

Computational Analyses Of Proteins Encoded In Genomes Of Pathogenic Organisms : Inferences On Structures, Functions And Interactions

Tyagi, Nidhi 11 1900
The availability of completely sequenced genomes for a number of organisms provides an opportunity to understand the molecular basis of physiology, metabolism, regulation and evolution of these organisms. Significant understanding of the complexity of organisms can be obtained from the functional characterization of repertoire of proteins encoded in their genomes. Computational approaches for recognition of function of proteins of unknown function encoded in genomes often rely on ability to detect well characterized homologues. Homology searches based on pair-wise sequence comparisons can reliably detect homologues with sequence identity more than 30%. However, detecting homologues characterized by sequence identity below 30% is difficult using these methods. Distant homology relationship can be established using profiles or position specific scoring matrices, which encapsulate information about structurally and functionally conserved residues. These conserved residues imply high constraints at a particular amino acid residue site due to their involvement in structural stability, enzymatic activity, ligand binding, protein folding or protein–protein interactions. In addition, information on three dimensional structures of proteins also aid in detection of remote homologues, as tertiary structures of proteins are conserved better than the primary structures of proteins. The gross objective of the work reported in this thesis is to employ various sensitive remote homology detection methods to recognize relevant functional information of proteins encoded mainly in pathogenic organisms. Since proteins do not work in isolation in a cell, it has become essential to understand the in vivo context of functions of proteins. For this purpose, it is essential to have an understanding of all molecules that interact with a particular protein. 
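The 30% sequence identity threshold mentioned above can be made concrete with a short Python sketch. It assumes two sequences that are already aligned (gaps written as "-") and defines identity over the columns where both sequences have a residue; other definitions of the denominator exist, and this choice, like the function names and the toy alignment, is an assumption rather than the method used in the thesis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def needs_sensitive_search(identity: float, threshold: float = 30.0) -> bool:
    """Flag pairs below the threshold the abstract gives for reliable pairwise detection."""
    return identity < threshold


# Toy alignment invented for illustration only.
seq_a = "MKT-AYIAKQR"
seq_b = "MRTLSYLGK-R"
pid = percent_identity(seq_a, seq_b)
print(round(pid, 1), needs_sensitive_search(pid))
```

Pairs falling below the threshold are the ones for which the abstract turns to profile-based or structure-aware methods instead of plain pairwise comparison.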
Thus, another major area of bioinformatics has been to integrate protein-protein interaction information to enable better understanding of context of functional events. Protein-protein interaction analysis for host-pathogen can lead to useful insight into mode of pathogenesis and subsequent consequences in host cell. Chapters 2-6 of the thesis discuss the sequence and structural characteristics along with remote evolutionary relationships and functional implications of uncharacterized proteins encoded in genomes of following pathogens: Helicobacter pylori, Plasmodium falciparum and Leishmania donovani. The Chapters 6-8 discuss mainly various sequence, structural and functional aspects of protein kinases encoded in genomes of various prokaryotes and viruses. Chapter 1 discusses background information and literature survey in the areas of homology detection and prediction of protein-protein interactions. The growth of genomic data and need for processing genomic data to infer context of various functional events have been highlighted. Different approaches to recognize functions of proteins (experimental as well as computational) have been discussed. Various experimental and computational approaches to detect/predict protein-protein interactions have been mentioned. Chapter 2 discusses recognition of non-trivial remote homology relationships involving proteins of Helicobacter pylori and their implications for function recognition. H. pylori is microaerophilic, Gram negative bacterial pathogen. It colonizes human gastric mucosa and is a causative agent of gastroduodenal disease. The pathogen infects about 50% of the human population. It can lead to development of Mucosa-associated lymphoid tissue lymphoma. About 10% of the infected population develop gastric or duodenal ulcer and approximately 1% develop gastric cancer. H. pylori has been classified as class I carcinogen by WHO. Pathogen is characterized by type IV secretion system. 
The complete genomic sequences of three widely studied strains including 26695, J99 and HPAG1 of Helicobacter pylori are available. According to the genome analysis, the number of predicted open reading frames in strain 26695, J99 and HPAG1 are 1590, 1495 and 1536 respectively. Out of predicted H. pylori proteins from 26695, J99 and HPAG1 strains, numbers of proteins with no functional domain assignments in Pfam database (Protein family database) are 453, 357 and 400 respectively. There are proteins in different strains of H. pylori genomes where one part of the protein is associated with at least one protein domain of known function and hence preliminary indication of their functions is available whereas rest of the region is not associated with any function. There are 772, 803 and 790 such segments in proteins from strains 26695, J99 and HPAG1 respectively with at least 45 residues with no functional assignment currently available. Sensitive remote homology detection methods have been employed to establish relationships for 294 amino acid sequences and results have been grouped into 4 categories. Results of homology detection have been further confirmed by studying conservation of amino acid residues which are important for functioning of the proteins concerned. (i) Remote relationship has been established involving protein domain families for which no bonafide member is currently known in H. pylori. For example: DNA binding protein domain (Kor_B) has been assigned to a H. pylori protein at sequence identity of 20%. Study involving secondary structure prediction and conservation of amino acid residues confirms the results of homology detection methods. (ii) Remote relationship has been established involving H. pylori hypothetical proteins and protein domain families, for which paralogous members are present in Helicobacter pylori. 
For example, Cytochrome_C, an electron transfer protein domain could be associated with a Helicobacter pylori protein sequence which shows a sequence identity of 14% with sequences of bonafide cytochrome C. (iii) “Missing” metabolic proteins of H. pylori have also been recognized. For example, Aspartoacylase (EC 3.5.1.15) catalyzes deacetylation of N-acetylaspartic acid to produce acetate and L-aspartate. This enzyme in aspartate metabolism pathway has not been reported so far from H. pylori. A remote evolutionary relationship between a H. pylori protein and Aspartoacylase domain has been established at sequence identity of 17% thus filling the gap in this metabolic pathway in the pathogen. (iv) New functional assignments for domains in H. pylori sequences with prior assignment of domains for the rest of the sequences have been made. For example, DNA methylase domain has been assigned to C-terminal region of H. pylori protein which already had Helicase domain assigned to the N-terminal region of the protein. All these information should open avenues for further probing by carrying out experiments which will impact the design of inhibitor against this pathogen and will result in better understanding of pathogenesis of this organism in human. Chapter 3 describes prediction of protein–protein interactions between Helicobacter pylori and the human host. A lack of information on protein-protein interactions at the host-pathogen interface is impeding the understanding of the pathogenesis process. A recently developed, homology search-based method to predict protein-protein interactions is applied to the gastric pathogen, Helicobacter pylori to predict the interactions between proteins of H. pylori and human proteins in vitro. Many of the predicted interactions could potentially occur between the pathogen and its human host during pathogenesis as we focused mainly on the H. 
pylori proteins that have a transmembrane region or are encoded in the pathogenic island and those which are known to be secreted into the human host. By applying the homology search approach to protein-protein interaction databases DIP and iPfam, in vitro interactions for a total of 623 H. pylori proteins with 6559 human proteins could be predicted. The predicted interactions include 549 hypothetical proteins of as yet unknown function encoded in the H. pylori genome and 13 experimentally verified secreted proteins. A total of 833 interactions involving the extracellular domains of transmembrane proteins of H. pylori could be predicted. Structural analysis of some of the examples reveals that the predicted interactions are consistent with the structural compatibility of binding partners. Various probable interactions with discernible biological relevance are discussed in this chapter. For example, interaction between CFTR protein (NP_000483) and multidrug resistance protein (HP1206) has been predicted. The structure of the CFTR intracellular domain is known in the homomeric form and consists of five AAA transport domains in tandem (PDB code 1XMI). Out of the five identical subunits, two subunits (the B chain and the E chain in the PDB structure) have been selected. The structure of multidrug resistance protein of the pathogen based on the B chain (sequence identity 32%) of the template has been modeled. This exercise suggests that interface residues in the model are congenial for interaction. This makes the structural complex feasible in in vitro conditions and suggests that the pathogen protein may compete for occupancy with the host protein. Chapter 4 describes recognition of Plasmodium-specific protein domain families and their roles in Plasmodium falciparum life cycle. Malaria in humans is caused by the parasites of intracellular, eukaryotic protozoan of apicomplexan nature belonging to the genus Plasmodium. Out of five species of Plasmodium, namely, P. 
falciparum, P. ovale, P. vivax, P. malariae and P. knowlesi, which infect humans, P. falciparum causes lethal infection. P. falciparum proteins have diverged extensively during the course of evolution. The pathogen genome is rich in A+T composition, and its proteins are larger than the homologous proteins from other organisms due to the presence of low complexity regions. Organism-specific families are important as they play roles in the peculiar life style of an organism. If the organism is a pathogen, then these family members may play roles in pathogenesis. Inhibiting these specific proteins is unlikely to interfere with the host system, as no homolog may be present in the host. In the present work we identify Plasmodium-specific protein families and their roles in different stages of the life cycle of the pathogen. A total of 5086 amino acid sequences (full length sequences/fragments of proteins) show homology only with amino acid sequences from Plasmodium organisms and hence are Plasmodium-specific. These Plasmodium-specific amino acid sequences cluster into 106 Plasmodium-specific families (≥2 members per family). 14 Plasmodium-specific protein domain families with known physico-chemical properties are observed. These Plasmodium-specific protein domain families are involved in various important functions such as rosetting and sequestering of infected erythrocytes, binding to the surface of the host cell and the invasion process in the life cycle of the pathogen. Also, 89 new Plasmodium-specific protein domain families have been recognized. Analysis of various aspects of members of Plasmodium-specific protein domain families, such as their potential to target the apicoplast, protein-protein interactions, expression profile and domain organization, has been performed to derive relevant information about function. New Plasmodium-specific domain families for which no function can be associated could provide some insight into the much diverged Plasmodium species. These proteins may play a role in the parasite-specific life style. 
Experimental work on these Plasmodium-specific proteins might fill the gaps in the less understood physiology of this parasite. Chapter 5 presents a genome-wide compilation of low complexity regions (LCRs) in proteins. An in-depth analysis of the nature, structure and functional role of the proteins containing low complexity regions in Plasmodium falciparum was undertaken, given the high prevalence of LCRs in the proteome of this organism. Low complexity regions and repeat patterns have been recognized in proteins encoded in 986 genomes (68 archaea, 896 prokaryotes and 22 eukaryotes). Low complexity regions have been classified into the following three categories: (a) composition of LCRs: LCRs can be (i) stretches of a single amino acid residue type or (ii) stretches of more than one amino acid residue type; (b) periodicity of amino acids in LCRs: certain amino acid residues can be observed at specific periodicities in proteins; (c) repeat patterns: certain motifs of amino acid residues are repeated in the protein. 850 Plasmodium falciparum proteins are observed to have at least one repeat pattern where the repeating unit is at least 5 amino acid residues long. Statistical analysis of single amino acid residue repeats indicates that the occurrence of stretches of homo amino acid residues is not a random event. Studies on recognition of functions, protein-protein interactions and organization of tethered domain(s) in proteins containing LCRs suggest that these proteins are part of a variety of functional events such as signal transduction, enzymatic processes, cell differentiation, pyrimidine biosynthesis, fatty acid biosynthesis and chromosomal replication. Representations of low complexity regions of Plasmodium falciparum in the Protein Data Bank suggest that LCRs can take the conformation of regular secondary structure (apart from disordered regions) in 3-D structures of proteins. 
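The first LCR category above, stretches of a single amino acid residue type, can be illustrated with a short Python sketch. This is only a minimal illustration and not the method used in the thesis: the function name `homo_residue_runs`, the `min_length` cutoff of 5 and the toy sequence are all assumptions made for the example.

```python
from itertools import groupby


def homo_residue_runs(sequence, min_length=5):
    """Return (residue, start, length) for each run of one repeated residue.

    `min_length` is a hypothetical cutoff chosen for illustration only;
    the thesis abstract does not state the threshold it used.
    """
    runs = []
    position = 0  # 0-based index of the first residue of the current run
    for residue, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((residue, position, length))
        position += length
    return runs


# Toy sequence (invented): a run of 7 asparagines and a run of 6 glutamates.
toy_sequence = "MK" + "N" * 7 + "KD" + "E" * 6 + "L"
print(homo_residue_runs(toy_sequence))
```

On real proteomes one would normally rely on an established low-complexity filter; the sketch only shows what a "stretch of homo amino acid residues" means operationally.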
Chapter 6 describes sequence analysis, structural modeling and evolutionary studies of Leishmania donovani hypusine pathway enzymes. Leishmania is a eukaryotic kinetoplastid protozoan parasite which causes leishmaniasis in humans. Hypusine is a non-standard, polyamine-derived amino acid, Nε-(4-amino-2-hydroxybutyl)lysine, and is named after its two structural components, hydroxyputrescine and lysine. The eukaryotic translation initiation factor 5A (eIF5A) is the only cellular protein containing hypusine. Synthesis of hypusine is critical for the function of eIF5A and is essential for eukaryotic cell proliferation and survival. Formation of hypusine is the result of a two-step post-translational modification process involving the enzymes (i) deoxyhypusine synthase (DHS) and (ii) deoxyhypusine hydroxylase (DOHH). DHS, the first enzyme in the hypusine pathway, catalyzes the NAD-dependent transfer of the butylamino moiety of spermidine (substrate) to the ε-amino group of a specific lysine residue of the eIF5A precursor and generates a deoxyhypusine-containing intermediate. DOHH, the second enzyme in the same pathway, catalyzes the hydroxylation of the deoxyhypusine-containing intermediate, generating hypusine-containing mature eIF5A. Two putative deoxyhypusine synthase (DHS) sequences, DHS34 and DHS20, have been identified in Leishmania donovani by Professor Madhubala and coworkers (Jawaharlal Nehru University, New Delhi), with whom the work embodied in this chapter was done in collaboration. Detailed comparison of the DHS34 sequence from Leishmania with the human DHS protein indicated conservation of functionally important residues. 3-D structural modeling studies of the protein suggested that residues around the active site were absolutely conserved. The NAD binding regions are located spatially close to each other; however, one NAD binding region was observed in a large (225 amino acid residues long) insertion. Based on these observations, DHS34 was predicted to have enzymatic activity. 
Experimental studies done by our collaborators confirmed the preliminary results of the computational analysis. Based on sequence and structural analysis of the DHS20 and DOHH proteins, DHS20 and DOHH were proposed to be catalytically inactive and active, respectively. Experimental studies on these proteins supported the results of the computational analysis. Deoxyhypusine synthase (DHS) and deoxyhypusine hydroxylase (DOHH) are key proteins conserved in the hypusine synthesis pathways of eukaryotes. Because they are highly conserved, they could be coevolving. Comparison of the genetic distance matrices of the DHS and DOHH proteins reveals that their evolutionary rates are better correlated with each other than with the rate of an unrelated protein such as Cytochrome C. This indicates that they are coevolving, and further indicates that even non-interacting proteins that are functionally coupled experience correlated evolution. However, this correlation does not extend to their tree topologies. Chapter 7 provides a classification scheme for protein kinases encoded in the genomes of prokaryotic organisms. The overwhelming majority of the Ser/Thr protein kinases identified by gleaning archaeal and eubacterial genomes could not be classified into any of the well known Hanks and Hunter subfamilies of protein kinases. This is because the Hanks and Hunter classification scheme was developed from eukaryotic protein kinases, which are highly divergent from their prokaryotic homologues. For a large dataset of prokaryotic Ser/Thr protein kinases, traditional sequence alignment and phylogenetic approaches have been used to identify and classify the kinases, which represent 72 subfamilies with at least 4 members in each. Such a clustering enables classification of prokaryotic Ser/Thr kinases and can be used as a framework to classify newly identified prokaryotic Ser/Thr kinases. 
After series of searches in a comprehensive sequence databases, it is recognized that 38 subfamilies of prokaryotic protein kinases are associated to a specific taxonomic level. For example 4, 6 and 3 subfamilies have been identified that are currently specific to phylum proteobacteria, cyanobacteria and actinobacteria respectively. Similarly, subfamilies which are specific to an order, sub-order, class, family and genus have also been identified. In addition to these, it was also possible to identify organism-diverse subfamilies. Members of these clusters are from organisms of different taxonomic levels, such as archaea, bacteria, eukaryotes and viruses. Interestingly, occurrence of several taxonomic level specific subfamilies of prokaryotic kinases contrasts with classification of eukaryotic protein kinases in which most of the popular subfamilies of eukaryotic protein kinases occur diversely in several eukaryotes. Many prokaryotic Ser/Thr kinases exhibit a wide variety of modular organization which indicates a degree of complexity in protein-protein interactions and the signaling pathways in these microbes. Chapter 8 focuses on recognition, classification of protein kinases encoded in genomes of viruses and their implications in various functions and diseases. Protein kinases encoded by viral genomes play a major role in infection, replication and survival of viruses. Using traditional sequence homology detection tools, sequence alignment methods and phylogenetic approaches, protein kinases were recognized. 646123 protein sequences from 35799 viral genomes (including strains) have been used in this analysis. Protein kinases are identified using a combination of profile-based search methods such as PSI-BLAST, RPS-BLAST and HMMER approaches. 
Based upon sequence similarity over the length of catalytic kinase domains, 479 protein kinase domains recognized in 244 viral genomes have been clustered into 46 subfamilies with minimum sequence identity of 35% within a subfamily. Viral protein kinases are encoded in genomes of retro-transcribing viruses or viruses which possess double stranded DNA as genetic material. Based on the available functional information present for one or more members of a subfamily, a putative function has been assigned to other members of the subfamily. Information regarding interaction of viral protein kinases with viral/host protein has also been considered for enhancing understanding of function of kinases in a subfamily. Out of 46 subfamilies, 14 subfamilies are characterized by various functions. Kinases belonging to UL97, US69, UL13 and BGLF subfamilies are virus specific. For 7 subfamilies, nearest neighbors are from well characterized eukaryotic protein kinase groups such as AGC, CAMK and CDK. Out of 25 new uncharacterized subfamilies observed in this analysis, 13 subfamilies are virus specific. Different subfamilies have been characterized by various functions which are crucial for viral infection such as synthesis of structural unit, replication of genetic material, modification of cellular components, alteration in host immune system, competing with cellular protein for efficient usage of host machinery. Also, many viral kinases share very high sequence identity (~97%) with their eukaryotic counterpart and represent disease state. For example, a protein kinase encoded in Avian erythroblastosis virus shares 97% sequence identity with catalytic domain of human epidermal growth factor receptor tyrosine kinase. Leucine at position 861 in human protein is substituted by Gln in cancer conditions; the viral protein kinase sequence possesses Gln at corresponding position and thus represents disease state. 
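The grouping of kinase domains into subfamilies with a minimum within-subfamily sequence identity of 35% can be sketched as follows. This is a hedged illustration under stated assumptions, not the procedure used in the thesis: the sequences are assumed to be already aligned to equal length, the greedy assignment order is an arbitrary choice, and the names `percent_identity`, `greedy_cluster` and the toy sequences are invented for the example.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def greedy_cluster(aligned, threshold=35.0):
    """Place each sequence in the first cluster where it reaches the threshold
    against every existing member; otherwise start a new cluster.

    This guarantees that all pairs inside a cluster meet the threshold, but
    the result depends on input order (an assumption of this sketch).
    """
    clusters = []
    for name, seq in aligned.items():
        for cluster in clusters:
            if all(percent_identity(seq, aligned[m]) >= threshold for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Invented toy alignment, not data from the thesis.
toy_alignment = {
    "k1": "GSGAFGKVYL",
    "k2": "GSGAFGEVYK",
    "k3": "GKGTYATVHR",
    "k4": "PLLMWWCCDD",
}
print(greedy_cluster(toy_alignment))
```

A phylogenetic or hierarchical clustering method would give an order-independent grouping; the greedy version is used here only because it makes the "minimum identity within a subfamily" condition easy to see.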
Chapter 9 examines how well 3-D structural features of comparative models and of crystal structures of inactive forms of enzymes can be used to predict enzymes, taking protein kinases as a case study. With the advent of structural genomics initiatives, there is a surge in the number of proteins with 3-D structural information even before the functional features of many of these proteins are understood. One of the useful annotations of a protein is its classification as an enzyme or non-enzyme solely from knowledge of its 3-D structure. This is facilitated by the identification of active sites and ligand binding sites in a protein. In this work, which was carried out in collaboration with Dr Jim Warwicker of Manchester University, UK, an approach developed by Warwicker and coworkers has been used. In the 3-D structure of proteins, the largest clefts are generally considered to be ligand binding sites. This feature, along with other sequence alignment independent properties such as residue preferences, fraction of surface residues and secondary structure elements, has been considered to differentiate enzymes from non-enzymes. Electrostatic potential at the active site is one of the key properties utilized in this respect. Active sites in enzymes are generally associated with ionizable groups which can take part in catalysis. In addition to the feature of large clefts in enzymes, active site residues are in buried environments and show larger deviations in pKa values than surface residues. The method proposed by Warwicker and co-workers distinguishes proteins into enzymes and non-enzymes by considering the electrostatic features at clefts along with the sequence profile of the protein concerned. The conformation of the inactive state of an enzyme is not congenial to catalytic function. In an ideal situation, a method should be capable of predicting an enzyme irrespective of whether the determined structure corresponds to the active or the inactive state. 
Peak potential values have been calculated by using Warwicker program for a set of 15 protein kinases for which 3-D structures are present in active as well in inactive conformations. Comparison of peak potential values calculated for active and inactive conformations suggests that algorithm can differentiate between active and inactive conformations as value for active conformations are generally higher than corresponding values for inactive conformations. However, the peak potential values are high enough for even the inactive conformations to be predicted as enzyme. Peak potential values calculated for generated homology models of protein kinases (for which crystal structures are already available) at different sequence identities with template sequences predict protein kinases as enzymes and their peak potential values are comparable to corresponding values for X-ray structures. This suggests that proteins for which there are no crystal or NMR structures yet available and no good template with high sequence identity are present, peak potential values for models generated at low sequence identity can still give insight into probable function of protein as an enzyme. The enzyme/non-enzyme prediction algorithm was also found to be useful in confirming enzyme functionality using 3-D models of putative viral kinases. Initially, putative function of kinase has been assigned to these viral proteins based solely upon their sequence characteristics such as presence of residues/motifs which are important for activity of the protein. The enzyme recognition method which is not directly sensitive to these motifs confirmed that all the analyzed putative viral kinases are enzymes. Chapter 10 presents conclusions of work embodied in the entire thesis. Very briefly, various computational approaches have been used to analyze and understand structural and functional properties of repertoire of proteins of pathogenic organisms. 
Analysis of uncharacterized protein domain families has helped to understand the functional implications of constituent proteins. Experimental validation of these results can further facilitate unraveling of functional aspects of proteins encoded in various pathogenic organisms. Apart from studies embodied in the thesis, author has been involved in two other studies, which are provided as appendices. Appendix 1 describes comparison of substitution pattern of amino acid residues of protein encoded in P. falciparum genome with substitution pattern of corresponding homologous proteins from non-Plasmodium organisms. Salient differences have been highlighted. Appendix 2 discusses study of bacterial tyrosine kinases with an objective of recognition of all putative protein tyrosine kinases in E. coli. Computational study suggests that protein SopA can be a potential tyrosine kinase and this conclusion is being tested experimentally in collaborator’s laboratory.
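The comparison of genetic distance matrices described above for DHS and DOHH (Chapter 6) can be illustrated by correlating the corresponding pairwise distances of two proteins across the same set of organisms. The sketch below is a minimal illustration under assumptions, not the analysis performed in the thesis: the function names and the numeric values are invented, and both matrices are assumed to be symmetric and to list organisms in the same order.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def matrix_correlation(dist_a, dist_b):
    """Correlate pairwise distances of two proteins over the same organisms."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Invented toy distances for four organisms (not values from the thesis).
protein_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
# A second protein whose distances are exactly twice those of the first.
protein_b = [[2 * d for d in row] for row in protein_a]
print(matrix_correlation(protein_a, protein_b))
```

A high correlation between two such matrices suggests similar evolutionary rates; as the abstract notes, this need not carry over to agreement between the tree topologies.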
26

Mechanism Of Replication Of Sesbania Mosaic Virus (SeMV)

Govind, Kunduri 02 1900 (has links) (PDF)
No description available.
