Global ETD Search

31	Design And Synthesis Of Benzimidazole Based Templates In Duplex And Quadruplex DNA Recognition And In Topoisomerase Inhibition Chaudhuri, Padmaparna 02 1900 The thesis entitled “Design and Synthesis of Benzimidazole Based Templates in Duplex and Quadruplex DNA Recognition and in Topoisomerase Inhibition” deals with the design and synthesis of several benzimidazole based molecules and their interaction with duplex and quadruplex DNA structures. It also elucidates the inhibitory effect of the compounds on the activity of the topoisomerase I enzyme of the parasitic pathogen Leishmania donovani. The work has been divided into five chapters. Chapter 1: An Introduction to DNA and its Interaction with Small Molecules. The first chapter provides an introduction to the double helical structure of DNA and the central dogma, which describes the flow of genetic information from DNA to RNA to protein. This chapter also presents an overview of the various types of small molecules that interact with duplex and quadruplex structures of DNA or interfere with the activity of DNA targeted enzymes like topoisomerase, and describes the importance of such molecules as chemotherapeutic agents. Chapter 2 deals with three isomeric, symmetrical bisbenzimidazole derivatives bearing pyridine on the two termini. The syntheses, duplex DNA binding and computational structure analyses of the molecules have been divided into two sections. Chapter 2A: Novel Symmetrical Pyridine Derivatized Bisbenzimidazoles: Synthesis and Unique Metal Ion Mediated Tunable DNA Minor Groove Binding. The first section deals with the synthesis and double stranded (ds) DNA binding characteristics of the three bisbenzimidazole derivatives. Despite being positional isomers, their relative binding affinities towards ds-DNA varied considerably. Fluorescence, circular dichroism and temperature dependent UV-absorption spectroscopy have been employed to characterize the ligand-DNA binding interaction. 
All spectroscopic studies revealed the strong A-T selective DNA binding affinities of the p- and m-pyridine derivatized molecules (p-pyben and m-pyben respectively) and indicated a dramatically weaker binding interaction of the ortho derivative (o-pyben) with ds-DNA. Additionally, the unique transition metal ion mediated tunable DNA binding shown by o-pyben has been described in this chapter. While the ds-DNA binding characteristics of p- and m-pyben remained unaffected in the presence of metal ions, that of o-pyben could be reversibly ‘switched off’ in the presence of divalent transition metal ions like Co2+, Ni2+, and Cu2+. Addition of EDTA reversed the effect and DNA binding was again observed. This interesting observation provides valuable insight into the DNA recognition property of these isomeric bisbenzimidazole derivatives. Figure 1. Molecular structures of pyridine derivatized symmetrical bisbenzimidazoles. Chapter 2B: Differential Binding of Positional Isomers of Symmetric Bisbenzimidazoles on DNA Minor-Groove: A Computational study. To explain the weak DNA binding affinity of o-pyben compared to p- or m-pyben, detailed ab initio/DFT computational analyses of the inherent structural features of the three isomers were performed both in the gas phase and in water. The study revealed the presence of an intramolecular hydrogen bond in o-pyben, between the benzimidazole proton (H3) and the pyridine nitrogen (N1). Additionally, potential energy scans for rotation about the bonds connecting the pyridine-benzimidazole and benzimidazole-benzimidazole fragments were performed. These revealed a surprising conformational rigidity in the o- isomer that resisted any out-of-plane twisting of the pyridine-benzimidazole fragment. The presence of intramolecular H-bonding was further confirmed by experimental determination of the pKa of the three isomers. 
Being bisbenzimidazole derivatives, the molecules bind to the minor groove of ds-DNA, with the benzimidazole protons forming hydrogen bonds with the DNA bases. However, in the o- derivative, the intramolecular hydrogen bonding made the crucial benzimidazole protons unavailable for DNA binding, thereby leading to its poor interaction with DNA. Chapter 3. Novel Series of Anthra[1,2-d]imidazole-6,11-dione Derivatives: Synthesis, DNA Binding and Inhibition of Topoisomerase I of Leishmania donovani. This chapter describes the synthesis of nine imidazole fused anthraquinone derivatives and their interaction with double-stranded DNA, investigated by UV-visible absorption spectroscopy and viscometric titrations. Figure 2. Molecular structures of the imidazole fused anthraquinone derivatives. All the molecules showed an intercalative mode of binding to double stranded DNA, though their relative binding affinities were different. Next, their inhibitory effects on the catalytic activity of the topoisomerase I enzyme of Leishmania donovani were investigated. L. donovani is the causative agent of human visceral leishmaniasis, a fatal disease affecting the liver and spleen. Five of the nine derivatives tested proved to be extremely efficient inhibitors of the enzyme. Of them, three showed greater inhibition potency than camptothecin, a well-established topoisomerase I inhibitor and the precursor for several clinically useful anti-tumor drugs. The molecules were shown to inhibit by stabilizing the enzyme-DNA cleavable complex, and the inhibition efficiency was found to be highly dependent on the pKa of the side-chain nitrogen. These results provide useful insights towards developing more potent inhibitors of the parasitic enzyme. As the compounds are synthetically facile, chemically stable and possess a long shelf life, they should be attractive candidates for the design of a novel family of topoisomerase I inhibitors. 
Indeed, the nature of the amine based side chain and its pKa would hold the key in such a design. Chapter 4 deals with a series of symmetrical bisbenzimidazole derivatives in which the benzimidazole units have been connected via different aromatic linkers. The syntheses, duplex DNA interaction, topoisomerase inhibition and quadruplex DNA stabilization shown by these four molecules have been divided into two sections. Chapter 4A. Synthesis, Duplex DNA Binding and Topoisomerase I Inhibition by Symmetrical Bisbenzimidazole Derivatives with Aromatic Linkers. This chapter describes the synthesis of four symmetrical bisbenzimidazole derivatives bearing aromatic linkers (phenyl, naphthyl or anthryl) between the benzimidazole rings. Next, their interaction with duplex DNA was investigated using fluorescence and temperature dependent UV absorption spectroscopy and viscometric titration techniques. Addition of DNA caused fluorescence enhancement of the molecules, implying their interaction with duplex DNA. All four molecules, on binding to double helical DNA, induced thermal stabilization of the latter. Viscometric titration of calf thymus DNA with the four compounds revealed a partial-intercalative mode of binding for the anthracene derivatized molecule 4. Next, their inhibitory effects on the catalytic activity of the topoisomerase I enzyme were studied. The anthracene derivatized compound (4) showed high inhibition of the enzyme catalyzed relaxation of supercoiled plasmid DNA. The naphthalene derivatized compound (3) exhibited weak inhibition, whereas the derivatives bearing 1,4- and 1,3-disubstituted benzene units (1 and 2 respectively) showed no inhibition. Figure 3. Molecular structures of the symmetrical bisbenzimidazole derivatives. Chapter 4B. Quadruplex DNA Stabilization by Symmetrical Bisbenzimidazole Derivatives with Aromatic Linkers. The ability of the aforementioned molecules to stabilize G-quadruplex structures was investigated next. 
DNA quadruplex secondary structures are potential molecular targets for new generation chemotherapeutic drugs; hence there is an impetus for developing quadruplex targeting molecules. The Tetrahymena thermophila telomeric sequence 5´-(T2G4)4-3´ was selected for the studies as it exhibits interesting structural polymorphism depending on whether quadruplex formation occurs in the presence of Na+ or K+. Circular dichroism and fluorescence anisotropy techniques were used to study the interaction of these newly synthesized molecules with quadruplex DNA. The thermal stabilization of the quadruplex structure induced by the molecules was also determined by temperature dependent UV absorption studies. The compounds 1, 3 and 4 stabilized the Na+ induced quadruplex without causing any structural alterations of the latter. However, the m-phenyl linker bearing molecule 2, above a certain [ligand]/[DNA] concentration ratio, caused a unique structural alteration of the Na+ induced quadruplex such that the CD signature of the latter resembled that of a K+ induced quadruplex structure. This result was corroborated by quadruplex thermal melting data and fluorescence anisotropy. Interestingly, this ligand was also able to induce secondary structure formation in randomly oriented ss-DNA, akin to the K+ induced quadruplex structure, even in the absence of Na+ or K+. Chapter 5. Synthesis and DNA Binding of Novel Biscationic Dimers of Bisbenzimidazole Systems. This chapter describes the design, synthesis and ds-DNA binding properties of four dicationic dimers of bisbenzimidazoles. Targeting long base pair sequences in double helical DNA is a key issue in chemical biology, and connecting different DNA binding modules by appropriate linkers is an attractive strategy for achieving this. The precursor monomer unit was a bisbenzimidazole derivative and an analogue of Hoechst 33258. 
Two such moieties were connected via bisoxyethylenic, 6- or 3-methylenic, or piperazinyl units to achieve linkers of varying length, rigidity and hydrophilicity. To study the interaction of the dimers with duplex DNA, fluorescence and circular dichroism spectroscopy were used. Two of the dimers (bbim-2ox-bbim and bbim-6met-bbim), bearing long flexible spacers, were able to target oligonucleotide sequences 13 AT base pairs long in a 1:1 binding mode with an affinity 8-10 times better than the precursor monomer or Hoechst 33258. Thermal denaturation experiments also showed high duplex stabilization induced by the same two dimers. All studies indicated a bidentate mode of binding in which both arms of the dimers participated in DNA binding. The molecules bearing the short and rigid linkers (bbim-3met-bbim and bbim-piper-bbim), on the other hand, showed low binding affinity towards duplex DNA, as indicated by fluorescence, circular dichroism and thermal melting studies. The short linkers probably did not favor simultaneous binding of both monomeric arms of the dimers to the DNA minor groove. The work reported in this chapter indicates the strong influence of the length and nature of the linker in determining drug/DNA binding affinity. Figure 4. Molecular structures of dicationic dimeric bisbenzimidazole derivatives. (Refer PDF File) Molecular Genetics DNA Topoisomerase Benzimidazole DNA Binding DNA Interaction Bisbenzimidazole DNA Minor Groove Topoisomerase I Leishmania donovani Novel Biscationic Dimers Biochemical Genetics
32	Uso da estratégia drogas gêmeas para a síntese de novos homodímeros de adutos de Morita-Baylis-Hillman potenciais candidatos a fármacos antiparasitários [Use of the twin drugs strategy for the synthesis of new homodimers of Morita-Baylis-Hillman adducts as potential antiparasitic drug candidates] Silva, Wagner André Vieira da 15 August 2016 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This work aimed to synthesize new Morita-Baylis-Hillman adducts (AMBH) and to evaluate their biological activity as potential drug candidates. The AMBH were synthesized using the twin drugs approach and evaluated against the promastigote form of Leishmania donovani, a species that causes the visceral and most severe form of the disease, for which the drug used in treatment is highly toxic. As the Michael acceptor for the Morita-Baylis-Hillman reaction (MBHR), ethylene glycol diacrylate (50) was synthesized by esterification of ethylene glycol (65) with acrylic acid (66). The first MBHR investigated was between two equivalents of 2-nitrobenzaldehyde (57) and one equivalent of diacrylate 50, in acetonitrile as solvent in the presence of DABCO, yielding two products: an adduct 67 and a homodimeric adduct 42. In the investigation of the experimental parameters of the MBHR, DMF, DABCO and room temperature proved to be the most favorable conditions for the formation of the homodimeric adducts, which were obtained in yields of 35-94% with reaction times between 24 h and 20 days and were isolated by liquid/liquid extraction and flash chromatography. The homodimers and the other AMBH evaluated gave satisfactory to excellent IC50 values for the homodimeric adducts (IC50 126.20 to 0.50M). All homodimeric AMBH were more bioactive than the corresponding AMBH, demonstrating the success of the twin drugs approach against the promastigote form of Leishmania donovani. Most strikingly, homodimer 49 was 393.1 times more active than the corresponding AMBH 56 and 1.24 times more active than amphotericin B, and showed no toxicity on exposure to human red blood cells (selectivity index SI > 400, against SI = 18.73 for amphotericin B). These results show that homodimer 49 is a promising molecule in the search for new drug candidates. Twin drugs Leishmania donovani Homodimers adducts Morita-Baylis-Hillman CIENCIAS EXATAS E DA TERRA::QUIMICA
33	Kala-azar in Nepal: public health evidence to support the elimination initiative Uranw, Surendra Kumar 25 September 2013 Visceral leishmaniasis (VL) or kala-azar is a parasitic infectious disease that is fatal if left untreated. Two Leishmania species are causal agents of VL: Leishmania infantum and Leishmania donovani. VL caused by L. infantum is a zoonosis and is endemic in countries around the Mediterranean basin and in Latin America. VL caused by L. donovani is assumed to be an anthroponosis and is endemic in East Africa and the Indian subcontinent.
VL is considered a major public health problem in the Indian subcontinent (ISC), and the annual case load of VL in this focus represents around 80% of the global burden. In Nepal, a quarter of the country’s population is estimated to be at risk of this disease. The disease in the ISC is caused by L. donovani, which is transmitted from man to man by the bite of the sandfly Phlebotomus argentipes. VL occurs predominantly among the poorest of the poor. Since 2005, the governments of Bangladesh, India and Nepal have been engaged in a collaborative effort to eliminate VL from the region. The strategies to control the disease include early diagnosis and treatment, along with vector control measures, effective disease surveillance, social mobilization and partnership building, and clinical and operational research. In recent years, considerable efforts have been made within the elimination initiative. Still, important gaps remain in the understanding of VL epidemiology and impact, as well as of the best approach to case management or vector control. These knowledge gaps may affect the success of the ongoing VL elimination initiative and make it difficult to meet the set target of bringing the incidence down to less than 1 case per 10,000 by 2015. 
With this background we focused on some of the knowledge gaps; we wanted to generate evidence and offer sound recommendations for policy makers to underpin the ongoing VL elimination initiative in the Indian subcontinent in general and in Nepal in particular.
We have, for the first time, described the epidemiology of L. donovani infection in high transmission areas in Nepal. The seroprevalence of L. donovani infection was 9% in these communities, but there was wide variation between endemic villages (5-15%). The seroprevalence rates remain, however, substantially lower than those observed in a parallel study in the neighbouring districts in Bihar, India. In our study, 39% of individuals living in a house with at least one recent VL case were serologically (DAT) positive, compared to 9% in the overall study population in the same endemic region. This pattern suggests that untreated VL cases are the main source of transmission and that sharing the same household is an important risk factor for L. donovani infection. Therefore, the VL elimination campaign recently initiated an active case detection strategy, including the search for active cases of VL and post-kala-azar dermal leishmaniasis (PKDL).
Generally, the risk factors for VL are linked to precarious housing conditions and an environment that provides excellent breeding sites for the sandfly vector. VL has thus been largely considered a disease of the rural poor. However, with occasional cases also being reported from towns, e.g. Dharan in south-eastern Nepal, questions were raised about a possible extension of transmission to urban areas.
We conducted an outbreak investigation, including a case-control study, among the residents of Dharan town. We documented several clusters of VL cases in the more peripheral wards of the town. These are wards with new settlements where the poorest migrants settle. 
They are typically a rural-urban interface, with most residents dependent on daily wages as agricultural labourers. However, several factors pointed to urban transmission. Firstly, we found a strong association between VL and certain housing factors. Secondly, the clustering of VL cases in space and the intra-household clustering make urban transmission more likely than infection due to migration. Finally, the entomological data provide further evidence in support of local transmission of VL inside the town: the vector P. argentipes was captured repeatedly inside the town, and some of the captured sandflies were infected with L. donovani.
We studied health seeking behavior and documented the household cost of VL care in a miltefosine-based programme after the intensified implementation of VL control efforts in Nepal. We enrolled 168 patients who had been treated for VL within twelve months prior to the survey in five districts in south-eastern Nepal. We observed a median delay of 25 days before patients presented to the appropriate level of the primary healthcare system. Most patients first visited unqualified local practitioners or traditional faith healers for VL care. With a median total cost of US$ 165 per episode of VL treatment, the economic burden of VL across all households was 11% of annual household income or 57% of median annual per capita income. About half of the households exceeded the catastrophic expenditure threshold of 10% of annual household income. Our findings suggest that, compared to previous studies, the economic burden of VL (as a % of household income) has indeed decreased. However, despite the provision of free diagnostics and drugs by the government, households still incurred substantial out-of-pocket medical expenditure, especially at private providers. The government should consider specific policies to reduce VL care costs, such as a conditional cash programme for travel and food, and a better health insurance scheme. 
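The burden measures quoted above (cost of care as a share of annual household income, and the 10% catastrophic expenditure threshold) come down to simple arithmetic. The following is a minimal Python sketch of that calculation. The household records and the function names are hypothetical illustrations, not data or methods from the thesis; only the 10% threshold is taken from the abstract.

```python
from statistics import median


def cost_burden(total_cost_usd, annual_income_usd):
    """Cost of one VL episode as a fraction of annual household income."""
    if annual_income_usd <= 0:
        raise ValueError("annual income must be positive")
    return total_cost_usd / annual_income_usd


def is_catastrophic(total_cost_usd, annual_income_usd, threshold=0.10):
    """True when the cost exceeds the threshold share of annual income.

    The 10% default follows the threshold quoted in the abstract.
    """
    return cost_burden(total_cost_usd, annual_income_usd) > threshold


# Hypothetical (cost, annual income) pairs in US$, for illustration only.
households = [(165.0, 1500.0), (90.0, 1800.0), (240.0, 1200.0), (120.0, 2000.0)]

burdens = [cost_burden(cost, income) for cost, income in households]
median_burden = median(burdens)
share_catastrophic = sum(
    is_catastrophic(cost, income) for cost, income in households
) / len(households)

print(f"median burden: {median_burden:.1%}")
print(f"households above threshold: {share_catastrophic:.0%}")
```

A strict inequality is used so that a household spending exactly 10% of its income is not counted as exceeding the threshold; a survey protocol could reasonably define this boundary differently.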
We monitored clinical outcomes of VL treatment with miltefosine (MIL) up to 12 months after the completion of therapy and explored the potential role of patient compliance, drug resistance, and reinfection. The initial cure rate was 95.8% and the cure rate at 6 months after treatment was 82.5%, which further dropped to 73.3% at 12 months after miltefosine treatment. The relapse rate was 10.8% at 6 months and 20.0% at 12 months, i.e. relapse is observed in one-fifth of miltefosine treated VL patients in Nepal. The decreased effectiveness of miltefosine observed in our study is an alarming signal for the ongoing VL elimination initiative and implies the need to review the drug policy in the Indian subcontinent. Relapse was most common among children (<12 years of age) and continued to occur beyond the commonly used 6-month follow-up period. No significant clinical risk factors or predictors of relapse apart from age <12 years were found. Parasite fingerprints of pre-treatment and relapse bone marrow isolates were similar within 8 tested patients, suggesting that clinical relapses were not due to re-infection with a new strain, but to true recrudescence. MIL blood levels at the end of treatment were similar for cured and relapsed patients. The MIL susceptibility of 131 VL isolates was also analysed in vitro with a promastigote assay, and the mean promastigote MIL susceptibility (IC50) of isolates from definite cures was similar to that of relapses.
We also assessed patient adherence to miltefosine treatment for VL given on an unsupervised ambulatory basis, prescribed under routine conditions (i.e. little or no time for treatment counselling) in government primary healthcare facilities. Our findings showed that adherence is a problem: the target of 90% of capsules taken was not reached in 15% of the enrolled patients. 
Gastrointestinal side effects and neglect of treatment after the resolution of clinical symptoms of VL were the main reasons for poor adherence. Effective counselling during treatment and a short take-home message on the action and side effects of miltefosine and on the importance of adherence are the best ways to prevent poor adherence.
Post-kala-azar dermal leishmaniasis is more commonly seen in inadequately treated cases, which are considered a reservoir of infection maintaining disease transmission. The occurrence of PKDL in Nepal is relatively low compared to neighbouring countries involved in the elimination initiative. Supervised and adequate treatment of VL seems essential to reduce the risk of PKDL development. Policy makers should include surveillance and case management of PKDL in the VL elimination programme. / Doctorate in Public Health / info:eu-repo/semantics/nonPublished Public health Kala-azar -- Nepal Epidemiology -- Nepal L. donovani epidemiology elimination disease control visceral leishmaniasis Nepal
34	Computational Analyses Of Proteins Encoded In Genomes Of Pathogenic Organisms : Inferences On Structures, Functions And Interactions Tyagi, Nidhi 11 1900 The availability of completely sequenced genomes for a number of organisms provides an opportunity to understand the molecular basis of the physiology, metabolism, regulation and evolution of these organisms. Significant understanding of the complexity of organisms can be obtained from the functional characterization of the repertoire of proteins encoded in their genomes. Computational approaches for recognizing the function of proteins of unknown function encoded in genomes often rely on the ability to detect well characterized homologues. Homology searches based on pair-wise sequence comparisons can reliably detect homologues with sequence identity of more than 30%. However, detecting homologues characterized by sequence identity below 30% is difficult using these methods. Distant homology relationships can be established using profiles or position specific scoring matrices, which encapsulate information about structurally and functionally conserved residues. These conserved residues imply high constraints at a particular amino acid residue site due to their involvement in structural stability, enzymatic activity, ligand binding, protein folding or protein–protein interactions. In addition, information on the three dimensional structures of proteins also aids in the detection of remote homologues, as the tertiary structures of proteins are better conserved than their primary structures. The overall objective of the work reported in this thesis is to employ various sensitive remote homology detection methods to recognize relevant functional information on proteins encoded mainly in pathogenic organisms. Since proteins do not work in isolation in a cell, it has become essential to understand the in vivo context of the functions of proteins. 
For this purpose, it is essential to have an understanding of all molecules that interact with a particular protein. Thus, another major area of bioinformatics has been the integration of protein-protein interaction information to enable a better understanding of the context of functional events. Protein-protein interaction analysis for host-pathogen systems can lead to useful insight into the mode of pathogenesis and its subsequent consequences in the host cell. Chapters 2-6 of the thesis discuss the sequence and structural characteristics, along with remote evolutionary relationships and functional implications, of uncharacterized proteins encoded in the genomes of the following pathogens: Helicobacter pylori, Plasmodium falciparum and Leishmania donovani. Chapters 6-8 mainly discuss various sequence, structural and functional aspects of protein kinases encoded in the genomes of various prokaryotes and viruses. Chapter 1 discusses background information and a literature survey in the areas of homology detection and prediction of protein-protein interactions. The growth of genomic data and the need for processing genomic data to infer the context of various functional events have been highlighted. Different approaches to recognizing the functions of proteins (experimental as well as computational) have been discussed. Various experimental and computational approaches to detect or predict protein-protein interactions have been mentioned. Chapter 2 discusses the recognition of non-trivial remote homology relationships involving proteins of Helicobacter pylori and their implications for function recognition. H. pylori is a microaerophilic, Gram-negative bacterial pathogen. It colonizes the human gastric mucosa and is a causative agent of gastroduodenal disease. The pathogen infects about 50% of the human population. It can lead to the development of mucosa-associated lymphoid tissue lymphoma. About 10% of the infected population develop gastric or duodenal ulcers and approximately 1% develop gastric cancer. H. 
pylori has been classified as a class I carcinogen by WHO. The pathogen is characterized by a type IV secretion system. The complete genomic sequences of three widely studied strains of Helicobacter pylori, 26695, J99 and HPAG1, are available. According to the genome analysis, the numbers of predicted open reading frames in strains 26695, J99 and HPAG1 are 1590, 1495 and 1536 respectively. Of the predicted H. pylori proteins from strains 26695, J99 and HPAG1, the numbers of proteins with no functional domain assignments in the Pfam database (Protein family database) are 453, 357 and 400 respectively. There are proteins in different strains of H. pylori in which one part of the protein is associated with at least one protein domain of known function, so that a preliminary indication of function is available, whereas the rest of the protein is not associated with any function. There are 772, 803 and 790 such segments of at least 45 residues, with no functional assignment currently available, in proteins from strains 26695, J99 and HPAG1 respectively. Sensitive remote homology detection methods have been employed to establish relationships for 294 amino acid sequences, and the results have been grouped into 4 categories. Results of homology detection have been further confirmed by studying the conservation of amino acid residues which are important for the functioning of the proteins concerned. (i) Remote relationships have been established involving protein domain families for which no bona fide member is currently known in H. pylori. For example, a DNA binding protein domain (Kor_B) has been assigned to an H. pylori protein at a sequence identity of 20%. A study involving secondary structure prediction and conservation of amino acid residues confirms the results of the homology detection methods. (ii) Remote relationships have been established involving H. pylori hypothetical proteins and protein domain families for which paralogous members are present in Helicobacter pylori. 
For example, Cytochrome_C, an electron transfer protein domain, could be associated with a Helicobacter pylori protein sequence which shows a sequence identity of 14% with sequences of bona fide cytochrome C. (iii) “Missing” metabolic proteins of H. pylori have also been recognized. For example, Aspartoacylase (EC 3.5.1.15) catalyzes the deacetylation of N-acetylaspartic acid to produce acetate and L-aspartate. This enzyme in the aspartate metabolism pathway has not been reported so far from H. pylori. A remote evolutionary relationship between an H. pylori protein and the Aspartoacylase domain has been established at a sequence identity of 17%, thus filling the gap in this metabolic pathway in the pathogen. (iv) New functional assignments have been made for domains in H. pylori sequences with prior domain assignments for the rest of the sequence. For example, a DNA methylase domain has been assigned to the C-terminal region of an H. pylori protein which already had a Helicase domain assigned to its N-terminal region. All this information should open avenues for further experimental probing, which will impact the design of inhibitors against this pathogen and will result in a better understanding of the pathogenesis of this organism in humans. Chapter 3 describes the prediction of protein–protein interactions between Helicobacter pylori and the human host. A lack of information on protein-protein interactions at the host-pathogen interface is impeding the understanding of the pathogenesis process. A recently developed, homology search-based method to predict protein-protein interactions is applied to the gastric pathogen Helicobacter pylori to predict the interactions between proteins of H. pylori and human proteins in vitro. Many of the predicted interactions could potentially occur between the pathogen and its human host during pathogenesis as we focused mainly on the H. 
pylori proteins that have a transmembrane region or are encoded in the pathogenicity island and those which are known to be secreted into the human host. By applying the homology search approach to the protein-protein interaction databases DIP and iPfam, in vitro interactions of a total of 623 H. pylori proteins with 6559 human proteins could be predicted. The predicted interactions include 549 hypothetical proteins of as yet unknown function encoded in the H. pylori genome and 13 experimentally verified secreted proteins. A total of 833 interactions involving the extracellular domains of transmembrane proteins of H. pylori could be predicted. Structural analysis of some of the examples reveals that the predicted interactions are consistent with the structural compatibility of the binding partners. Various probable interactions with discernible biological relevance are discussed in this chapter. For example, an interaction between the CFTR protein (NP_000483) and a multidrug resistance protein (HP1206) has been predicted. The structure of the CFTR intracellular domain is known in the homomeric form and consists of five AAA transport domains in tandem (PDB code 1XMI). Of the five identical subunits, two (the B chain and the E chain in the PDB structure) have been selected. The structure of the multidrug resistance protein of the pathogen has been modeled on the B chain of the template (sequence identity 32%). This exercise suggests that the interface residues in the model are congenial for interaction. This makes the structural complex feasible under in vitro conditions and suggests that the pathogen protein may compete with the host protein for occupancy. Chapter 4 describes the recognition of Plasmodium-specific protein domain families and their roles in the Plasmodium falciparum life cycle. Malaria in humans is caused by intracellular, eukaryotic protozoan parasites of apicomplexan nature belonging to the genus Plasmodium. Of the five species of Plasmodium, namely P. 
falciparum, P. ovale, P. vivax, P. malariae and P. knowlesi, that infect humans, P. falciparum causes lethal infection. P. falciparum proteins have diverged extensively during the course of evolution. The pathogen genome is rich in A+T composition, and its proteins are larger than the homologous proteins from other organisms due to the presence of low complexity regions. Organism-specific families are important as they play roles in the peculiar life style of an organism. If the organism is a pathogen, these family members may play roles in pathogenesis. Inhibiting these specific proteins is unlikely to interfere with the host system, as no homolog may be present in the host. In the present work we identify Plasmodium-specific protein families and their roles in different stages of the life cycle of the pathogen. A total of 5086 amino acid sequences (full-length sequences or fragments of proteins) show homology only with amino acid sequences from Plasmodium organisms and hence are Plasmodium-specific. These Plasmodium-specific amino acid sequences cluster into 106 Plasmodium-specific families (≥2 members per family). 14 Plasmodium-specific protein domain families with known physico-chemical properties are observed. These are involved in various important functions such as rosetting and sequestering of infected erythrocytes, binding to the surface of the host cell and the invasion process in the life cycle of the pathogen. Also, 89 new Plasmodium-specific protein domain families have been recognized. Various aspects of the members of Plasmodium-specific protein domain families, such as their potential to target the apicoplast, protein-protein interactions, expression profiles and domain organization, have been analyzed to derive relevant information about function. New Plasmodium-specific domain families with which no function can yet be associated could provide some insight into the much diverged Plasmodium species. These proteins may play a role in the parasite-specific life style. 
Experimental work on these Plasmodium-specific proteins might fill the gaps in the poorly understood physiology of this parasite. Chapter 5 presents a genome-wide compilation of low complexity regions (LCRs) in proteins. An in-depth analysis of the nature, structure and functional role of the proteins containing low complexity regions in Plasmodium falciparum was undertaken, given the high prevalence of LCRs in the proteome of this organism. Low complexity regions and repeat patterns have been recognized in proteins encoded in 986 genomes (68 archaea, 896 prokaryotes and 22 eukaryotes). Low complexity regions have been classified into the following three categories: a) Composition of LCRs: (i) LCRs can be stretches of homo amino acid residues; (ii) LCRs can be stretches of more than one amino acid residue type. b) Periodicity of amino acids in LCRs: certain amino acid residues can be observed at a specific periodicity in proteins. c) Repeat patterns: certain motifs of amino acid residues are repeated in a protein. 850 Plasmodium falciparum proteins are observed to have at least one repeat pattern in which the repeating unit is at least 5 amino acid residues long. Statistical analysis of single amino acid residue repeats indicates that the occurrence of stretches of homo amino acid residues is not a random event. Studies on recognition of functions, protein-protein interactions and organization of tethered domain(s) in proteins containing LCRs suggest that these proteins take part in a variety of functional events such as signal transduction, enzymatic processes, cell differentiation, pyrimidine biosynthesis, fatty acid biosynthesis and chromosomal replication. Representations of low complexity regions of Plasmodium falciparum in the Protein Data Bank suggest that LCRs can adopt regular secondary structure conformations (apart from disordered regions) in the 3-D structures of proteins. 
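As a small illustration of category a)(i) above, stretches of a single repeated amino acid residue can be located with a simple scan of the sequence. This is only a sketch and not the procedure used in the thesis; the function name `homopolymer_runs`, the minimum run length of 5 and the example sequence are all assumptions made for illustration.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=5):
    """Return (start, residue, length) for every run of one repeated residue.

    `min_length` is an assumed threshold chosen for illustration only.
    """
    runs = []
    position = 0
    for residue, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((position, residue, length))
        position += length
    return runs


# Hypothetical sequence: a run of 7 Asn and a run of 6 Glu.
example = "MK" + "N" * 7 + "KD" + "E" * 6 + "LKS"
print(homopolymer_runs(example))  # [(2, 'N', 7), (11, 'E', 6)]
```

Start positions are zero-based. Detecting mixed-composition LCRs, periodicity or repeat motifs (categories a)(ii), b) and c)) needs more than this run-length scan.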
Chapter 6 describes sequence analysis, structural modeling and evolutionary studies of Leishmania donovani hypusine pathway enzymes. Leishmania is a eukaryotic kinetoplastid protozoan parasite which causes leishmaniasis in humans. Hypusine, Nε-(4-amino-2-hydroxybutyl)lysine, is a non-standard polyamine-derived amino acid named after its two structural components, hydroxyputrescine and lysine. The eukaryotic translation initiation factor 5A (eIF5A) is the only cellular protein containing hypusine. Synthesis of hypusine is critical for the function of eIF5A and is essential for eukaryotic cell proliferation and survival. Formation of hypusine is the result of a two-step post-translational modification process involving the enzymes (i) deoxyhypusine synthase (DHS) and (ii) deoxyhypusine hydroxylase (DOHH). DHS, the first enzyme of the hypusine pathway, catalyzes the NAD-dependent transfer of the butylamino moiety of spermidine (the substrate) to the ε-amino group of a specific lysine residue of the eIF5A precursor and generates a deoxyhypusine-containing intermediate. DOHH, the second enzyme of the same pathway, catalyzes the hydroxylation of the deoxyhypusine-containing intermediate, generating hypusine-containing mature eIF5A. Two putative deoxyhypusine synthase (DHS) sequences, DHS34 and DHS20, have been identified in Leishmania donovani by Professor Madhubala and coworkers (Jawaharlal Nehru University, New Delhi), with whom the work embodied in this chapter was done in collaboration. Detailed comparison of the DHS34 sequence from Leishmania with the human DHS protein indicated conservation of functionally important residues. 3-D structural modeling of the protein suggested that residues around the active site were absolutely conserved. The NAD binding regions are located spatially close to each other; however, one NAD binding region was observed in a large insertion (225 amino acid residues long). Based on these observations, DHS34 was predicted to have enzymatic activity. 
Experimental studies done by our collaborators confirmed the preliminary results of the computational analysis. Based on sequence and structural analysis of the DHS20 and DOHH proteins, DHS20 and DOHH were proposed to be catalytically inactive and active respectively. Experimental studies on these proteins supported the results of the computational analysis. Deoxyhypusine synthase (DHS) and deoxyhypusine hydroxylase (DOHH) are key proteins conserved in the hypusine synthesis pathways of eukaryotes. Because they are highly conserved, they could be coevolving. Comparison of the genetic distance matrices of the DHS and DOHH proteins reveals that their evolutionary rates are better correlated with each other than with the rate of an unrelated protein such as Cytochrome C. This indicates that they are coevolving, and further indicates that even non-interacting proteins that are functionally coupled experience correlated evolution. However, this correlation does not extend to their tree topologies. Chapter 7 provides a classification scheme for protein kinases encoded in the genomes of prokaryotic organisms. An overwhelming majority of the Ser/Thr protein kinases identified by gleaning archaeal and eubacterial genomes could not be classified into any of the well-known Hanks and Hunter subfamilies of protein kinases. This is because the Hanks and Hunter classification scheme was developed on the basis of eukaryotic protein kinases, which are highly divergent from their prokaryotic homologues. Traditional sequence alignment and phylogenetic approaches have been used to identify and classify a large dataset of prokaryotic Ser/Thr protein kinases, which represent 72 subfamilies with at least 4 members in each. Such a clustering enables classification of prokaryotic Ser/Thr kinases and can be used as a framework to classify newly identified prokaryotic Ser/Thr kinases. 
After a series of searches in comprehensive sequence databases, it is recognized that 38 subfamilies of prokaryotic protein kinases are associated with a specific taxonomic level. For example, 4, 6 and 3 subfamilies have been identified that are currently specific to the phyla proteobacteria, cyanobacteria and actinobacteria respectively. Similarly, subfamilies which are specific to an order, sub-order, class, family or genus have also been identified. In addition, it was possible to identify organism-diverse subfamilies. Members of these clusters are from organisms of different taxonomic levels, such as archaea, bacteria, eukaryotes and viruses. Interestingly, the occurrence of several taxonomic level-specific subfamilies of prokaryotic kinases contrasts with the classification of eukaryotic protein kinases, in which most of the popular subfamilies occur diversely in several eukaryotes. Many prokaryotic Ser/Thr kinases exhibit a wide variety of modular organization, which indicates a degree of complexity in the protein-protein interactions and signaling pathways of these microbes. Chapter 8 focuses on the recognition and classification of protein kinases encoded in the genomes of viruses and their implications in various functions and diseases. Protein kinases encoded by viral genomes play a major role in the infection, replication and survival of viruses. Protein kinases were recognized using traditional sequence homology detection tools, sequence alignment methods and phylogenetic approaches. 646123 protein sequences from 35799 viral genomes (including strains) have been used in this analysis. Protein kinases are identified using a combination of profile-based search methods such as PSI-BLAST, RPS-BLAST and HMMER. 
Based upon sequence similarity over the length of the catalytic kinase domains, 479 protein kinase domains recognized in 244 viral genomes have been clustered into 46 subfamilies with a minimum sequence identity of 35% within a subfamily. Viral protein kinases are encoded in the genomes of retro-transcribing viruses or viruses which possess double-stranded DNA as genetic material. Based on the functional information available for one or more members of a subfamily, a putative function has been assigned to other members of the subfamily. Information regarding the interaction of viral protein kinases with viral/host proteins has also been considered to enhance understanding of the function of kinases in a subfamily. Out of 46 subfamilies, 14 subfamilies are characterized by various functions. Kinases belonging to the UL97, US69, UL13 and BGLF subfamilies are virus-specific. For 7 subfamilies, the nearest neighbors are from well-characterized eukaryotic protein kinase groups such as AGC, CAMK and CDK. Out of 25 new uncharacterized subfamilies observed in this analysis, 13 subfamilies are virus-specific. Different subfamilies have been characterized by various functions which are crucial for viral infection, such as synthesis of structural units, replication of genetic material, modification of cellular components, alteration of the host immune system and competition with cellular proteins for efficient usage of the host machinery. Also, many viral kinases share very high sequence identity (~97%) with their eukaryotic counterparts and represent a disease state. For example, a protein kinase encoded in Avian erythroblastosis virus shares 97% sequence identity with the catalytic domain of the human epidermal growth factor receptor tyrosine kinase. Leucine at position 861 in the human protein is substituted by Gln in cancer conditions; the viral protein kinase sequence possesses Gln at the corresponding position and thus represents the disease state. 
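The sequence identity figures quoted above, such as the 35% subfamily threshold and the 97% identity with the human kinase domain, are computed over aligned positions. The abstract does not state the exact definition used, so the following is only a sketch under one common convention: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function name and the two short aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, with '-' for gaps.
    Other conventions (e.g. dividing by the full alignment length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments: 6 ungapped columns, 5 of them identical.
identity = percent_identity("MKT-LLQA", "MRTALL-A")
print(round(identity, 1))  # 83.3
```

Under this convention a pair of domains would meet the subfamily criterion when the returned value is at least 35.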
Chapter 9 examines how well 3-D structural features of comparative models and of crystal structures of inactive forms of enzymes can be used to predict enzymes, taking protein kinases as a case study. With the advent of structural genomics initiatives, there is a surge in the number of proteins with 3-D structural information available even before the functional features of many of these proteins are understood. One useful annotation of a protein is its demarcation as an enzyme or non-enzyme solely from knowledge of its 3-D structure. This is facilitated by the identification of active sites and ligand binding sites in a protein. In this work, which was carried out in collaboration with Dr Jim Warwicker of Manchester University, UK, an approach developed by Warwicker and coworkers has been used. In the 3-D structures of proteins, the largest clefts are generally considered to be ligand binding sites. This feature, along with other sequence alignment-independent properties such as residue preferences, fraction of surface residues and secondary structure elements, has been considered to differentiate enzymes from non-enzymes. Electrostatic potential at the active site is one of the key properties utilized in this respect. Active sites in enzymes are generally associated with ionizable groups which can take part in catalysis. In addition to the feature of large clefts in enzymes, active site residues are in buried environments and show larger deviations in pKa values than surface residues. The method proposed by Warwicker and coworkers distinguishes enzymes from non-enzymes by considering the electrostatic features at clefts along with the sequence profile of the protein concerned. The conformation of the inactive state of an enzyme is not congenial to the catalytic function. In an ideal situation, a method should be capable of predicting an enzyme irrespective of whether the determined structure corresponds to the active or the inactive state. 
Peak potential values have been calculated using the Warwicker program for a set of 15 protein kinases for which 3-D structures are available in active as well as inactive conformations. Comparison of the peak potential values calculated for active and inactive conformations suggests that the algorithm can differentiate between the two, as values for active conformations are generally higher than the corresponding values for inactive conformations. However, the peak potential values are high enough for even the inactive conformations to be predicted as enzymes. Peak potential values calculated for homology models of protein kinases (for which crystal structures are already available), generated at different sequence identities with the template sequences, predict the protein kinases as enzymes and are comparable to the corresponding values for X-ray structures. This suggests that, for proteins for which no crystal or NMR structure is yet available and no good template with high sequence identity is present, peak potential values for models generated at low sequence identity can still give insight into the probable function of the protein as an enzyme. The enzyme/non-enzyme prediction algorithm was also found to be useful in confirming enzyme functionality using 3-D models of putative viral kinases. Initially, a putative kinase function had been assigned to these viral proteins based solely upon their sequence characteristics, such as the presence of residues/motifs which are important for the activity of the protein. The enzyme recognition method, which is not directly sensitive to these motifs, confirmed that all the analyzed putative viral kinases are enzymes. Chapter 10 presents the conclusions of the work embodied in the entire thesis. Very briefly, various computational approaches have been used to analyze and understand the structural and functional properties of the repertoire of proteins of pathogenic organisms. 
Analysis of uncharacterized protein domain families has helped to understand the functional implications of the constituent proteins. Experimental validation of these results can further facilitate the unraveling of functional aspects of proteins encoded in various pathogenic organisms. Apart from the studies embodied in the thesis, the author has been involved in two other studies, which are provided as appendices. Appendix 1 describes a comparison of the substitution pattern of amino acid residues of proteins encoded in the P. falciparum genome with the substitution pattern of the corresponding homologous proteins from non-Plasmodium organisms. Salient differences have been highlighted. Appendix 2 discusses a study of bacterial tyrosine kinases with the objective of recognizing all putative protein tyrosine kinases in E. coli. The computational study suggests that the protein SopA can be a potential tyrosine kinase, and this conclusion is being tested experimentally in a collaborator’s laboratory. Proteins Viral Genomes Microorganisms Microbiology Viral Protein Kinases Protein-protein Interactions Helicobacter Pylori Proteins Plasmodium-Specific Proteins Leishmania donovani Deoxyhypusine Synthase Protein Kinases Plasmodium falciparum Microbiology
