Global ETD Search

1	Assignment and assessment of orthology and gene function / Storm, Christian, January 2004 (has links) Diss. (summary) Stockholm : Karolinska Institutet, 2004. / Includes 5 papers.
2	Exploring sequence-structure-function relationships in proteins using classification schemes Cheek, Sara Anne. January 2005 (has links) (PDF) Thesis (Ph.D.) -- University of Texas Southwestern Medical Center at Dallas, 2005. / Not embargoed. Vita. Bibliography: 182-209.
3	Identification of two distinct lineages of macaque gamma-2 herpesviruses / Strand, Kurt B. January 2002 (has links) Thesis (Ph. D.)--University of Washington, 2002. / Vita. Includes bibliographical references (leaves 163-232).
4	Design and data analysis of kinome microarrays 2014 May 1900 (has links) Catalyzed by protein kinases, phosphorylation is the most important post-translational modification in eukaryotes and is involved in the regulation of almost all cellular processes. Investigating phosphorylation events and how they change in response to different biological conditions is integral to understanding cellular signaling processes in general, as well as to defining the role of phosphorylation in health and disease. A recently-developed technology for studying phosphorylation events is the kinome microarray, which consists of several hundred "spots" arranged in a grid-like pattern on a glass slide. Each spot contains many peptides of a particular amino acid sequence chemically fixed to the slide, with different spots containing peptides with different sequences. Each peptide is a subsequence of a full protein, containing an amino acid residue that is known or suspected to undergo phosphorylation in vivo, as well as several surrounding residues. When a kinome microarray is exposed to cell lysate, the protein kinases in the lysate catalyze the phosphorylation of the peptides on the array. By measuring the degree to which the peptides comprising each spot are phosphorylated, insight can be gained into the upregulation or downregulation of signaling pathways in response to different biological treatments or conditions. There are two main computational challenges associated with kinome microarrays. The first is array design, which involves selecting the peptides to be included on a given array. The level of difficulty of this task depends largely on the number of phosphorylation sites that have been experimentally identified in the proteome of the organism being studied. For instance, thousands of phosphorylation sites are known for human and mouse, allowing considerable freedom to select peptides that are relevant to the problem being examined. 
In contrast, few sites are known for, say, honeybee and soybean. For such organisms, it is useful to expand the set of possible peptides by using computational techniques to predict probable phosphorylation sites. In this thesis, existing techniques for the computational prediction of phosphorylation sites are reviewed. In addition, two novel methods are described for predicting phosphorylation events in organisms with few known sites, with each method using a fundamentally different approach. The first technique, called PHOSFER, uses a random forest-based machine-learning strategy, while the second, called DAPPLE, takes advantage of sequence homology between known sites and the proteome of interest. Both methods are shown to allow quicker or more accurate predictions in organisms with few known sites than comparable previous techniques. Therefore, the use of kinome microarrays is no longer limited to the study of organisms having many known phosphorylation sites; rather, this technology can potentially be applied to any organism having a sequenced genome. It is shown that PHOSFER and DAPPLE are suitable for identifying phosphorylation sites in a wide variety of organisms, including cow, honeybee, and soybean. The second computational challenge is data analysis, which involves the normalization, clustering, statistical analysis, and visualization of data resulting from the arrays. While software designed for the analysis of DNA microarrays has also been used for kinome arrays, differences between the two technologies prompted the development of PIIKA, a software package specifically designed for the analysis of kinome microarray data. By comparing with methods used for DNA microarrays, it is shown that PIIKA improves the ability to identify biological pathways that are differentially regulated in a treatment condition compared to a control condition. 
Also described is an updated version, PIIKA 2, which contains improvements and new features in the areas of clustering, statistical analysis, and data visualization. Given the previous absence of dedicated tools for analyzing kinome microarray data, as well as their wealth of features, PIIKA and PIIKA 2 represent an important step in maximizing the scientific value of this technology. In addition to the above techniques, this thesis presents three studies involving biological applications of kinome microarray analysis. The first study demonstrates the existence of "kinotypes" - species- or individual-specific kinome profiles - which has implications for personalized medicine and for the use of model organisms in the study of human disease. The second study uses kinome analysis to characterize how the calf immune system responds to infection by the bacterium Mycobacterium avium subsp. paratuberculosis. Finally, the third study uses kinome arrays to study parasitism of honeybees by the mite Varroa destructor, which is thought to be a major cause of colony collapse disorder. In order to make the methods described above readily available, a website called the SAskatchewan PHosphorylation Internet REsource (SAPHIRE) has been developed. Located at the URL http://saphire.usask.ca, SAPHIRE allows researchers to easily make use of PHOSFER, DAPPLE, and PIIKA 2. These resources facilitate both the design and data analysis of kinome microarrays, making them an even more effective technique for studying cellular signaling. Peptide arrays Kinome arrays Phosphorylation Cellular signaling Sequence homology Machine learning Clustering Statistical methods
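The peptide construction described in this abstract (a residue known or suspected to be phosphorylated, plus several surrounding residues) can be illustrated with a minimal sketch. This is not the thesis's array-design procedure: the function name `candidate_peptides`, the flank width of 7 residues, and the restriction to serine, threonine, and tyrosine are assumptions made for illustration only.

```python
def candidate_peptides(protein, flank=7, residues="STY"):
    """Return (1-based site position, peptide) pairs for each candidate site.

    `flank` is a hypothetical window half-width; windows are truncated
    at the ends of the protein rather than padded.
    """
    peptides = []
    for i, aa in enumerate(protein):
        if aa in residues:
            start = max(0, i - flank)
            end = min(len(protein), i + flank + 1)
            peptides.append((i + 1, protein[start:end]))
    return peptides


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for position, peptide in candidate_peptides("MKRAAASPEELLKTGG", flank=3):
        print(position, peptide)
```

A real design step would additionally filter these candidates by experimental or predicted evidence of phosphorylation, which this sketch does not attempt.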
5	Towards a complete sequence homology concept: Limitations and applications Wong, Wing-Cheong 14 December 2011 (has links) (PDF) Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended for them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Since the matching of SPs/TMs creates the illusion of matching hydrophobic cores, the inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, this work shows explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, this study finds that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. A workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users is provided.
While E-value guided extrapolation of protein domain annotation from libraries such as Pfam with the HMMER suite is indispensable for hypothesizing about the function of experimentally uncharacterized protein sequences, it can also complicate the annotation problem. In HMMER2, the E-value is computed from the score via a logistic function or via a domain model-specific extreme value distribution (EVD); the lower of the two is returned as the E-value for the domain hit in the query sequence. We demonstrated that, for thousands of domain models, this treatment results in switching from the EVD to the statistical model with the logistic function when scores grow (for Pfam release 23, 99% in the global mode and 75% in the fragment mode). If the score corresponding to the breakpoint results in an E-value above a user-defined threshold (e.g., 0.1), a critical score region with conflicting E-values from the logistic function (below the threshold) and from EVD (above the threshold) does exist. Thus, this switch will affect E-value guided annotation decisions in an automated mode. To emphasize, switching in the fragment mode is of no practical relevance since it occurs only at E-values far below 0.1. Unfortunately, a critical score region does exist for 185 domain models in the hmmpfam and 1748 domain models in the hmmsearch global-search mode. For 145 out of the respective 185 models, the critical score region is indeed populated by actual sequences. In total, 24.4% of their hits have a logistic function-derived E-value < 0.1 when the EVD provides an E-value > 0.1. Examples of false annotations are provided and the appropriateness of a logistic function as an alternative to the EVD is critically discussed. This work shows that misguided E-value computation coupled with non-globular regions embedded in domain model libraries not only causes annotation errors in public databases but also limits the extrapolation power of protein function prediction tasks.
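The "lower of two E-values" rule and the resulting critical score region can be made concrete with a small sketch. It assumes that the logistic bound has the form P = 1/(1 + 2^s) for a bit score s and that the EVD is a Gumbel-type tail with location `mu` and scale `lam`; these forms, the function names, and all parameter values below are illustrative assumptions, not taken from the abstract or from any particular Pfam model.

```python
import math


def logistic_pvalue(score_bits):
    # Assumed logistic bound: P = 1 / (1 + 2^score).
    # Intended for moderate scores; very large scores would overflow.
    return 1.0 / (1.0 + 2.0 ** score_bits)


def evd_pvalue(score_bits, mu, lam):
    # Gumbel-type tail: P(S > x) = 1 - exp(-exp(-lam * (x - mu))).
    return -math.expm1(-math.exp(-lam * (score_bits - mu)))


def evalues(score_bits, mu, lam, n_targets):
    """Return (logistic E-value, EVD E-value, reported E-value)."""
    e_logistic = n_targets * logistic_pvalue(score_bits)
    e_evd = n_targets * evd_pvalue(score_bits, mu, lam)
    # The lower of the two is reported, as described in the abstract.
    return e_logistic, e_evd, min(e_logistic, e_evd)


def in_critical_region(score_bits, mu, lam, n_targets, threshold=0.1):
    """True when the logistic E-value passes the threshold but the EVD one does not."""
    e_logistic, e_evd, _ = evalues(score_bits, mu, lam, n_targets)
    return e_logistic < threshold < e_evd


if __name__ == "__main__":
    # Hypothetical model parameters chosen only to show the effect.
    mu, lam, n_targets = 0.0, 0.3, 1
    for score in (0.0, 4.0, 8.0, 20.0):
        print(score, evalues(score, mu, lam, n_targets),
              in_critical_region(score, mu, lam, n_targets))
```

Under these assumptions, a score falls in the critical region whenever the reported (minimum) E-value is below the threshold only because the logistic term was chosen over the EVD term.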
So far, the preceding work has demonstrated that sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches of non-polar residues. We found that there are two types of transmembrane helices (TMs) in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring devices. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intramembrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Theoretical insights from this work were applied to problems of function prediction for specific uncharacterized gene/protein sequences (for example, APMAP and ARXES) and for the functional classification of TM-containing proteins. ddc:000
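The two quantities named in this abstract, hydrophobicity and sequence complexity of a TM segment, can be computed in many ways. The sketch below uses mean Kyte-Doolittle hydropathy and Shannon entropy of the residue composition as stand-ins; these measures, the function names, and both cut-off values are assumptions for illustration and are not the criterion developed in the thesis.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(segment):
    # Average hydropathy over the residues of the segment.
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def shannon_entropy(segment):
    # Entropy (in bits) of the residue composition; low values mean low complexity.
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_simple_tm(segment, min_hydrophobicity=2.5, max_entropy=2.5):
    # Hypothetical thresholds: high hydrophobicity together with low complexity.
    return (mean_hydrophobicity(segment) >= min_hydrophobicity
            and shannon_entropy(segment) <= max_entropy)


if __name__ == "__main__":
    # Toy segments, not real transmembrane helices.
    for seg in ("LLLLLLLLLLLLLLLLLLLL", "GSTNDQEKRHGSTNDQEKRH"):
        print(seg, round(mean_hydrophobicity(seg), 2),
              round(shannon_entropy(seg), 2), looks_like_simple_tm(seg))
```

In a workflow of the kind the abstract describes, segments flagged this way would be masked or excluded before running the homology search; the sketch only covers the flagging step.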
6	Shb and its homologues : signaling in T lymphocytes and fibroblasts / Lindholm, Cecilia K., January 2002 (has links) Diss. (summary) Uppsala : Univ., 2002. / Includes 4 papers.
7	Hidden Markov models for remote protein homology detection / Wistrand, Markus, January 2005 (has links) Diss. (summary) Stockholm : Karolinska Institutet, 2006. / Includes 4 papers.
8	Mechanism of homologous recombination : from crystal structures of RecA-single stranded DNA and RecA-double stranded DNA filaments / Chen, Zhucheng. January 2009 (has links) Thesis (Ph. D.)--Cornell University, January, 2009. / Vita. Includes bibliographical references (leaves 121-134).
9	Nekovalentní interakce tryptofanu ve struktuře proteinu / Non-covalent interactions of tryptophan in protein structure Sokol, Albert January 2019 (has links) A thorough knowledge of non-covalent amino acid interactions within a protein structure is essential for a complete understanding of its conformation, stability and function. Among all the amino acids that usually make up a protein, tryptophan is distinguished both by its rarity and by the size of its side chain, which is formed by an indole group. It is able to provide various types of indispensable interactions within the protein and between different polypeptide chains, but also between the protein and a biological membrane. In addition, it is the most commonly used natural fluorophore. Databases of solved protein structures are commonly used to study amino acid interactions and allow more or less complex analyses of the issue. Thus many non-covalent interactions that may occur between tryptophan and other amino acids have been found. However, most of these analyses focus on specific interactions and do not examine tryptophan's environment as a whole, where all amino acids interact. Some newly developed methods have been used in this thesis, specifically the occurrence profiles of the individual amino acids around the indole group of tryptophan, and the results were compared with the available literature. The amino acid that has the greatest preference for tryptophan turned out to be tryptophan again, and...
10	Towards a complete sequence homology concept: Limitations and applications Wong, Wing-Cheong 11 August 2011 (has links) Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended for them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Since the matching of SPs/TMs creates the illusion of matching hydrophobic cores, the inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, this work shows explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, this study finds that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. A workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users is provided.
While E-value guided extrapolation of protein domain annotation from libraries such as Pfam with the HMMER suite is indispensable for hypothesizing about the function of experimentally uncharacterized protein sequences, it can also complicate the annotation problem. In HMMER2, the E-value is computed from the score via a logistic function or via a domain model-specific extreme value distribution (EVD); the lower of the two is returned as the E-value for the domain hit in the query sequence. We demonstrated that, for thousands of domain models, this treatment results in switching from the EVD to the statistical model with the logistic function when scores grow (for Pfam release 23, 99% in the global mode and 75% in the fragment mode). If the score corresponding to the breakpoint results in an E-value above a user-defined threshold (e.g., 0.1), a critical score region with conflicting E-values from the logistic function (below the threshold) and from EVD (above the threshold) does exist. Thus, this switch will affect E-value guided annotation decisions in an automated mode. To emphasize, switching in the fragment mode is of no practical relevance since it occurs only at E-values far below 0.1. Unfortunately, a critical score region does exist for 185 domain models in the hmmpfam and 1748 domain models in the hmmsearch global-search mode. For 145 out of the respective 185 models, the critical score region is indeed populated by actual sequences. In total, 24.4% of their hits have a logistic function-derived E-value < 0.1 when the EVD provides an E-value > 0.1. Examples of false annotations are provided and the appropriateness of a logistic function as an alternative to the EVD is critically discussed. This work shows that misguided E-value computation coupled with non-globular regions embedded in domain model libraries not only causes annotation errors in public databases but also limits the extrapolation power of protein function prediction tasks.
So far, the preceding work has demonstrated that sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches of non-polar residues. We found that there are two types of transmembrane helices (TMs) in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring devices. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intramembrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Theoretical insights from this work were applied to problems of function prediction for specific uncharacterized gene/protein sequences (for example, APMAP and ARXES) and for the functional classification of TM-containing proteins. info:eu-repo/classification/ddc/000 ddc:000
