  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
261

Algorithms for Characterizing Peptides and Glycopeptides with Mass Spectrometry

He, Lin January 2013
The emergence of tandem mass spectrometry (MS/MS) technology has significantly accelerated protein identification and quantification in proteomics. It enables high-throughput analysis of proteins and their quantities in a complex protein mixture. A mass spectrometer can easily and rapidly generate large volumes of mass spectral data for a biological sample. This volume of data makes manual interpretation impossible and has also brought numerous challenges in automated data analysis. Algorithmic solutions have been proposed and provide indispensable analytical support in current proteomic experiments. However, new algorithms are still needed to either improve result accuracy or provide additional data analysis capabilities for both protein identification and quantification. Accurate identification of proteins in a sample is the first requirement of a proteomic study. In many cases, a mass spectrum cannot provide complete information to identify the peptide without ambiguity because of the inefficiency of the peptide fragmentation technique and the prevalence of noise. We propose ADEPTS to address this problem using the complementary information provided in different types of mass spectra. Meanwhile, the occurrence of posttranslational modifications (PTMs) on proteins is another major issue that prevents the interpretation of a large portion of spectra. Using current software tools, users have to specify possible PTMs in advance. However, the number of possible PTMs has to be limited, since specifying more PTMs to the software leads to a longer running time and lower result accuracy. Thus, we develop DeNovoPTM and PeaksPTM to provide efficient and accurate solutions. Glycosylation is one of the most frequently observed PTMs in proteomics. It plays important roles in many disease processes and thus has attracted growing research interest. However, the lack of algorithms that can identify intact glycopeptides has become the major obstacle that hinders glycoprotein studies. We propose a novel algorithm, GlycoMaster DB, to fulfil this urgent requirement. Additional research is presented on protein quantification, which studies the changes of protein quantity by comparing two or more mass spectral datasets. A crucial problem in quantification is to correct the retention time distortions between different datasets. Heuristic solutions from previous research have been used in practice, but none of them has stated a clear optimization goal. To address this issue, we propose a combinatorial model and practical algorithms for this problem.
262

Exploring the structural diversity and engineering potential of thermophilic periplasmic binding proteins

Cuneo, Matthew Joseph 02 May 2007
The periplasmic binding protein (PBP) superfamily is found throughout the genosphere of both prokaryotic and eukaryotic organisms. PBPs function as receptors in bacterial solute transport and chemotaxis systems; however, the same fold is also used in transcriptional regulators, enzymes, and eukaryotic neurotransmitter receptors. This versatility has been exploited for structure-based computational protein design experiments in which PBPs have been engineered to bind novel ligands and serve as biosensors for the detection of small-molecule ligands relevant to biomedical or defense-related interests. In order to further understand functional adaptation from a structural biology perspective, and to provide a set of robust starting points for engineering novel biosensors by structure-based design, I have characterized the ligand-binding properties and solved the structures of nine PBPs from various thermophilic bacteria. Analysis of these structures reveals a variety of mechanisms by which diverse function can be encoded in a common fold. Remodeling of secondary structure elements (such as insertions, deletions, and loop movements) and redecoration of amino acid side chains are common diversification mechanisms in PBPs. Furthermore, the relationship between hinge-bending motion and ligand binding is critical to understanding the function of natural or engineered adaptations in PBPs. Three of these proteins were solved in both the presence and absence of ligand, which allowed, for the first time, the observation and analysis of ligand-induced structural rearrangements in thermophilic PBPs. This work revealed that the magnitude and transduction of local and global ligand-induced motions are diverse throughout the PBP superfamily. Through the analysis of the open-to-closed transition, and the identification of natural structural adaptations in thermophilic members of the PBP superfamily, I identify strategies that can be applied to computational protein design to significantly improve current approaches. / Dissertation
263

Modeling Biological Systems from Heterogeneous Data

Bernard, Allister P. 24 April 2008
The past decades have seen rapid development of numerous high-throughput technologies to observe biomolecular phenomena. High-throughput biological data are inherently heterogeneous, providing information at the various levels at which organisms integrate inputs to arrive at an observable phenotype. Approaches are needed not only to analyze heterogeneous biological data, but also to model the complex experimental observation procedures. We first present an algorithm for learning dynamic cell cycle transcriptional regulatory networks from gene expression and transcription factor binding data. We learn regulatory networks using dynamic Bayesian network inference algorithms that combine evidence from gene expression data through the likelihood and evidence from binding data through an informative structure prior. We next demonstrate how analysis of cell cycle measurements such as gene expression data is obstructed by synchrony loss in synchronized cell populations. Due to synchrony loss, population-level cell cycle measurements are convolutions of the true measurements that would have been observed when monitoring individual cells. We introduce a fully parametric, probabilistic model, CLOCCS, capable of characterizing multiple sources of asynchrony in synchronized cell populations. Using CLOCCS, we formulate a constrained convex optimization deconvolution algorithm that recovers single-cell estimates from observed population-level measurements. Our algorithm offers a solution for monitoring individual cells rather than a population of cells that lose synchrony over time. Using our deconvolution algorithm, we provide a global, high-resolution view of cell cycle gene expression in budding yeast, from an initial cell progressing through its cell cycle to the newly created mother and daughter cells. Proteins, and not gene expression, are responsible for all cellular functions, and we need to understand how proteins and protein complexes operate. We introduce PROCTOR, a statistical approach capable of learning the hidden interaction topology of protein complexes from direct protein-protein interaction data and indirect co-complexed protein interaction data. We provide a global view of the budding yeast interactome depicting how proteins interact with each other via their interfaces to form macromolecular complexes. We conclude by demonstrating how our algorithms, utilizing information from heterogeneous biological data, can provide a dynamic view of regulatory control in the budding yeast cell cycle. / Dissertation
264

Regulation of Global Transcription Dynamics During Cell Division and Root Development

Orlando, David Anthony January 2009
The successful completion of many critical biological processes depends on the proper execution of complex spatial and temporal gene expression programs. With the advent of high-throughput microarray technology, it is now possible to measure the dynamics of these expression programs on a genome-wide level. In this thesis we present work focused on utilizing this technology, in combination with novel computational techniques, to examine the role of transcriptional regulatory mechanisms in controlling the complex gene expression programs underlying two fundamental biological processes: the cell cycle and the development and differentiation of an organ. We generate a dataset describing the genomic expression program which occurs during the cell division cycle of Saccharomyces cerevisiae. By concurrently measuring the dynamics in both wild-type cells and mutant cells that do not express either S-phase or mitotic cyclins, we quantify the relative contributions of cyclin-CDK complexes and transcriptional regulatory networks in the regulation of the cell-cycle expression program. We show that, contrary to previously accepted models, CDKs are not the sole regulators of periodic transcription, and we hypothesize an oscillating transcriptional regulatory network which could work independently of, or in tandem with, the CDK oscillator to control the cell-cycle expression program. To understand the acquisition of cellular identity, we generate a nearly complete gene expression map of the Arabidopsis thaliana root at the resolution of individual cell types and developmental stages. An analysis of these data reveals a representative set of dominant expression patterns which are used to begin defining the spatiotemporal transcriptional programs that control development within the root. Additionally, we develop computational tools that improve the interpretability and power of these data. We present CLOCCS, a model for the dynamics of population synchrony loss in time-series experiments. We demonstrate the utility of CLOCCS in integrating disparate datasets and present a CLOCCS-based deconvolution of the cell-cycle expression data. A deconvolution method is also developed for the Arabidopsis dataset, increasing its resolution to cell-type/section subregion specificity. Finally, a method for identifying biological processes occurring on multiple timescales is presented and applied to both datasets. It is through the combination of these new genome-wide expression studies and computational tools that we begin to elucidate the transcriptional regulatory mechanisms controlling fundamental biological processes. / Dissertation
265

Evolutionary Genomics of Methyl-accepting Chemotaxis Proteins

Alexander, Roger Parker 10 September 2007
The general goal of this project was to use computational biology to understand signal transduction mechanisms in prokaryotes. Its specific focus was to characterize the cytoplasmic domain of methyl-accepting chemotaxis proteins (MCP_CD), a protein domain central to the function of chemotaxis, the most complex signaling network in prokaryotes. Chemotaxis enables cells to sense and respond to multiple external and internal stimuli by actively navigating to an optimal environment. MCP_CD is a central part of this circuit, but its coiled coil structure is difficult to analyze using traditional tools of computational biology. In this project, a new method for analysis of the domain was developed and used to gain insight into its function and evolution. Research advance 1: Characterization of the MCP_CD protein domain. Before this work, MCP_CD was known to have two distinct functional regions: the signaling region that activates the histidine kinase CheA and the methylation region where adaptation enzymes CheB and CheR store information about recent stimuli. The result of this project is classification of ~2000 MCP_CDs into twelve subfamilies. The unique mechanism of evolution of the domain has been clarified and precise boundaries of the adaptation and signaling regions determined. A new functional region, the flexible bundle subdomain, was identified and its contribution to the signaling mechanism elucidated by analysis of conserved sequence features. Conserved and variable sequence features in the adaptation and signaling subdomains led to a better understanding of the evolutionary history of the adaptation mechanism and of alternative higher-order arrangements of receptors within the membrane. Research advance 2: Development of a sensor / kinase correlation algorithm to couple diverse MCP_CD and kinase subfamilies. The receptor diversity discovered in this work is complemented by diversity in the kinases with which they interact. 
In this work, an algorithm was developed to associate receptor / kinase pairs which facilitated understanding of the function and evolution of chemotaxis. Research advance 3: Development of Cheops, a database of chemotaxis pathways. The Cheops (Chemotaxis operons) database presents the results of the sensor / kinase correlation algorithm and the information about receptor and kinase diversity in an integrated and intuitive way.
266

Analyzing biological expression data based on decision tree induction

Flöter, André January 2005
Modern biological analysis techniques supply scientists with various forms of data. One category of such data is the so-called "expression data". These data indicate the quantities of biochemical compounds present in tissue samples. Expression data can now be generated at high speed, which in turn leads to amounts of data that can no longer be analysed by classical statistical techniques. Systems biology is the new field that focuses on the modelling of this information. At present, various methods are used for this purpose. One superordinate class of these methods is machine learning. Methods of this kind had, until recently, predominantly been used for classification and prediction tasks. This neglected a powerful secondary benefit: the ability to induce interpretable models. Obtaining such models from data has become a key issue within systems biology. Numerous approaches have been proposed and intensively discussed. This thesis focuses on the examination and exploitation of one basic technique: decision trees. The concept of comparing sets of decision trees is developed. This method offers the possibility of identifying significant thresholds in continuous or discrete valued attributes through their corresponding set of decision trees. Finding significant thresholds in attributes is a means of identifying states in living organisms. Knowing about states is an invaluable clue to the understanding of dynamic processes in organisms. Applied to metabolite concentration data, the proposed method was able to identify states which were not found with conventional techniques for threshold extraction. A second approach exploits the structure of sets of decision trees for the discovery of combinatorial dependencies between attributes. Previous work on this issue has focused either on computationally expensive methods or on the interpretation of single decision trees, a very limited exploitation of the data. This has led to incomplete or unstable results. For this reason, a new method is developed that uses sets of decision trees to overcome these limitations. Both of the introduced methods are available as software tools. They can be applied consecutively or separately. In this way they make up a package of analytical tools that usefully supplements existing methods. By means of these tools, the newly introduced methods were able to confirm existing knowledge and to suggest interesting new relationships between metabolites in plant metabolism.
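As an illustration of the threshold idea mentioned in the abstract above, the following is a minimal sketch of how a single decision-tree node can choose a threshold on one continuous attribute by minimizing weighted Gini impurity. It is a generic textbook step, not the thesis's method of comparing sets of trees; the function names `gini` and `best_threshold` and the sample concentrations are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, inf) when all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_imp = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_t, best_imp = (lo + hi) / 2, imp
    return best_t, best_imp


# Hypothetical metabolite concentrations with two assumed states.
concentrations = [0.1, 0.4, 0.5, 2.0, 2.3, 3.1]
states = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_threshold(concentrations, states)
```

Collecting such thresholds over many trees, for example trees grown on resampled data, is one plausible way to see which cut points recur; whether that matches the thesis's procedure is an assumption.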
267

PSSMs: not just roadkill on the information superhighway

Ng, Pauline Crystal. January 2002
Thesis (Ph. D.)--University of Washington, 2002. / Vita. Includes bibliographical references (leaves 93-101).
268

Optimal Alignment of Multiple Sequence Alignments

Starrett, Dean January 2008
An essential tool in biology is the alignment of multiple sequences. Biologists use multiple sequence alignments for tasks such as predicting protein structure and function, reconstructing phylogenetic trees, and finding motifs. Constructing high-quality multiple alignments is computationally hard, both in theory and in practice, and is typically done using heuristic methods. The majority of state-of-the-art multiple alignment programs employ a form-and-polish strategy: in the construction phase, an initial multiple alignment is formed by progressively merging smaller alignments, starting with single sequences; then, in a local-search phase, the resulting alignment is polished by repeatedly splitting it into smaller alignments and re-merging them. This merging of alignments, the basic computational problem in the construction and local-search phases of the best multiple alignment heuristics, is called the Aligning Alignments Problem. Under the sum-of-pairs objective for scoring multiple alignments, this problem may seem to be a simple extension of two-sequence alignment. It is proven here, however, that with affine gap costs (which are recognized as necessary to get biologically informative alignments) the problem is NP-complete when gaps are counted exactly. Interestingly, this form of multiple alignment is polynomial-time solvable when we relax the exact count, showing that exact gap counts themselves are inherently hard in multiple sequence alignment. Unlike general multiple alignment, however, we show that Aligning Alignments with affine gap costs and exact counts is tractable in practice, by demonstrating an effective algorithm and a fast implementation. Our software AlignAlign is both time- and space-efficient on biological data. Computational experiments on biological data show that instances derived from standard benchmark suites can be optimally aligned with surprising efficiency, and experiments on simulated data show that both time and space scale well.
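To make the objective named in the abstract above concrete, here is a minimal sketch of evaluating the sum-of-pairs cost of a given multiple alignment with affine gap costs, where gaps are counted exactly in each induced pairwise alignment. It only scores an alignment; it is not the AlignAlign algorithm, and the cost parameters (`mismatch`, `gap_open`, `gap_extend`) and function names are illustrative assumptions.

```python
from itertools import combinations


def pairwise_cost(a, b, mismatch=1, gap_open=3, gap_extend=1):
    """Affine-gap cost of the pairwise alignment induced by two alignment rows.

    Columns where both rows hold '-' vanish in the induced alignment, so a gap
    run that spans such a column is still counted as a single gap.
    """
    cost = 0
    prev = None  # state of the previous surviving column
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # column disappears; keep the previous state
        if x == "-":
            state = "gap_a"
        elif y == "-":
            state = "gap_b"
        else:
            state = "sub"
        if state == "sub":
            cost += 0 if x == y else mismatch
        else:
            if state != prev:
                cost += gap_open  # a new gap starts here
            cost += gap_extend
        prev = state
    return cost


def sum_of_pairs(rows, **costs):
    """Sum of induced pairwise costs over all pairs of rows."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    return sum(pairwise_cost(a, b, **costs) for a, b in combinations(rows, 2))


alignment = ["AC-GT", "A--GT", "ACCGT"]
total = sum_of_pairs(alignment)
```

The skipped both-gap columns are what make the count exact: whether a gap in one pair is new or continued depends on columns contributed by other rows, which hints at why exact counting complicates merging two alignments.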
269

Genome-wide analysis of mutually exclusive splicing

Hatje, Klas 29 January 2013
No description available.
