551

Characterisation of a mouse gene-phenotype network

Espinosa, Octavio January 2011 (has links)
Following advancements in the "omics" fields of molecular biology and genetics, much attention has been focused on categorising and annotating the large volume of data produced since the sequencing of the human and model genomes. With high-throughput data generated from these "omics" experiments and the increasing deposition of information from genetics experiments in biological databases, the mechanisms that bridge the gap from genotype to phenotype can be explored in a holistic context. This is one of the goals of the relatively new field of systems biology, which seeks to understand the complexity of biological systems by studying them as ensembles of interacting parts. With the increased volume and comprehensiveness of biological data, prediction of gene function and automatic identification of potential models for human diseases have become important aspects of systems-level analysis for wet-lab geneticists and clinicians. Here, I describe an integrated analysis of mouse phenotype data together with high-throughput experiments that gives genome-wide information about gene relationships and their function in a systems biology context. I present a functional dissection of mouse gene and phenotype networks and investigate the potential that ontology-compliant phenotype annotations offer for the functional classification of genes. The mouse genome and phenome show modularity at higher levels of cellular, physiological and organismal function. Using high-throughput protein-protein interaction data, the mouse proteome was dissected and the computationally extracted communities were used to predict the phenotypes of mouse gene ablation. Precision and recall curves show performance at higher levels of the Mammalian Phenotype (MP) ontology comparable to that of comprehensive mouse gene function prediction efforts such as the Mouse Function Project, which predicted Gene Ontology terms. I also developed and tested an automatic procedure that relates mouse phenotypes to human diseases and demonstrated its application to two use cases: identifying mouse models given a query consisting of a set of mouse phenotypes, and breaking down human diseases into mouse phenotypes. Taken together, my results may be useful as a map for candidate gene discovery, for relating mouse networks to human networks, and for investigating the evolutionary origins of their components at higher levels of gene function.
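
The abstract above does not include code, but the core step it describes, extracting communities from a protein-protein interaction network and transferring phenotype annotations within them, can be illustrated with a minimal sketch. This is not the thesis pipeline: the graph, gene names and phenotype labels below are invented, and networkx's greedy modularity method stands in for whatever community-extraction algorithm was actually used.

```python
# Minimal sketch (not the thesis pipeline): extract communities from a toy
# protein-protein interaction graph and predict a knockout phenotype for an
# unannotated gene by majority vote within its community.
# Requires networkx >= 2.x. All gene names and phenotype labels are made up.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities
from collections import Counter

# Hypothetical PPI edges: two tight modules joined by one weak bridge
ppi = nx.Graph([
    ("GeneA", "GeneB"), ("GeneB", "GeneC"), ("GeneA", "GeneC"),
    ("GeneD", "GeneE"), ("GeneE", "GeneF"), ("GeneD", "GeneF"),
    ("GeneC", "GeneD"),
])

# Known knockout phenotypes (MP-style terms), deliberately incomplete
known = {
    "GeneA": "abnormal heart morphology",
    "GeneB": "abnormal heart morphology",
    "GeneE": "abnormal immune response",
    "GeneF": "abnormal immune response",
}

for community in greedy_modularity_communities(ppi):
    votes = Counter(known[g] for g in community if g in known)
    if not votes:
        continue
    predicted, _ = votes.most_common(1)[0]
    for gene in community:
        if gene not in known:
            print(f"{gene}: predicted phenotype '{predicted}'")
```

Precision and recall curves of the kind mentioned in the abstract would then be obtained by scoring such predictions against held-out annotations.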
552

BioInformatics, Phylogenetics, and Aspartate Transcarbamoylase

Cooke, Patrick Alan 08 1900 (has links)
In this research, the necessity of understanding and using bioinformatics is demonstrated using the enzyme aspartate transcarbamoylase (ATCase) as the model enzyme. The first portion of this research focuses on the use of bioinformatics. A partial sequence of the pyrB gene found in Enterococcus faecalis was submitted to GenBank and was analyzed against the contiguous sequence from its own genome project. A BLAST (Basic Local Alignment Search Tool; Altschul et al., 1990) search was performed in order to hypothesize the remaining portion of the gene from the contiguous sequence. This allowed a global comparison to other known aspartate transcarbamoylases (ATCases) and, once deduced, a translation of the sequence gave the stop codon and thus the complete sequence of the open reading frame. When this was complete, upstream and downstream primers were designed in order to amplify the gene from genomic DNA. The amplified product was then sequenced and used later in phylogenetic analyses concerning the evolution of ATCase. The second portion of this research involves taking multiple ATCase nucleotide sequences and performing phenetic and phylogenetic analyses of archaeal and eubacterial families. From these analyses, ancestral relationships that dictate both structure and function were extrapolated from the data and discussed.
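
As a rough illustration of the workflow described (BLAST a partial pyrB fragment, then translate the completed reading frame to locate the stop codon), the following Biopython sketch shows the general shape of the steps. The sequence fragments and the choice of the remote nt database are placeholders, not material from the thesis.

```python
# Rough sketch of the described workflow using Biopython (not the original scripts).
# The pyrB fragments below are placeholders, not the actual E. faecalis sequence.
from Bio.Blast import NCBIWWW, NCBIXML
from Bio.Seq import Seq

partial_pyrB = "ATGACAAAACATTTAATTTCAATGGCTGATTTAAGTCGT"  # placeholder fragment

# Remote BLAST of the fragment against the NCBI nucleotide database
result_handle = NCBIWWW.qblast("blastn", "nt", partial_pyrB)
record = NCBIXML.read(result_handle)
for alignment in record.alignments[:5]:
    print(alignment.title, alignment.hsps[0].expect)

# Once the full open reading frame has been deduced from the contig,
# translating it reveals the stop codon and confirms the ORF boundaries.
orf = Seq("ATGACAAAACATTTAATTTCAATGGCTGATTTAAGTCGTTAA")  # placeholder ORF
print(orf.translate())  # trailing '*' marks the stop codon
```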
553

Sequence Extension of the Tryptophan and Shikimate Operons in Clostridium Scatologenes ATCC 25775

Smiley, Shawn Johnston 01 October 2017 (has links)
3-Methylindole and 4-methylphenol are cytotoxic and malodorous compounds derived from tryptophan and tyrosine, respectively. Each is present in swine waste lagoons and contributes to malodorous emissions from agricultural facilities. Clostridium scatologenes ATCC 25775 produces both compounds and serves as a model organism for studying their metabolism and function. Through repeated assembly and annotation of the Clostridium scatologenes genome, we propose a novel pathway for tryptophan degradation and 3-methylindole production by this organism. The genome of Clostridium scatologenes was sequenced and re-assembled into contigs. Key elements of the tryptophan and shikimate pathways were identified. Contigs containing these elements were extracted from the assemblies and matched to the reference genome of Clostridium carboxidivorans. The sequence of both pathways was then extended and defined using these joined sequence fragments. This sequence could serve as a starting point for the isolation of genes related to 3-methylindole synthesis using biochemical and enzyme analyses.
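
The contig-to-reference matching step could be sketched roughly as below, aligning assembled contigs against the Clostridium carboxidivorans reference with command-line BLAST+ and keeping the best hit per contig. The file names and chosen output fields are assumptions rather than the thesis's actual setup.

```python
# Sketch of the contig-to-reference matching step using BLAST+ (blastn) from Python.
# File names are assumptions; requires the NCBI BLAST+ binaries on PATH.
import subprocess
import csv

cmd = [
    "blastn",
    "-query", "scatologenes_contigs.fasta",      # assembled contigs (assumed name)
    "-subject", "c_carboxidivorans_ref.fasta",   # reference genome (assumed name)
    "-outfmt", "6 qseqid sseqid pident length sstart send evalue",
]
out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Keep the best hit per contig so neighbouring contigs can be ordered and
# joined along the reference to extend the tryptophan/shikimate loci.
best = {}
for row in csv.reader(out.splitlines(), delimiter="\t"):
    qseqid, sseqid, pident, length, sstart, send, evalue = row
    if qseqid not in best or float(evalue) < float(best[qseqid][-1]):
        best[qseqid] = row
for contig, hit in sorted(best.items()):
    print(contig, "->", hit[1], f"{hit[2]}% identity over {hit[3]} bp")
```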
554

Exploring Strategies to Integrate Disparate Bioinformatics Datasets

Fakhry, Charbel Bader 01 January 2019 (has links)
Distinct bioinformatics datasets make it challenging for bioinformatics specialists to locate the required datasets and unify their formats for result extraction. The purpose of this single case study was to explore strategies to integrate distinct bioinformatics datasets. The technology acceptance model was used as the conceptual framework to understand the perceived usefulness and ease of use of integrating bioinformatics datasets. The population of this study comprised bioinformatics specialists at a research institution in Lebanon that has strategies to integrate distinct bioinformatics datasets. The data collection process included interviews with 6 bioinformatics specialists and a review of 27 organizational documents relating to integrating bioinformatics datasets. Thematic analysis was used to identify codes and themes related to integrating distinct bioinformatics datasets. Key themes resulting from the data analysis included a focus on integrating bioinformatics datasets, adding metadata to submitted bioinformatics datasets, a centralized bioinformatics database, resources, and bioinformatics tools. In analyzing the findings of this study, I showed that specialists who promote standardizing techniques, adding metadata, and centralization may increase efficiency in integrating distinct bioinformatics datasets. Bioinformaticians, bioinformatics providers, the health-care field, and society might benefit from this research. Improvements in bioinformatics positively affect the health-care field, which constitutes positive social change. The results of this study might also lead to positive social change in research institutions, such as reduced workload, less frustration, reduced costs, and increased efficiency when integrating distinct bioinformatics datasets.
555

Formalizing life : Towards an improved understanding of the sequence-structure relationship in alpha-helical transmembrane proteins

Viklund, Håkan January 2007 (has links)
Genes coding for alpha-helical transmembrane proteins constitute roughly 25% of the total number of genes in a typical organism. As these proteins are vital parts of many biological processes, an improved understanding of them is important for achieving a better understanding of the mechanisms that constitute life.

All proteins consist of an amino acid sequence that folds into a three-dimensional structure in order to perform its biological function. The work presented in this thesis is directed towards improving the understanding of the relationship between sequence and structure for alpha-helical transmembrane proteins. Specifically, five original methods for predicting the topology of alpha-helical transmembrane proteins have been developed: PRO-TMHMM, PRODIV-TMHMM, OCTOPUS, Toppred III and SCAMPI.

A general conclusion from these studies is that approaches that use multiple sequence information achieve the best prediction accuracy. Further, the properties of reentrant regions have been studied, both with respect to sequence and structure. One result of this study is an improved definition of the topological grammar of transmembrane proteins, which is used in OCTOPUS and shown to further improve topology prediction. Finally, Z-coordinates, an alternative representation of topological information for transmembrane proteins based on the distance to the membrane center, have been introduced, and a method for predicting Z-coordinates from amino acid sequence, Z-PRED, has been developed.
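
The methods named above are HMM- and statistics-based, but the underlying idea of reading topology from sequence can be illustrated with a much simpler sliding-window hydrophobicity scan of the kind early predictors such as TopPred build on. The window size, threshold and example sequence below are illustrative only.

```python
# Toy illustration of membrane-helix detection from sequence using a
# Kyte-Doolittle sliding-window hydrophobicity scan. The thesis methods
# (PRO-TMHMM, OCTOPUS, SCAMPI, ...) are far more sophisticated; the window
# size, threshold and sequence here are illustrative only.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def tm_segments(seq, window=19, threshold=1.6):
    """Return (start, end) index pairs of windows whose mean hydrophobicity
    exceeds the threshold, merging overlapping windows."""
    scores = [sum(KD[a] for a in seq[i:i + window]) / window
              for i in range(len(seq) - window + 1)]
    segments = []
    for i, score in enumerate(scores):
        if score >= threshold:
            if segments and i <= segments[-1][1]:
                segments[-1] = (segments[-1][0], i + window)
            else:
                segments.append((i, i + window))
    return segments

# Hypothetical sequence: a hydrophobic stretch flanked by polar residues
example = "MKKRSDE" + "LLIVALFAILGVVLLIAGF" + "RKDEQNST"
print(tm_segments(example))
```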
556

A Lexicon for Gene Normalization / Ett lexicon för gennormalisering

Lingemark, Maria January 2009 (has links)
Researchers tend to use their own or favourite gene names in scientific literature, even though official names exist. Some names may even be used for more than one gene. This leads to problems with ambiguity when automatically mining biological literature. To disambiguate the gene names, gene normalization is used. In this thesis, we look into an existing gene normalization system and develop a new method to find gene candidates for the ambiguous genes. For the new method, a lexicon is created using information about gene names, symbols and synonyms from three different databases. The gene mention found in the scientific literature is used as input for a search in this lexicon, and all genes in the lexicon that match the mention are returned as gene candidates for that mention. These candidates are then used in the system's disambiguation step. Results show that the new method gives a better overall result from the system, with an increase in precision and a small decrease in recall.
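
The lexicon-lookup step described here maps naturally onto a small dictionary-based sketch. The synonym records and identifiers below are made up, standing in for the three real databases used in the thesis.

```python
# Minimal sketch of the lexicon-lookup step: map a gene mention from text to
# candidate gene identifiers via names, symbols and synonyms. The records and
# identifiers below are invented.
from collections import defaultdict

records = [
    ("GENE:0001", ["TP53", "p53", "tumor protein p53"]),
    ("GENE:0002", ["TP63", "p63", "tumor protein p63"]),
    ("GENE:0003", ["P53AIP1", "p53-regulated apoptosis-inducing protein 1"]),
]

def normalize(name):
    """Case-fold and strip punctuation/whitespace so surface variants collide."""
    return "".join(c for c in name.lower() if c.isalnum())

lexicon = defaultdict(set)
for gene_id, names in records:
    for name in names:
        lexicon[normalize(name)].add(gene_id)

def candidates(mention):
    """Return all gene identifiers whose names match the mention."""
    return lexicon.get(normalize(mention), set())

print(candidates("P53"))                 # -> {'GENE:0001'}
print(candidates("Tumor Protein p63"))   # surface variant -> {'GENE:0002'}
```

In the full system, such candidate sets feed the subsequent disambiguation step described in the abstract.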
557

Evaluation and Development of Methods for Identification of Biochemical Networks / Evaluering och utveckling av metoder för identifiering av biokemiska nätverk

Jauhiainen, Alexandra January 2005 (has links)
Systems biology is an area concerned with understanding biology on a systems level, where the structure and dynamics of the system are in focus. Knowledge about the structure and dynamics of biological systems provides fundamental information about cells and the interactions within them, and also plays an increasingly important role in medical applications.

System identification deals with the problem of constructing a model of a system from data, and an extensive theory exists, particularly for the identification of linear systems.

This is a master's thesis in systems biology treating the identification of biochemical systems. Methods based on both local parameter perturbation data and time series data have been tested and evaluated in silico.

The advantage of the local parameter perturbation data methods proved to be that they demand less complex data, but the drawbacks are the reduced information content of this data and sensitivity to noise. Methods employing time series data are generally more robust to noise, but the lack of available data limits their use.

The work was conducted at the Fraunhofer-Chalmers Research Centre for Industrial Mathematics in Göteborg and at the Division of Computational Biology at the Department of Physics and Measurement Technology, Biology, and Chemistry at Linköping University during the autumn of 2004.
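
As a minimal illustration of the local parameter perturbation approach mentioned above: for a linearised system at steady state, the interaction matrix can be recovered from the steady-state responses to small perturbations by solving a linear system. The three-node network below is invented and noise-free, which is exactly the idealisation that hides the noise sensitivity noted in the abstract.

```python
# Toy sketch of network identification from local parameter perturbation data.
# For a linear system dx/dt = A x + b, the steady state is x* = -A^{-1} b, so
# the response to perturbations of b satisfies  dX = -A^{-1} dB  and the
# connectivity matrix can be recovered as  A ~ -dB @ inv(dX).
# The 3-node network is hypothetical and noise-free.
import numpy as np

A_true = np.array([[-1.0, 0.5, 0.0],
                   [0.0, -1.5, 0.8],
                   [0.3, 0.0, -2.0]])   # hypothetical interaction matrix
b0 = np.array([1.0, 1.0, 1.0])

def steady_state(b):
    return np.linalg.solve(-A_true, b)   # x* such that A x* + b = 0

x0 = steady_state(b0)

# Perturb each basal production rate in turn and record the response
eps = 1e-3
dB = eps * np.eye(3)
dX = np.zeros((3, 3))
for j in range(3):
    dX[:, j] = steady_state(b0 + dB[:, j]) - x0

A_est = -dB @ np.linalg.inv(dX)
print(np.round(A_est, 3))   # close to A_true for noise-free data
```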
559

Characterization of protein families, sequence patterns, and functional annotations in large data sets

Bresell, Anders January 2008 (has links)
Bioinformatics involves storing, analyzing and making predictions on massive amounts of protein and nucleotide sequence data. This thesis consists of six papers and is focused on proteins. It describes the use of bioinformatics techniques to characterize protein families and to detect patterns in gene expression and in polypeptide occurrences.

Two protein families were characterized bioinformatically: the membrane-associated proteins in eicosanoid and glutathione metabolism (MAPEG) and the Tripartite motif (TRIM) protein families. In the study of the MAPEG super-family, the application of different bioinformatic methods made it possible to characterize many new members, leading to a doubling of the family size. Furthermore, the MAPEG members were subdivided into families. Remarkably, in six families with previously predominantly mammalian members, fish representatives were now also detected, which dated the origin of these families back to the Cambrian "species explosion", thus earlier than previously anticipated. Sequence comparisons made it possible to define diagnostic sequence patterns that can be used in genome annotations. Upon publication of several MAPEG structures, these patterns were confirmed to be part of the active sites.

In the TRIM study, the bioinformatic analyses made it possible to subdivide the proteins into three subtypes and to characterize a large number of members. In addition, the analyses showed crucial structural dependencies between the RING and B-box domains of the TRIM member Ro52. The linker region between the two domains, denoted RBL, is known to be disease associated. An amphipathic helix was found to be a characteristic feature of the RBL region, and this feature was also used to divide the family into three subtypes.

The ontology annotation treebrowser (OAT) tool was developed to detect functional similarities or common concepts in long lists of proteins or genes, typically generated from proteomics or microarray experiments. OAT was the first annotation browser to include both Gene Ontology (GO) and Medical Subject Headings (MeSH) in the same framework, and the complementarity of these two ontologies was demonstrated. OAT was used in the TRIM study to detect differences in functional annotations between the subtypes.

In the oligopeptide study, we investigated pentapeptide patterns that were over- or under-represented in the current de facto standard database of protein knowledge and in a set of completed genomes, compared with what could be expected from amino acid compositions. We found three predominant categories of patterns: (i) patterns originating from frequently occurring families, e.g. respiratory chain-associated proteins and translation machinery proteins; (ii) proteins with structurally and/or functionally favored patterns; (iii) multicopy species-specific retrotransposons, found only in the genome set. Such patterns may influence amino acid residue-based prediction algorithms. The findings of the oligopeptide study were used to develop a new method that detects translated introns in unverified protein predictions, which are available in great numbers due to the many completed and ongoing genome projects.

A new comprehensive database of protein sequences from completed genomes, denoted genomeLKPG, was developed. This database was of central importance in the MAPEG, TRIM and oligopeptide studies, and has also proven useful in several other studies.
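
The oligopeptide analysis can be outlined as comparing observed pentapeptide counts with the counts expected from amino acid composition alone. The sketch below does this for two placeholder sequences; it is not the study's actual procedure or data.

```python
# Sketch of the oligopeptide analysis: compare observed pentapeptide counts in
# a protein set with the counts expected from amino acid composition alone
# (expected frequency of a pentapeptide = product of its residue frequencies).
# The two sequences below are placeholders, not data from the study.
from collections import Counter

proteins = [
    "MKAILVVLLYTFATANADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDKHNGKLCK",
    "MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGIL",
]
k = 5

aa_counts = Counter(a for seq in proteins for a in seq)
total_aa = sum(aa_counts.values())
aa_freq = {a: c / total_aa for a, c in aa_counts.items()}

pep_counts = Counter(seq[i:i + k] for seq in proteins
                     for i in range(len(seq) - k + 1))
n_windows = sum(len(seq) - k + 1 for seq in proteins)

def enrichment(pep):
    """Observed/expected ratio; >1 means over-represented, <1 under-represented."""
    expected = n_windows
    for a in pep:
        expected *= aa_freq.get(a, 0.0)
    return pep_counts[pep] / expected if expected > 0 else float("inf")

for pep, n in pep_counts.most_common(3):
    print(pep, n, round(enrichment(pep), 1))
```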
