Global ETD Search

1	Biological database indexing and its applications. January 2002 Cheung Ching Fung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 71-73). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Biological Sequences / Chapter 1.2 --- User Queries on Biological Sequences / Chapter 1.3 --- Research Contributions / Chapter 1.4 --- Organization of Thesis / Chapter 2 --- Background / Chapter 2.1 --- What is a Suffix-Tree? / Chapter 2.2 --- Disk-Based Suffix-Trees / Chapter 3 --- Disk-Based Suffix Tree Constructions / Chapter 3.1 --- An Existing Algorithm: PrePar-Suffix / Chapter 3.1.1 --- "Three Issues: Edge Splitting, Random Access and Data Skew" / Chapter 3.2 --- DynaCluster-Suffix: A New Novel Disk-Based Suffix-Tree Construction Algorithm / Chapter 4 --- Suffix Links Rebuilt / Chapter 4.1 --- Suffix-links and Least Common Ancestors / Chapter 5 --- q-Length Exact Sequence Matching / Chapter 5.1 --- q-Length Exact Sequence Matching by Suffix-Tree / Chapter 6 --- Implementation / Chapter 6.1 --- System Overview / Chapter 6.1.1 --- Index Builder / Chapter 6.1.2 --- Exact Query Processor / Chapter 6.1.3 --- Suffix Links Regenerator / Chapter 6.1.4 --- Tandem Repeats Finder / Chapter 6.2 --- Data Structures / Chapter 6.2.1 --- Representation of a Node / Chapter 6.2.2 --- An Alternative Node Representation / Chapter 6.2.3 --- Representation of a Leaf / Chapter 6.3 --- Buffering / Chapter 6.3.1 --- Page Format / Chapter 6.3.2 --- Address Translation / Chapter 6.3.3 --- Page Replacement Strategies / Chapter 7 --- A Performance Studies / Chapter 7.1 --- When Everything Can be Held In Memory / Chapter 7.2 --- When Main Memory Is Limited / Chapter 7.3 --- The Effectiveness of DNA Lengths with Fixed Memory Sizes / Chapter 7.4 --- The Effectiveness of Memory Sizes / Chapter 7.5 --- Answering q-Length Exact Sequence Matching Queries / Chapter 7.6 --- Suffix Link Rebuilt / Chapter 8 --- Conclusions and Future Works / Chapter 8.1 --- Conclusions / Chapter 8.2 --- Future Works / Bibliography. Biology--Databases. Database management. Indexing.
2	Modeling uncertainty in data integration for improving protein function assignment / Louie, Brenton E. January 2008 Thesis (Ph. D.)--University of Washington, 2008. / Vita. Includes bibliographical references (leaves 150-160).
3	Design, development, and deployment of a locus specific mutation database : the PAHdb example Nowacki, Piotr Marek. January 1998 Genetics is concerned with inheritance, genomics with the study of genomes. Bioinformatics provides the tools to study the interface between the two. If a particular locus in the human genome could have 100 discrete alleles, then the genome (comprising an estimated 80,000 genes) could harbor 8 million different alleles. To record information about each of these alleles in a meaningful and systematic fashion is a task for the Mutation Database domain of bioinformatics. The HUGO Mutation Database Initiative is an international effort to capture, record and distribute information about variation in genomes. This initiative comprises a growing number of Locus-Specific Mutation databases, and a few large Federated Genomic databases [Cotton et al., 1998]. / Here I present work on a well-recognized, prototypical Locus-Specific database: PAHdb. PAHdb is a relatively large curated relational database. / This graduate project has had two major aims: first, to improve PAHdb by careful analysis of version 1.0 and revision of its design, resulting in PAHdb version 2.0; second, to document the redesign process and share the experience by formulating guidelines for the content and structure of mutation databases in general. (Abstract shortened by UMI.) Mutation (Biology) -- Databases. Genetics -- Databases. Phenylketonuria. Metabolism, Inborn errors of.
4	Bioinformatic methods in protein characterization / Kallberg, Yvonne, January 2002 Diss. (summary) Stockholm : Karolinska Institutet, 2002. / With 5 accompanying papers.
5	Design, development, and deployment of a locus specific mutation database : the PAHdb example Nowacki, Piotr Marek. January 1998 No description available. Genetics -- Databases. Mutation (Biology) -- Databases. Phenylketonuria. Metabolism, Inborn errors of.
6	Supporting the collection and curation of biological observation metadata = Apoio à coleta e curadoria de metadados de observações biológicas Cugler, Daniel Cintra, 1982- 09 January 2014 Advisor: Claudia Maria Bauzer Medeiros / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação / Abstract: Biological observation databases contain information about the occurrence of an organism or set of organisms detected at a given place and time according to some methodology. Such databases store a variety of data, at multiple spatial and temporal scales, including images, maps, sounds, texts and so on.
This priceless information can be used in a wide range of research initiatives, e.g., global warming, species behavior or food production. All such studies are based on analyzing the records themselves and their metadata. Most often, analyses start from metadata, which are frequently used to index the observation records. However, given the nature of observation activities, metadata may suffer from quality problems, hampering such analyses. For example, there may be metadata gaps (e.g., missing attributes or insufficient records). This can have serious effects: in biodiversity studies, for instance, metadata problems regarding a single species can affect the understanding not just of the species, but of wider ecological interactions. This thesis proposes a set of processes to help solve problems in metadata quality. While previous approaches address only one aspect of the problem, the thesis provides an architecture and algorithms that encompass the whole cycle of managing biological observation metadata, from acquiring data to retrieving database records. Our contributions are divided into two categories: (a) data enrichment and (b) data cleaning. Contributions in category (a) provide additional information both for missing attributes in existing records and for missing records for specific requirements. Our strategies use authoritative remote data sources and VGI (Volunteered Geographic Information) to enrich such metadata, providing missing information. Contributions in category (b) detect anomalies in biological observation metadata by performing spatial analyses that contrast the location of the observations with authoritative geographic distribution maps.
Thus, the main contributions are: (i) an architecture to retrieve biological observation records, which derives missing attributes by using external data sources; (ii) a geographical approach for anomaly detection; and (iii) an approach for adaptive acquisition of VGI to fill out metadata gaps, using mobile devices and sensors. These contributions were validated by actual implementations, using as a case study the challenges presented by the management of biological observation metadata of the Fonoteca Neotropical Jacques Vielliard (FNJV), one of the top 10 animal sound collections in the world. / Doctorate / Computer Science / Doctor of Computer Science. Computer science. Databases. Data cleaning. Biology - Databases.
7	Data analysis and creation of epigenetics database Desai, Akshay A. 21 May 2014 Indiana University-Purdue University Indianapolis (IUPUI) / This thesis aims to create a pipeline for analyzing DNA methylation epigenetics data and a data model structured well enough to store the pipeline's analysis results. In addition to storing the results, the model is designed to hold information that will help researchers interpret the results in a meaningful epigenetic context. Current major epigenetics resources such as PubMeth, MethyCancer, MethDB and NCBI's Epigenomics database fail to provide a holistic view of epigenetics. They provide datasets produced by different analysis techniques, which raises an important issue of data integration. The resources also fail to include numerous factors defining the epigenetic nature of a gene. Some of the resources are also struggling to keep the data stored in their databases up to date. This has diminished their validity and coverage of epigenetics data. In this thesis we have tackled a major branch of epigenetics: DNA methylation. As a case study to demonstrate the effectiveness of our pipeline, we have used stage-wise DNA methylation and expression raw data for lung adenocarcinoma (LUAD) from the TCGA data repository. The pipeline helped us to identify progressive methylation patterns across different stages of LUAD. It also identified some key targets that have potential as drug targets. We combined the results from the methylation data analysis pipeline with data from various online data resources, such as the KEGG, GO, UCSC and BioGRID databases, which helped us to overcome the shortcomings of existing data collections and present a resource as a complete solution for studying DNA methylation epigenetics data.
database, epigenetics, data analysis. Epigenesis -- Databases. Adenocarcinoma -- Genetic aspects. Lungs -- Cancer -- Databases. Biological systems -- Analysis. Genomics -- Data processing. Browsers (Computer programs). Genomics -- Mathematical models. Computational biology -- Databases.
