  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Engineering truly automated data integration and translation systems

Warren, Robert H 10 December 2007 (has links)
This thesis presents an automated, data-driven integration process for relational databases. Whereas previous integration methods assumed extensive user involvement as well as the availability of database metadata, we make no use of metadata and require little end-user input. This is done using a novel join- and translation-finding algorithm that searches for the proper key/foreign-key relationships while inferring the instance transformations from one database to another. Because we rely only on the relations that bind the attributes together, we make no use of the database schema information. A novel searching method allows us to search the database for relevant objects without requiring server-side indexes or cooperative databases.
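The abstract above describes discovering key/foreign-key relationships purely from data values, without schema metadata. As a rough, hypothetical sketch of that general idea (not the author's algorithm), the snippet below proposes candidate join pairs between two SQLite tables by measuring value-set containment; the table names, threshold, and containment heuristic are illustrative assumptions.

```python
# Hypothetical sketch: rank candidate key/foreign-key pairs between two tables
# using only the data values, not the declared schema constraints.
import sqlite3
from itertools import product

def column_values(cur, table):
    """Map each column of `table` to its set of distinct non-NULL values."""
    cols = [row[1] for row in cur.execute(f"PRAGMA table_info({table})")]
    return {
        c: {row[0] for row in cur.execute(f"SELECT DISTINCT {c} FROM {table}") if row[0] is not None}
        for c in cols
    }

def candidate_joins(conn, table_a, table_b, min_overlap=0.8):
    """Rank column pairs whose value sets largely coincide (possible join keys)."""
    cur = conn.cursor()
    vals_a, vals_b = column_values(cur, table_a), column_values(cur, table_b)
    pairs = []
    for (ca, sa), (cb, sb) in product(vals_a.items(), vals_b.items()):
        if not sa or not sb:
            continue
        # Containment of the smaller value set in the larger one approaches 1.0
        # when one column references the other (key / foreign-key behaviour).
        overlap = len(sa & sb) / min(len(sa), len(sb))
        if overlap >= min_overlap:
            pairs.append((ca, cb, round(overlap, 3)))
    return sorted(pairs, key=lambda t: -t[2])

# Usage (assuming a local SQLite file with the two tables of interest):
# conn = sqlite3.connect("example.db"); print(candidate_joins(conn, "orders", "customers"))
```

Using the smaller value set as the denominator approximates an inclusion-dependency test, which is a common heuristic for join discovery when indexes and constraints are unavailable.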
2

A framework of services to provide a persistent data access service for the CORBA environment

Ball, Craig January 1999 (has links)
No description available.
3

The taxonomic name resolution service: an online tool for automated standardization of plant names

Boyle, Brad, Hopkins, Nicole, Lu, Zhenyuan, Raygoza Garay, Juan Antonio, Mozzherin, Dmitry, Rees, Tony, Matasci, Naim, Narro, Martha, Piel, William, Mckay, Sheldon, Lowry, Sonya, Freeland, Chris, Peet, Robert, Enquist, Brian January 2013 (has links)
BACKGROUND: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this 'names problem' has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.

RESULTS: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.

CONCLUSIONS: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
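The TNRS abstract above rests on name parsing and fuzzy matching against reference taxonomies. As a hypothetical sketch of that general technique only (not the TNRS codebase or its web API), the following matches a possibly misspelled plant name against a small reference list using edit-distance similarity from Python's standard library; the reference names and the cutoff value are illustrative assumptions.

```python
# Hypothetical sketch of fuzzy name matching (illustrative only; not TNRS code).
import difflib

# Illustrative reference taxonomy: a handful of accepted plant names.
REFERENCE_NAMES = [
    "Quercus alba",
    "Quercus rubra",
    "Acer saccharum",
    "Pinus strobus",
]

def resolve_name(raw_name, cutoff=0.8):
    """Return the best-matching accepted name and a similarity score, or None."""
    matches = difflib.get_close_matches(raw_name, REFERENCE_NAMES, n=1, cutoff=cutoff)
    if not matches:
        return None
    best = matches[0]
    score = difflib.SequenceMatcher(None, raw_name, best).ratio()
    return best, round(score, 3)

print(resolve_name("Quercus albaa"))      # matches 'Quercus alba' despite the typo
print(resolve_name("Unknownus plantus"))  # None: no sufficiently close match
```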
4

Knowledge Integration to Overcome Ontological Heterogeneity: Challenges from Financial Information Systems

Firat, Aykut, Madnick, Stuart E., Grosof, Benjamin (has links)
The shift towards global networking brings with it many opportunities and challenges. In this paper, we discuss key technologies in achieving global semantic interoperability among heterogeneous information systems, including both traditional and web data sources. In particular, we focus on the importance of this capability and technologies we have designed to overcome ontological heterogeneity, a common type of disparity in financial information systems. Our approach to representing and reasoning with ontological heterogeneities in data sources is an extension of the Context Interchange (COIN) framework, a mediator-based approach for achieving semantic interoperability among heterogeneous sources and receivers. We also analyze the issue of ontological heterogeneity in the context of source-selection, and offer a declarative solution that combines symbolic solvers and mixed integer programming techniques in a constraint logic-programming framework. Finally, we discuss how these techniques can be coupled with emerging Semantic Web related technologies and standards such as Web-Services, DAML+OIL, and RuleML, to offer scalable solutions for global semantic interoperability. We believe that the synergy of database integration and Semantic Web research can make significant contributions to the financial knowledge integration problem, which has implications in financial services, and many other e-business tasks. / Singapore-MIT Alliance (SMA)
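The abstract above concerns reconciling ontological heterogeneity between financial sources, where the same quantity is reported under different contexts (for example, different currencies and scale factors). As a hypothetical sketch of that general idea only (not the COIN framework's mediation machinery), the snippet below converts reported figures into a common receiver context; the source contexts and exchange rates are illustrative assumptions.

```python
# Hypothetical sketch of context-based value mediation (illustrative only;
# not the Context Interchange / COIN implementation).

# Each source reports "revenue" under its own context: currency and scale factor.
SOURCE_CONTEXTS = {
    "source_a": {"currency": "USD", "scale": 1_000_000},  # millions of USD
    "source_b": {"currency": "EUR", "scale": 1_000},      # thousands of EUR
}

# Illustrative exchange rates into the receiver's currency (USD).
FX_TO_USD = {"USD": 1.0, "EUR": 1.1}

def to_usd(value, source):
    """Convert a reported figure from the source's context into plain USD."""
    ctx = SOURCE_CONTEXTS[source]
    return value * ctx["scale"] * FX_TO_USD[ctx["currency"]]

# 2.5 (millions of USD) and 1800 (thousands of EUR) become directly comparable.
print(to_usd(2.5, "source_a"))   # 2500000.0
print(to_usd(1800, "source_b"))  # approximately 1980000.0
```

A mediator-based approach generalizes this by deriving such conversions automatically from declared source and receiver contexts rather than hard-coding them per query.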
5

Towards a new approach for enterprise integration : the semantic modeling approach

Radhakrishnan, Ranga Prasad 01 February 2005
Manufacturing today has become a matter of the effective and efficient application of information technology and knowledge engineering. A manufacturing firm's success depends to a great extent on information technology, which emphasizes the integration of the information systems used by a manufacturing enterprise. This integration is also called enterprise application integration (here the term application means information systems or software systems). The methodology for enterprise application integration, in particular its automation, has been studied for at least a decade; however, no satisfactory solution has been found. Enterprise application integration is becoming even more difficult due to the explosive growth of various information systems driven by ever-increasing competition in the software market. This thesis aims to provide a novel solution to enterprise application integration. The semantic data model concept that evolved in database technology is revisited and applied to enterprise application integration. This has led to two novel ideas developed in this thesis. First, an ontology of an enterprise with five levels (following the data abstraction: generalization/specialization) is proposed and represented using the Unified Modeling Language. Second, both the ontology for the enterprise functions and the ontology for the enterprise applications are modeled to allow automatic processing of information back and forth between these two domains. The approach with these novel ideas is called the enterprise semantic model approach. The thesis presents a detailed description of the enterprise semantic model approach, including the fundamental rationale behind the enterprise semantic model, the ontology of enterprises with levels, and a systematic way towards the construction of a particular enterprise semantic model for a company. A case study is provided to illustrate how the approach works and to show the high potential of solving the existing problems within enterprise application integration.
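The abstract above proposes mapping between an ontology of enterprise functions and an ontology of enterprise applications so that information can be processed automatically between the two domains. As a hypothetical sketch of that general structure only (not the thesis' enterprise semantic model or its five-level ontology), the snippet below resolves a function-level concept to an application-level data location through an explicit cross-ontology mapping; all concept and system names are illustrative assumptions.

```python
# Hypothetical sketch (illustrative only): two tiny ontologies and a mapping
# that lets a request phrased in enterprise-function terms be answered by an
# application-level data item.

# Enterprise-function ontology: concept -> parent (generalization hierarchy).
FUNCTION_ONTOLOGY = {
    "CustomerOrder": "BusinessTransaction",
    "BusinessTransaction": "EnterpriseActivity",
}

# Application ontology: concept -> (application system, table) that stores it.
APPLICATION_ONTOLOGY = {
    "SalesOrderRecord": ("ERP_System", "sales_orders"),
}

# Cross-ontology mapping between the two domains.
FUNCTION_TO_APPLICATION = {
    "CustomerOrder": "SalesOrderRecord",
}

def locate(function_concept):
    """Resolve a function-level concept to (application, table), if mapped."""
    app_concept = FUNCTION_TO_APPLICATION.get(function_concept)
    if app_concept is None:
        return None
    return APPLICATION_ONTOLOGY[app_concept]

print(locate("CustomerOrder"))  # ('ERP_System', 'sales_orders')
```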
6

Biological and clinical data integration and its applications in healthcare

Hagen, Matthew 07 January 2016 (has links)
Answers to the most complex biological questions are rarely determined solely from the experimental evidence; they require subsequent analysis of many data sources that are often heterogeneous. Most biological data repositories focus on providing only one particular type of data, such as sequences, molecular interactions, protein structure, or gene expression. In many cases, researchers must visit several different databases to answer one scientific question. It is essential to develop efficient and seamless strategies for integrating disparate biological data sources to facilitate the discovery of novel associations and to validate existing hypotheses. This thesis presents the design and development of different integration strategies for biological and clinical systems. The BioSPIDA system is a data warehousing solution that integrates many NCBI databases and other biological sources on protein sequences, protein domains, and biological pathways. It utilizes a universal parser, facilitating integration without developing separate source code for each data site. This enables users to execute fine-grained queries that can filter genes by their protein interactions, gene expression, functional annotation, and protein domain representation. Relational databases can quickly return filtered results to research questions, but they are not the most suitable solution in all cases. Clinical patients and genes are typically annotated with concepts from hierarchical ontologies, and the performance of relational databases weakens considerably when traversing and representing graph structures. This thesis illustrates when relational databases are most suitable and compares performance benchmarks of semantic web technologies and graph databases when comparing ontological concepts. Several approaches to analyzing integrated data are discussed to demonstrate the advantages over dependence on remote data centers. Intensive care patients are prioritized by their length of stay, and their severity class is estimated from their diagnoses to help minimize wait time and preferentially treat patients by their condition. In a separate study, semantic clustering of patients is conducted by integrating a clinical database and a medical ontology to help identify multi-morbidity patterns. In the biological area, gene pathways, protein interaction networks, and functional annotation are integrated to help predict and prioritize candidate disease genes. The thesis presents the results generated from each project using a local repository of genes, functional annotations, protein interactions, clinical patients, and medical ontologies.
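The abstract above mentions prioritizing intensive care patients by length of stay and an estimated severity class. As a hypothetical sketch of that general idea only (not the thesis' model), the snippet below orders patients with a max-priority scheme; the severity classes, weighting, and example data are illustrative assumptions.

```python
# Hypothetical sketch (illustrative only): order ICU patients so that higher
# estimated severity and longer expected stay are treated first.
import heapq

SEVERITY_RANK = {"critical": 3, "severe": 2, "moderate": 1}

def prioritize(patients):
    """patients: list of dicts with 'id', 'severity', 'expected_los_days'.
    Returns patient ids from highest to lowest priority."""
    heap = []
    for p in patients:
        # Negative score because heapq implements a min-heap.
        score = SEVERITY_RANK[p["severity"]] * 10 + p["expected_los_days"]
        heapq.heappush(heap, (-score, p["id"]))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

print(prioritize([
    {"id": "p1", "severity": "moderate", "expected_los_days": 2},
    {"id": "p2", "severity": "critical", "expected_los_days": 5},
    {"id": "p3", "severity": "severe",   "expected_los_days": 9},
]))  # ['p2', 'p3', 'p1']
```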
7

UMA ABORDAGEM BASEADA NA ENGENHARIA DIRIGIDA POR MODELOS PARA SUPORTAR MERGING DE BASE DE DADOS HETEROGÊNEAS / An Approach Based on Model-Driven Engineering to Support the Merging of Heterogeneous Databases

CARVALHO, Marcus Vinícius Ribeiro de 24 February 2014 (has links)
Model Driven Engineering (MDE) aims to address the development, maintenance and evolution of complex software systems by focusing on models and model transformations. This approach can also be applied to other domains, such as database schema integration. In this research work, we propose a framework to integrate database schemas in the MDE context. Metamodels for defining the database model, database model matching, database model merging, and the integrated database model are proposed in order to support our framework. An algorithm for database model matching and an algorithm for database model merging are presented. We also present a prototype that extends the MT4MDE and SAMT4MDE tools in order to demonstrate the implementation of our proposed framework, methodology, and algorithms. An illustrative example helps in understanding the proposed framework, and a brief evaluation of the framework and future directions for this work are also presented.
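The abstract above centers on database model matching followed by merging. As a hypothetical sketch of that general idea only (not the thesis' MDE metamodels or algorithms), the snippet below matches columns of two schemas by name similarity and merges them so matched columns appear once; the schemas, column names, and cutoff are illustrative assumptions.

```python
# Hypothetical sketch (illustrative only): simple name-based schema matching
# and merging for one table present in two source databases.
import difflib

schema_a = {"customer": ["id", "full_name", "birth_date"]}
schema_b = {"customer": ["id", "name", "birthdate", "email"]}

def match_columns(cols_a, cols_b, cutoff=0.6):
    """Return {col_a: col_b} for column names considered equivalent."""
    mapping = {}
    for ca in cols_a:
        best = difflib.get_close_matches(ca, cols_b, n=1, cutoff=cutoff)
        if best:
            mapping[ca] = best[0]
    return mapping

def merge(cols_a, cols_b):
    """Union of both column lists, with matched pairs represented once."""
    mapping = match_columns(cols_a, cols_b)
    merged = list(cols_a) + [cb for cb in cols_b if cb not in mapping.values()]
    return mapping, merged

mapping, merged = merge(schema_a["customer"], schema_b["customer"])
print(mapping)  # {'id': 'id', 'full_name': 'name', 'birth_date': 'birthdate'}
print(merged)   # ['id', 'full_name', 'birth_date', 'email']
```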
8

MIDB : um modelo de integração de dados biológicos / MIDB: a biological data integration model

Perlin, Caroline Beatriz 29 February 2012 (has links)
In bioinformatics, a huge volume of data is being produced, related to biomolecules and to nucleotide and amino acid sequences, and it resides almost entirely in several Biological Data Bases (BDBs). For a given sequence there are several informational classifications: genomic data, evolutionary data, structural data, and others. Some BDBs store only one or a few of these classifications. These BDBs are hosted on different sites and servers, use distinct database management systems and data models, and their instances and schemas may exhibit semantic heterogeneity. In this context, the objective of this project is to propose a biological data integration model with new schema integration and instance integration techniques. The proposed integration model has a special mechanism for schema integration and another mechanism that performs instance integration (supported by a dictionary), allowing conflict resolution over attribute values; a clustering algorithm is used to group similar entities, and a domain specialist participates by managing those clusters. The model was validated through a case study focusing on schema and instance integration of nucleotide sequence data from organisms of the genus Actinomyces, captured from four different data sources. As a result, about 97.91% of the attributes were correctly categorized during schema integration, and instance integration identified that about 50% of the generated clusters need treatment by a specialist, avoiding errors in entity resolution. In addition, several contributions are presented, such as the attribute categorization, the clustering algorithm, the proposed distance functions, and the MIDB model itself.
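The abstract above describes clustering similar entities with distance functions and flagging clusters that need specialist review. As a hypothetical sketch of that general idea only (not the MIDB clustering algorithm or its distance functions), the snippet below groups sequence records by a simple organism-name distance and flags clusters containing conflicting values; the records, distance function, and threshold are illustrative assumptions.

```python
# Hypothetical sketch (illustrative only): greedy clustering of similar entity
# records, flagging mixed clusters for review by a domain specialist.
import difflib

records = [
    {"organism": "Actinomyces naeslundii", "gene": "gyrB"},
    {"organism": "Actinomyces naeslundi",  "gene": "gyrB"},   # misspelled variant
    {"organism": "Actinomyces israelii",   "gene": "gyrB"},
]

def distance(a, b):
    """Distance in [0, 1]; 0 means identical organism names."""
    return 1.0 - difflib.SequenceMatcher(None, a["organism"], b["organism"]).ratio()

def cluster(items, threshold=0.1):
    """Greedy single-pass clustering: join an item to the first close cluster."""
    clusters = []
    for item in items:
        for c in clusters:
            if distance(item, c[0]) <= threshold:
                c.append(item)
                break
        else:
            clusters.append([item])
    return clusters

for c in cluster(records):
    needs_review = len({r["organism"] for r in c}) > 1  # conflicting values present
    print([r["organism"] for r in c], "-> specialist review" if needs_review else "")
```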
