41

Conception d'une ontologie hybride à partir d'ontologies métier évolutives : intégration et alignement d'ontologies / Designing a hybrid ontology from evolving business ontologies: ontology integration and alignment

Ziani, Mina 06 December 2012 (has links)
This thesis concerns knowledge management using ontological models. To represent domain knowledge, we designed a two-level hybrid ontology: at the local level, each group of experts (from the same specialty) built its own ontology; at the global level, a consensual ontology gathering the shared knowledge is created automatically. In addition, semantic links can be added between elements of different local ontologies. We built a computer-aided system to guide experts through the creation of these semantic links, or mappings. Its distinctive features are that it proposes similarity measures suited to the characteristics of the ontologies to be aligned, reuses previously computed results, and checks the consistency of the created mappings. Local ontologies can also be updated, which entails changes to the global ontology and to the existing mappings. We therefore developed an approach, adapted to our domain, for managing the evolution of the hybrid ontology. In particular, we used ontology versioning to keep track of all modifications made to the ontologies and to allow reverting to a previous version at any time. We applied this work to geotechnics, a complex domain involving experts from different specialties. A software platform is under development and will allow us to test the feasibility of the approach.
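The two-level construction described above can be sketched in a few lines (an illustrative assumption, not the thesis implementation; the group names and the "shared by at least two groups" criterion are invented for the example):

```python
from collections import Counter

def build_global_ontology(local_ontologies):
    """Gather 'shared knowledge': concepts used by at least two expert groups.

    local_ontologies: dict mapping a group name to its set of concept labels.
    """
    counts = Counter()
    for concepts in local_ontologies.values():
        counts.update(concepts)
    return {concept for concept, n in counts.items() if n >= 2}

# Hypothetical local ontologies from two geotechnical specialties.
local_ontologies = {
    "geologists": {"Soil", "Rock", "Fault", "Borehole"},
    "engineers": {"Soil", "Foundation", "Borehole", "Load"},
}
global_ontology = build_global_ontology(local_ontologies)
# global_ontology == {"Soil", "Borehole"}
```

A real consensual ontology would of course also reconcile differently named but equivalent concepts; that is exactly where the mapping support described above comes in.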
42

ROMIE, une approche d'alignement d'ontologies à base d'instances / ROMIE, Resource based Ontology Mapping within an Interactive and Extensible environment

Elbyed, Abdeltif 16 October 2009 (has links)
Semantic interoperability is an important issue, widely recognised in information-technology organisations and in the information-systems research community. The wide adoption of the Web for accessing distributed information further stresses the need for interoperability among the systems that manage this information. Initiatives such as the Semantic Web facilitate the localisation and integration of data in a more intelligent way through the use of ontologies, offering a more semantic and comprehensible vision of the Web, yet raising a number of research challenges. One of the key challenges is to compare and align the different ontologies that arise in integration tasks. The main goal of this thesis is to propose an alignment approach for identifying correspondences between ontologies. The approach combines linguistic, syntactic, structural and semantic (instance-based) matching techniques and consists of two main phases: a semantic-enrichment phase for the ontologies to be compared, and an alignment (mapping) phase. The enrichment phase analyses the information that the ontologies develop (web resources, data, documents, etc.) and that is associated with the ontology concepts; our intuition is that this information, together with the relations that may exist between its items, contributes to the semantic enrichment of the concepts. After enrichment, an ontology contains additional semantic relations between its concepts, which are exploited in the second phase. The mapping phase takes two enriched ontologies and computes the similarity between pairs of concepts; a filtering process then automatically reduces the number of false correspondences. Validation of the correspondences is an interactive process, either direct (with an expert) or indirect (by measuring the user's level of satisfaction). The approach has been implemented in a mapping system called ROMIE (Resource based Ontology Mapping within an Interactive and Extensible environment), which was experimented with and evaluated in two applications: a biomedical application and an application in technology-enhanced learning (e-learning).
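An instance-based scoring-and-filtering step of the kind described can be sketched as follows (a minimal illustration, not ROMIE's actual algorithm; the Jaccard measure and the 0.5 threshold are assumptions):

```python
def jaccard(a, b):
    """Overlap of two instance sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a or b else 0.0

def match_concepts(onto1, onto2, threshold=0.5):
    """Score every concept pair by shared instances; filter weak candidates.

    onto1, onto2: dict mapping a concept to its set of associated resources
    (documents, data items, ...). Returns (c1, c2, score) triples with
    score >= threshold, best first.
    """
    candidates = [
        (c1, c2, jaccard(i1, i2))
        for c1, i1 in onto1.items()
        for c2, i2 in onto2.items()
    ]
    return sorted(
        [t for t in candidates if t[2] >= threshold],
        key=lambda t: t[2], reverse=True,
    )

pairs = match_concepts(
    {"Tumour": {"doc1", "doc2", "doc3"}},
    {"Neoplasm": {"doc1", "doc2", "doc4"}, "Gene": {"doc9"}},
)
# pairs == [("Tumour", "Neoplasm", 0.5)]
```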
43

Data Integration with XML and Semantic Web Technologies

Tous Liesa, Rubén 04 October 2006 (has links)
In general, the integration of multiple heterogeneous databases aims to give a unified view over a set of pre-existing data. This thesis contributes to several aspects of the design of modern data-integration systems in the context of the World Wide Web. On the one hand, it contributes to the Semantic Integration research trend, which addresses the problem of reconciling data from autonomous sources using ontologies and other semantics-based tools: the thesis proposes a novel solution to XML-RDF semantic integration, and also contributes to the ontology-alignment problem by defining a rigorous and scalable semantic similarity measure for RDF labelled directed graphs. On the other hand, it proposes a novel solution to the problem of translating a user query (targeting a logical mediated schema) into queries over a set of autonomous data sources provided with restricted web interfaces.
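A similarity measure over labelled directed graphs can be illustrated with a toy iterative scheme (the thesis defines its own rigorous, scalable measure, which is not reproduced here; the blending factor and the exact-label initialisation are assumptions):

```python
def node_similarity(g1, g2, iterations=5, alpha=0.5):
    """Iteratively blend label similarity with successor similarity.

    g1, g2: directed graphs as dicts mapping a node label to the set of its
    successor labels. Starts from exact-label similarity and, at each round,
    mixes in the mean similarity of successor pairs, so structure influences
    the final score.
    """
    sim = {(a, b): 1.0 if a == b else 0.0 for a in g1 for b in g2}
    for _ in range(iterations):
        updated = {}
        for (a, b), s in sim.items():
            succ = [sim[x, y] for x in g1[a] for y in g2[b]]
            structural = sum(succ) / len(succ) if succ else 0.0
            updated[a, b] = (1 - alpha) * s + alpha * structural
        sim = updated
    return sim

g1 = {"Person": {"Address"}, "Address": set()}
g2 = {"Person": {"Address"}, "Address": set()}
sim = node_similarity(g1, g2)
# Identically labelled, identically connected node pairs score highest.
```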
44

Toward semantic interoperability for software systems

Lister, Kendall January 2008 (has links)
“In an ill-structured domain you cannot, by definition, have a pre-compiled schema in your mind for every circumstance and context you may find ... you must be able to flexibly select and arrange knowledge sources to most efficaciously pursue the needs of a given situation.” [57]

In order to interact and collaborate effectively, agents, whether human or software, must be able to communicate through common understandings and compatible conceptualisations. Ontological differences, whether arising from pre-existing assumptions or as side-effects of the process of specification, are a fundamental obstacle that must be overcome before communication can occur. Similarly, the integration of information from heterogeneous sources is an unsolved problem. Efforts have been made to assist integration, through both methods and mechanisms, but automated integration remains an unachieved goal. Communication and information integration are problems of meaning and interaction, that is, of semantic interoperability. This thesis contributes to the study of semantic interoperability by identifying, developing and evaluating three approaches to the integration of information; the approaches have in common that they are lightweight in nature, pragmatic in philosophy and general in application.

The first work presented is an effort to integrate a massive, formal ontology and knowledge base with semi-structured, informal, heterogeneous information sources via a heuristic-driven, adaptable information agent. Its goal was to demonstrate a process by which task-specific knowledge can be identified and incorporated into the massive knowledge base in such a way that it can be re-used generally. The practical outcome was a framework illustrating a feasible approach: the massive knowledge base gains an ontologically sound mechanism for automatically generating task-specific information agents that dynamically retrieve information from semi-structured sources without requiring machine-readable meta-data.

The second work revives a previously published and neglected algorithm for inferring semantic correspondences between the fields of tables from heterogeneous information sources. An adapted form of the algorithm is evaluated first on relatively simple and consistent data collected from web services, to verify the original results, and then on poorly structured and messy data collected from web sites, to explore the limits of the algorithm. The results are reported via standard measures and accompanied by detailed discussion of the nature of the data encountered, an analysis of the algorithm's strengths and weaknesses, and the ways in which it complements other proposed approaches.

The third work acknowledges the cost and difficulty of integrating semantically incompatible software systems and information sources. It proposes, with a working prototype, a web site for resolving semantic incompatibilities between software systems prior to deployment, based on the commonly accepted software-engineering principle that the cost of correcting faults increases exponentially as a project progresses from phase to phase, with post-deployment corrections significantly more costly than those performed earlier. The barriers to collaboration in software development are identified and steps taken to overcome them. The system draws on the collaborative successes of on-line projects such as SourceForge, Del.icio.us, digg and Wikipedia, and on a variety of ontology-reconciliation techniques, to provide an environment in which data definitions can be shared, browsed and compared, with recommendations presented automatically to encourage developers to adopt data definitions compatible with previously developed systems.

In addition to these experimental works, the thesis contributes reflections on the origins of semantic incompatibility, with a particular focus on interaction between software systems and between software systems and their users, as well as a detailed analysis of the existing body of research into methods and techniques for overcoming these problems.
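The idea behind the second work, inferring field correspondences from the data itself rather than from field names, can be sketched as follows (a hypothetical stand-in, not the algorithm the thesis revives; the signature features are invented for illustration):

```python
import math

def field_signature(values):
    """Summarise a column by distribution features, ignoring its name."""
    values = [str(v) for v in values]
    n = len(values) or 1
    numeric_ratio = sum(v.replace(".", "", 1).isdigit() for v in values) / n
    mean_length = sum(len(v) for v in values) / n / 10.0  # crude normalisation
    distinctness = len(set(values)) / n
    return (numeric_ratio, mean_length, distinctness)

def match_fields(table1, table2):
    """Pair each field of table1 with the closest field of table2.

    table1, table2: dict mapping a field name to a list of its values.
    """
    return {
        f1: min(table2, key=lambda f2: math.dist(field_signature(v1),
                                                 field_signature(table2[f2])))
        for f1, v1 in table1.items()
    }

mapping = match_fields(
    {"price": ["1.5", "2.0"], "name": ["anna", "bobby"]},
    {"label": ["xy", "zz"], "cost": ["3.1", "4.2"]},
)
# mapping == {"price": "cost", "name": "label"}
```

Messy real-world data of the kind discussed above is precisely what breaks such simple signatures, which is why the thesis evaluates the adapted algorithm on both clean and messy sources.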
45

Αξιολόγηση εργαλείων ευθυγράμμισης οντολογιών / Ontology alignment tools evaluation (survey)

Χρηστίδης, Ιωάννης 27 June 2012 (has links)
Ontology alignment is the process of determining correspondences between concepts; a set of such correspondences is called an alignment. In recent years, several tools have been proposed as solutions to the problem of semantic heterogeneity. These tools identify nodes in two schemas that are related syntactically or semantically. Ontology-alignment tools have generally been developed to operate on database schemas, XML schemas, taxonomies, formal languages, entity-relationship models, dictionaries, thesauri, ontologies and other label frameworks, which are usually converted into a graph representation before matching. On the Semantic Web, such graphs can be represented in RDF (Resource Description Framework), and in this context ontology alignment is sometimes referred to as "ontology matching". Ontology matching is a prerequisite for enabling interoperability on the Semantic Web, as well as a useful tactic for some classical data-integration tasks. The resulting correspondences can be used in various tasks, such as ontology merging and data translation; ontology matching thus enables the knowledge and data expressed in the matched ontologies to interoperate. All of this places great value on the correct functioning and efficiency of ontology-alignment tools, so the tools and their results should be evaluated regularly, under different conditions and use cases. In practice, ontology alignments are evaluated in two ways: (i) by assessing individual correspondences, and (ii) by comparing the alignment against a reference alignment. This thesis aims to give a satisfactory picture of the performance and efficiency of five ontology-alignment tools. It describes, compares and evaluates the tools' characteristics, methods and alignment results, and relates them to the results of the same tools in the OAEI (Ontology Alignment Evaluation Initiative). Both evaluation methods are used: the correspondences produced by each method of each tool are counted and inspected, and compared against a manually produced reference alignment. Comparing systems and algorithms on the same basis allows conclusions to be drawn about the best matching strategies.
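Comparing an alignment against a reference alignment, the second evaluation mode above, reduces to the standard precision/recall/F-measure computation used in OAEI-style evaluations:

```python
def evaluate_alignment(alignment, reference):
    """Compare a produced alignment with a reference alignment.

    Both arguments are sets of (concept1, concept2) correspondence pairs.
    Returns (precision, recall, f_measure).
    """
    true_positives = len(alignment & reference)
    precision = true_positives / len(alignment) if alignment else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure

# One correct pair, one wrong pair, one missed pair.
scores = evaluate_alignment(
    alignment={("Car", "Automobile"), ("Car", "Vehicle")},
    reference={("Car", "Automobile"), ("Bike", "Bicycle")},
)
# scores == (0.5, 0.5, 0.5)
```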
46

Approches vers des modèles unifiés pour l'intégration de bases de connaissances / Approaches Towards Unified Models for Integrating Web Knowledge Bases

Koutraki, Maria 27 September 2016 (has links)
This thesis aims at the automatic integration of new Web services into a knowledge base. For each method of a Web service, a view is computed automatically and represented as a query over the knowledge base. The proposed algorithm also computes an XSLT transformation function, associated with the method, that can transform call results into a fragment conforming to the schema of the knowledge base. The novelty of the approach is that the alignment relies only on instances: it depends neither on concept names nor on the constraints defined by the schema. This makes it particularly relevant for the Web services currently published on the Web, which use the REST protocol; REST does not provide for the publication of schemas. Moreover, JSON seems to be establishing itself as the standard representation for service call results, and unlike XML, JSON does not use named nodes, so traditional alignment algorithms are deprived of the concept names on which they rely.
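The instance-only alignment idea, matching by shared values rather than by names, can be sketched for a JSON call result (an illustrative toy, not the thesis algorithm; the entity data and key names are invented):

```python
def align_json_to_kb(call_result, kb_facts):
    """Map result keys to knowledge-base properties by shared values only.

    call_result: dict parsed from a (hypothetical) REST/JSON response.
    kb_facts: dict of property -> value already known about the same entity.
    Key names play no role, mirroring the instance-based idea above.
    """
    mapping = {}
    for key, value in call_result.items():
        for prop, kb_value in kb_facts.items():
            if str(value).strip().lower() == str(kb_value).strip().lower():
                mapping[key] = prop
    return mapping

mapping = align_json_to_kb(
    {"nm": "Ada Lovelace", "b": "1815-12-10"},
    {"name": "ada lovelace", "birthDate": "1815-12-10"},
)
# mapping == {"nm": "name", "b": "birthDate"}
```

A production version would, of course, need fuzzier value comparison and evidence from many entities before committing to a mapping.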
47

Exploiting BioPortal as Background Knowledge in Ontology Alignment

Chen, Xi 11 August 2014 (has links)
No description available.
48

MSSearch: busca semântica de objetos de aprendizagem OBAA com suporte a alinhamento automático de ontologias / MSSearch: semantic search of OBAA learning objects with support for automatic ontology alignment

Silva, Luiz Rodrigo Jardim da 27 March 2013 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Problems related to semantic heterogeneity have become an important field of research. Within the educational context, researchers have focused on developing new technologies to improve the processes of locating, retrieving, cataloguing and reusing learning objects. This scenario highlights the use of ontology-alignment techniques to provide integration between different ontologies. The goal of the present work is therefore to develop a tool that provides semantic search of learning objects, with support for automatic ontology alignment.
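One way such an alignment can support semantic search is by expanding the query concept with its aligned equivalents before retrieval (a minimal sketch under assumed data structures, not the MSSearch implementation):

```python
def semantic_search(query_concept, alignment, index):
    """Retrieve learning objects for a concept and its aligned equivalents.

    alignment: dict concept -> set of concepts in other ontologies that a
    (hypothetical, pre-computed) automatic alignment declared equivalent.
    index: dict concept -> list of learning-object identifiers.
    """
    concepts = {query_concept} | alignment.get(query_concept, set())
    results = {obj for c in concepts for obj in index.get(c, [])}
    return sorted(results)

hits = semantic_search(
    "Fraction",
    alignment={"Fraction": {"RationalNumber"}},
    index={"Fraction": ["lo-12"], "RationalNumber": ["lo-34"], "Algebra": ["lo-9"]},
)
# hits == ["lo-12", "lo-34"]
```

Without the alignment step, a search for "Fraction" would miss objects catalogued under the other ontology's "RationalNumber" concept.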
49

[en] TOWARDS A WELL-INTERLINKED WEB THROUGH MATCHING AND INTERLINKING APPROACHES / [pt] INTERLIGANDO RECURSOS NA WEB ATRAVÉS DE ABORDAGENS DE MATCHING E INTERLINKING

BERNARDO PEREIRA NUNES 07 January 2016 (has links)
With the emergence of Linked (Open) Data, a number of novel and notable research challenges have been raised. The openness that often characterises Linked Data offers an opportunity to homogeneously integrate and connect heterogeneous data sources on the Web. As disparate data sources with overlapping or related resources are provided by different data publishers, their integration and consolidation becomes a real challenge. A further challenge of Linked Data lies in the creation of a well-interlinked graph of Web data: identifying and linking not only identical but also laterally related Web resources provides the data consumer with a richer representation of the data and the possibility of exploiting connected resources. This thesis presents three approaches that tackle data integration, consolidation and linkage problems. The first combines mutual information and genetic-programming techniques for complex datatype property matching, a rarely addressed problem in the literature. The second and third adopt and extend a measure from social-network theory to address data consolidation and interlinking. The thesis also presents a Web-based application named Cite4Me that offers a new perspective on the search and retrieval of Linked Open Data sets and illustrates the benefits of the approaches. Finally, the approaches are validated through extensive evaluations on real-world datasets, with results that outperform state-of-the-art approaches.
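The mutual-information ingredient of the first approach can be illustrated on two aligned columns of property values (only the standard MI computation; the genetic-programming part is not sketched here):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two aligned value sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Perfectly dependent columns carry log(2) nats of shared information...
dependent = mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
# ...while independent columns carry none.
independent = mutual_information([0, 1, 0, 1], ["a", "a", "b", "b"])
```

High mutual information between two property columns is evidence that the properties encode related information, even when their names and value formats differ.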
50

Evolution von ontologiebasierten Mappings in den Lebenswissenschaften / Evolution of ontology-based mappings in the life sciences

Groß, Anika 19 March 2014 (has links) (PDF)
In the life sciences, a large and growing number of heterogeneous data sources needs to be integrated and combined in comprehensive analysis tasks. Ontologies and other structured vocabularies are used to provide a formal representation of knowledge and to facilitate data exchange between different applications. Ontologies are applied in domains such as molecular biology and chemistry; one of their most important uses is the annotation of real-world objects such as genes or publications.
Since different ontologies can contain overlapping knowledge, it is necessary to determine mappings between them (ontology mappings). Manual mapping creation between large ontologies can be very time-consuming or even infeasible, so that (semi-)automatic ontology matching methods are typically applied. Ontologies are not static but undergo continuous modification due to new research insights and changing user requirements. The evolution of ontologies can in turn affect dependent data such as annotation and ontology mappings. This thesis presents novel methods and algorithms to deal with the evolution of ontology-based mappings. To this end, the generic infrastructure GOMMA for managing and analyzing the evolution of ontologies and mappings is used and extended. First, a comparative evolution analysis for ontologies and mappings from three life science domains shows heavy changes in ontologies and mappings as well as a clear impact of ontology changes on the mappings; the intensity of evolution depends on the domain under study. As a consequence, existing ontology mappings can become invalid and need to be migrated to current ontology versions, while an expensive recomputation of the mappings should be avoided. This thesis introduces two generic algorithms to (semi-)automatically adapt ontology mappings: (1) a composition-based adaptation that relies on the principle of mapping composition, and (2) a diff-based adaptation that handles individual ontology change operations to update the mappings. Both approaches reuse unaffected, already confirmed mapping parts and adapt only the parts affected by changes. An evaluation for very large biomedical ontologies and mappings shows that both approaches produce mappings of high quality. Similarly, ontology changes may also affect ontology-based annotation mappings. The thesis therefore introduces a generic approach to assess the quality of annotation mappings based on their evolution. Different quality measures allow for the identification of reliable annotations, e.g., based on their stability or provenance information.
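The composition-based adaptation described above can be illustrated with a minimal sketch (all names and data are hypothetical, not GOMMA's actual API): a mapping between ontology O1 and an old version of O2 is migrated by composing it with a version mapping that links old to new concepts of O2, derived from the diff between the two versions. Correspondences whose target concept is unchanged are carried over as-is, so confirmed work is reused.

```python
# Hypothetical sketch of composition-based ontology mapping adaptation
# (illustrative names, not GOMMA's interface): migrate a mapping
# O1 -> O2_v1 to the new version O2_v2 by composing it with a
# version mapping O2_v1 -> O2_v2.

def compose(mapping, version_mapping):
    """Compose correspondences (c1, c2, sim) with version pairs (c2, c2')."""
    migrated = []
    for c1, c2, sim in mapping:
        for old, new in version_mapping:
            if c2 == old:
                migrated.append((c1, new, sim))
    return migrated

# Toy example: "NCI:heart_v1" was renamed to "NCI:heart_v2" in the new
# ontology version; "NCI:liver" is unchanged and maps to itself.
mapping_o1_o2v1 = [("MA:heart", "NCI:heart_v1", 0.9),
                   ("MA:liver", "NCI:liver", 0.8)]
version_mapping = [("NCI:heart_v1", "NCI:heart_v2"),
                   ("NCI:liver", "NCI:liver")]

adapted = compose(mapping_o1_o2v1, version_mapping)
print(adapted)  # [('MA:heart', 'NCI:heart_v2', 0.9), ('MA:liver', 'NCI:liver', 0.8)]
```

Note that a correspondence whose target concept was deleted (i.e., has no pair in the version mapping) simply drops out of the composed result, which is where a diff-based treatment of individual change operations can act more selectively.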
A comprehensive analysis of large annotation data sources shows numerous instabilities, e.g., due to the temporary absence of annotations. Such modifications may influence the results of dependent applications such as functional enrichment analyses, which describe experimental data in terms of ontological groupings. The question thus arises to what degree ontology and annotation changes affect such analyses. Based on different stability measures, the evaluation assesses the change intensity of application results and gives insight into whether users need to expect significant changes to their analysis results. Moreover, GOMMA is extended by efficient techniques for matching very large ontologies. Such techniques are needed, among other things, to match new concepts during ontology mapping adaptation. Many existing match systems do not scale to very large ontologies such as those from the life science domain. An efficient composition-based approach computes ontology mappings indirectly by reusing and combining existing mappings to intermediate (mediator) ontologies. Intermediate ontologies can contain valuable background knowledge, so that the mapping quality can improve compared to a direct match approach. In addition, the thesis introduces general strategies for matching ontologies in parallel on several computing nodes. A size-based partitioning of the input ontologies enables good load balancing and scalability, since the smaller match tasks can be processed in parallel. The evaluation within the Ontology Alignment Evaluation Initiative (OAEI) compares GOMMA with other systems on matching tasks from different domains. Using parallel and composition-based matching, GOMMA achieves very good results with respect to the effectiveness and efficiency of matching, in particular for ontologies from the life sciences.
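The size-based partitioning strategy can be sketched as follows (function names and the similarity measure are assumptions for illustration, not GOMMA's implementation): the concept lists of both input ontologies are split into equally sized partitions, every partition pair becomes an independent match task, and the tasks are executed concurrently. A thread pool stands in here for the multiple compute nodes used in the thesis, and a Jaccard similarity over character trigrams stands in for the actual matchers.

```python
# Illustrative sketch of parallel ontology matching with size-based
# partitioning. Both concept lists are split into equally sized
# partitions; each partition pair is an independent match task that can
# be processed in parallel, which gives balanced task sizes.
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def trigrams(s):
    """Character trigrams of a lower-cased string (whole string if shorter)."""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)} or {s}

def match_task(task):
    """Match one partition pair using Jaccard similarity over trigrams."""
    part1, part2, threshold = task
    matches = []
    for c1, c2 in product(part1, part2):
        t1, t2 = trigrams(c1), trigrams(c2)
        sim = len(t1 & t2) / len(t1 | t2)
        if sim >= threshold:
            matches.append((c1, c2, round(sim, 2)))
    return matches

def parallel_match(concepts1, concepts2, size=2, threshold=0.6, workers=4):
    """Partition both ontologies by size and match all partition pairs in parallel."""
    parts1 = [concepts1[i:i + size] for i in range(0, len(concepts1), size)]
    parts2 = [concepts2[i:i + size] for i in range(0, len(concepts2), size)]
    tasks = [(p1, p2, threshold) for p1, p2 in product(parts1, parts2)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(match_task, tasks)  # preserves task order
    return [m for task_matches in results for m in task_matches]

mapping = parallel_match(["heart", "liver", "kidney"], ["heart", "kidneys"])
print(mapping)  # [('heart', 'heart', 1.0), ('kidney', 'kidneys', 0.8)]
```

Because every task touches only its two partitions, the same decomposition distributes naturally over separate processes or compute nodes.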
