Global ETD Search

41	Αξιολόγηση εργαλείων ευθυγράμμισης οντολογιών / Ontology alignment tools evaluation (survey) Χρηστίδης, Ιωάννης 27 June 2012 (has links) Η ευθυγράμμιση οντολογιών είναι η διαδικασία καθορισμού των αντιστοιχίσεων μεταξύ εννοιών. Ένα σύνολο αντιστοιχίσεων καλείται ευθυγράμμιση. Στα πρόσφατα έτη έχουν προταθεί διάφορα εργαλεία ως έγκυρη λύση στο πρόβλημα της σημασιολογικής ετερογένειας. Αυτά τα εργαλεία ταυτοποιούν κόμβους σε δύο σχήματα, τα οποία συσχετίζονται συντακτικά ή σημασιολογικά. Τα εργαλεία ευθυγράμμισης οντολογιών έχουν γενικά αναπτυχθεί για να λειτουργούν σε σχήματα βάσεων δεδομένων, XML σχήματα, ταξινομίες, τυπικές γλώσσες, μοντέλα σχέσεων οντοτήτων, λεξικά, θησαυρούς, οντολογίες και άλλα πλαίσια ετικετών. Τα παραπάνω συνήθως μετατρέπονται σε μια αναπαράσταση γράφων πριν την αντιστοίχιση. Εν όψει του Σημασιολογικού Ιστού, οι γράφοι μπορούν να αντιπροσωπευθούν από μορφές RDF (Resource Description Framework). Σε αυτό το πλαίσιο, η ευθυγράμμιση οντολογιών αναφέρεται μερικές φορές ως “ταίριασμα οντολογιών”. Το ταίριασμα οντολογιών είναι μια βασική προϋπόθεση για την ενεργοποίηση της διαλειτουργικότητας στο Σημασιολογικό Ιστό, καθώς επίσης και μια χρήσιμη τακτική για κάποιες κλασσικές εργασίες ολοκλήρωσης δεδομένων. Οι αντιστοιχίες μπορούν να χρησιμοποιηθούν σε διάφορες εργασίες, όπως στη συγχώνευση οντολογιών και στη μετάφραση δεδομένων. Κατά συνέπεια, το ταίριασμα των οντολογιών επιτρέπει στη γνώση και τα στοιχεία που εκφράζονται στις αντιστοιχημένες οντολογίες να επικοινωνήσουν. Τα παραπάνω δίνουν μεγάλη αξία στη σωστή λειτουργία και αποδοτικότητα των εργαλείων ευθυγράμμισης οντολογιών. Για το λόγο αυτό είναι σωστό να γίνονται συχνές αξιολογήσεις των εργαλείων και των αποτελεσμάτων τους, κάτω από διαφορετικές συνθήκες και περιπτώσεις χρήσης. Η αξιολόγηση των ευθυγραμμίσεων οντολογιών γίνεται στην πράξη με δύο τρόπους: (i) αξιολογώντας μεμονωμένες αντιστοιχίες και (ii) συγκρίνοντας την ευθυγράμμιση με μια ευθυγράμμιση αναφοράς. 
Η παρούσα εργασία έχει ως σκοπό να δώσει μια ικανοποιητική εικόνα για τις επιδόσεις και την αποδοτικότητα πέντε εργαλείων ευθυγράμμισης οντολογιών. Στα πλαίσια της εργασίας περιγράφονται, συγκρίνονται και αξιολογούνται τα χαρακτηριστικά των εργαλείων, οι μέθοδοι και τα αποτελέσματα ευθυγραμμίσεων, ενώ γίνονται συγκριτικές παρατηρήσεις με τα αποτελέσματα των αντίστοιχων εργαλείων στο OAEI (Ontology Alignment Evaluation Initiative). Γίνεται χρήση και των δύο τρόπων αξιολόγησης ευθυγραμμίσεων, δηλαδή καταμετρούνται και παρατηρούνται οι αντιστοιχίες που παρήχθησαν από κάθε μέθοδο, για κάθε εργαλείο και συγκρίνονται με μια ευθυγράμμιση αναφοράς, η οποία παρήχθηκε χειρωνακτικά. Η σύγκριση των συστημάτων και των αλγορίθμων στην ίδια βάση αποτελεί το μέσο που επιτρέπει στον καθένα να σχηματίσει συμπεράσματα για τις καλύτερες στρατηγικές ταιριάσματος. / Ontology alignment is the process of determining correspondences between concepts. A set of correspondences is called an alignment. In recent years, several tools have been proposed as solutions to the problem of semantic heterogeneity. These tools identify nodes in two schemas that are related syntactically or semantically. Ontology alignment tools have generally been developed to operate on database schemas, XML schemas, taxonomies, formal languages, entity-relationship models, dictionaries, thesauri, ontologies and other label frameworks. These are usually converted into a graph representation before matching. In the context of the Semantic Web, graphs can be represented in RDF (Resource Description Framework) formats. In this context, ontology alignment is sometimes referred to as "ontology matching". Ontology matching is a prerequisite for enabling interoperability on the Semantic Web, as well as a useful technique for some classical data integration tasks. The correspondences can be used in various tasks, such as ontology merging and data translation. 
Thus, ontology matching enables the knowledge and data expressed in the matched ontologies to interoperate. This makes the correct functioning and efficiency of ontology alignment tools highly valuable. For this reason, the tools and their results should be evaluated frequently, under different conditions and use cases. In practice, ontology alignments are evaluated in two ways: (i) by evaluating individual correspondences and (ii) by comparing the alignment with a reference alignment. This work aims to give a satisfactory picture of the performance and efficiency of five ontology alignment tools. It describes, compares and evaluates the characteristics of the tools, their methods and their alignment results, and compares these results with those of the same tools in the OAEI (Ontology Alignment Evaluation Initiative). Both evaluation approaches are used: the correspondences produced by each method of each tool are counted and inspected, and compared with a manually produced reference alignment. Comparing systems and algorithms on the same basis allows anyone to draw their own conclusions about the best matching strategies. Ευθυγράμμιση Οντολογίες Ταίριασμα οντολογιών 006.332 Alignment Ontologies Ontology matching Ontology alignment tools Alignment API RiMOM MapPSO Anchor flood Aroma
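The second evaluation mode named in entry 41, comparing an alignment with a reference alignment, is commonly summarized with precision, recall and F-measure. The following is a minimal sketch under stated assumptions: correspondences are reduced to plain (source, target) equivalence pairs, and the function name and entity names are invented for illustration. It is not taken from the thesis or from any of the listed tools.

```python
from typing import Iterable, Tuple

# Assumption: a correspondence is an equivalence between two entity names.
Correspondence = Tuple[str, str]


def evaluate_alignment(
    found: Iterable[Correspondence],
    reference: Iterable[Correspondence],
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) of `found` against `reference`."""
    found_set, reference_set = set(found), set(reference)
    correct = found_set & reference_set  # correspondences present in both

    # Guard against empty alignments to avoid division by zero.
    precision = len(correct) / len(found_set) if found_set else 0.0
    recall = len(correct) / len(reference_set) if reference_set else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical example data: four correspondences produced by a tool,
# three in the manually built reference alignment, two in common.
tool_output = [("o1:Author", "o2:Writer"), ("o1:Paper", "o2:Article"),
               ("o1:Chair", "o2:Seat"), ("o1:Topic", "o2:Venue")]
reference_alignment = [("o1:Author", "o2:Writer"), ("o1:Paper", "o2:Article"),
                       ("o1:Review", "o2:Assessment")]

precision, recall, f_measure = evaluate_alignment(tool_output, reference_alignment)
```

With this data the tool finds two of the three reference correspondences and proposes two wrong ones, so precision is lower than recall. Comparing several tools on the same reference alignment in this way puts them on a common basis.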
42	Approches vers des modèles unifiés pour l'intégration de bases de connaissances / Approaches Towards Unified Models for Integrating Web Knowledge Bases Koutraki, Maria 27 September 2016 (has links) Ma thèse a comme but l’intégration automatique de nouveaux services Web dans une base de connaissances. Pour chaque méthode d’un service Web, une vue est calculée de manière automatique. La vue est représentée comme une requête sur la base de connaissances. L’algorithme que nous avons proposé calcule également une fonction de transformation XSLT associée à la méthode qui est capable de transformer les résultats d’appel dans un fragment conforme au schéma de la base de connaissances. La nouveauté de notre approche c’est que l’alignement repose seulement sur l’alignement des instances. Il ne dépend pas des noms des concepts ni des contraintes qui sont définis par le schéma. Ceci le fait particulièrement pertinent pour les services Web qui sont publiés actuellement sur le Web, parce que ces services utilisent le protocole REST. Ce protocole ne permet pas la publication de schémas. En plus, JSON semble s’imposer comme le standard pour la représentation des résultats d’appels de services. À différence du langage XML, JSON n’utilise pas de noeuds nommés. Donc les algorithmes d’alignement traditionnels sont privés de noms de concepts sur lesquels ils se basent. / My thesis aims at the automatic integration of new Web services into a knowledge base. For each method of a Web service, a view is computed automatically. The view is represented as a query on the knowledge base. Our algorithm also computes an XSLT transformation function associated with the method, which transforms the call results into a fragment that conforms to the schema of the knowledge base. The novelty of our approach is that the alignment relies only on instances. It does not depend on the names of the concepts or on the constraints defined by the schema. 
This makes it particularly relevant for Web services currently published on the Web, because these services use the REST protocol, which does not allow the publication of schemas. In addition, JSON seems to be establishing itself as the standard for representing the results of service calls. Unlike XML, JSON does not use named nodes, so traditional alignment algorithms are deprived of the concept names on which they rely. Web sémantique Bases de connaissances en ligne Intégration de données Semantic web Ontology alignment Online knowledge bases Data integration 006.3
43	Exploiting BioPortal as Background Knowledge in Ontology Alignment Chen, Xi 11 August 2014 (has links) No description available. Computer Science
44	MSSearch: busca semântica de objetos de aprendizagem OBAA com suporte a alinhamento automático de ontologias Silva, Luiz Rodrigo Jardim da 27 March 2013 (has links) Previous issue date: 2013-01-31 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Problemas relacionados à heterogeneidade semântica vêm se mostrando atualmente como um importante campo de pesquisa. Dentro do contexto educacional, pesquisadores têm se dedicado ao desenvolvimento de novas tecnologias que visam melhorar os processos de localização, recuperação, catalogação, e reutilização de objetos de aprendizagem. Baseado neste cenário, destaca-se o uso de técnicas de alinhamento de ontologias para prover integração entre ontologias distintas. Assim, o objetivo deste trabalho é desenvolver uma ferramenta que forneça mecanismos de busca semântica de objetos de aprendizagem com suporte a alinhamento automático de ontologias. / Problems of semantic heterogeneity are becoming an important field of research. Within the educational context, researchers have focused on developing new technologies to improve the processes of locating, retrieving, cataloging, and reusing learning objects. In this scenario, ontology alignment techniques stand out as a way to integrate different ontologies. The goal of the present work is therefore to develop a tool that provides mechanisms for semantic search of learning objects, with support for automatic ontology alignment. 
Web semântica Busca semântica Ontologias Alinhamento de ontologias Objetos de aprendizagem Sistemas multiagentes Repositórios semânticos Semantic web Semantic search Ontology alignment Learning objects Multiagent systems Semantic repositories
45	[en] TOWARDS A WELL-INTERLINKED WEB THROUGH MATCHING AND INTERLINKING APPROACHES / [pt] INTERLIGANDO RECURSOS NA WEB ATRAVÉS DE ABORDAGENS DE MATCHING E INTERLINKING BERNARDO PEREIRA NUNES 07 January 2016 (has links) [pt] Com o surgimento da Linked (Open) Data, uma série de novos e importantes desafios de pesquisa vieram à tona. A abertura de dados, como muitas vezes a Linked Data é conhecida, oferece uma oportunidade para integrar e conectar, de forma homogênea, fontes de dados heterogêneas na Web. Como diferentes fontes de dados, com recursos em comum ou relacionados, são publicados por diferentes editores, a sua integração e consolidação torna-se um verdadeiro desafio. Outro desafio advindo da Linked Data está na criação de um grafo denso de dados na Web. Com isso, a identificação e interligação, não só de recursos idênticos, mas também dos recursos relacionadas na Web, provê ao consumidor (data consumer) uma representação mais rica dos dados e a possibilidade de exploração dos recursos conectados. Nesta tese, apresentamos três abordagens para enfrentar os problemas de integração, consolidação e interligação de dados. Nossa primeira abordagem combina técnicas de informação mútua e programação genética para solucionar o problema de alinhamento complexo entre fontes de dados, um problema raramente abordado na literatura. Na segunda e terceira abordagens, adotamos e ampliamos uma métrica utilizada em teoria de redes sociais para enfrentar o problema de consolidação e interligação de dados. Além disso, apresentamos um aplicativo Web chamado Cite4Me que fornece uma nova perspectiva sobre a pesquisa e recuperação de conjuntos de Linked Open Data, bem como os benefícios da utilização de nossas abordagens. Por fim, uma série de experimentos utilizando conjuntos de dados reais demonstram que as nossas abordagens superam abordagens consideradas como estado da arte. 
/ [en] With the emergence of Linked (Open) Data, a number of novel and notable research challenges have been raised. The openness that often characterises Linked Data offers an opportunity to homogeneously integrate and connect heterogeneous data sources on the Web. As disparate data sources with overlapping or related resources are provided by different data publishers, their integration and consolidation becomes a real challenge. An additional challenge of Linked Data lies in the creation of a well-interlinked graph of Web data. Identifying and linking not only identical Web resources, but also lateral Web resources, provides the data consumer with richer representation of the data and the possibility of exploiting connected resources. In this thesis, we present three approaches that tackle data integration, consolidation and linkage problems. Our first approach combines mutual information and genetic programming techniques for complex datatype property matching, a rarely addressed problem in the literature. In the second and third approaches, we adopt and extend a measure from social network theory to address data consolidation and interlinking. Furthermore, we present a Web-based application named Cite4Me that provides a new perspective on search and retrieval of Linked Open Data sets, as well as the benefits of using our approaches. Finally, we validate our approaches through extensive evaluations using real-world datasets, reporting results that outperform state of the art approaches. [pt] WEB SEMANTICA [en] SEMANTIC WEB [pt] INTEGRACAO DE DADOS [en] DATA INTEGRATION [pt] PRIVACIDADE [en] PRIVACY [pt] ALINHAMENTO DE ONTOLOGIAS [en] ONTOLOGY ALIGNMENT [pt] SISTEMAS DE RECOMENDACAO [en] RECOMMENDER SYSTEMS [pt] ALINHAMENTO DE ESQUEMAS [pt] LINKED DATA [en] LINKED DATA [pt] CONSOLIDACAO DE DADOS [en] DATA CONSOLIDATION [pt] ENTITY LINKING [en] ENTITY LINKING [pt] DOCUMENT LINKING [en] DOCUMENT LINKING [pt] CITE4ME [en] CITE4ME [pt] FIREME [en] FIREME
46	Evolution von ontologiebasierten Mappings in den Lebenswissenschaften / Evolution of ontology-based mappings in the life sciences Groß, Anika 19 March 2014 (has links) (PDF) Im Bereich der Lebenswissenschaften steht eine große und wachsende Menge heterogener Datenquellen zur Verfügung, welche häufig in quellübergreifenden Analysen und Auswertungen miteinander kombiniert werden. Um eine einheitliche und strukturierte Erfassung von Wissen sowie einen formalen Austausch zwischen verschiedenen Applikationen zu erleichtern, kommen Ontologien und andere strukturierte Vokabulare zum Einsatz. Sie finden Anwendung in verschiedenen Domänen wie der Molekularbiologie oder Chemie und dienen zumeist der Annotation realer Objekte wie z.B. Gene oder Literaturquellen. Unterschiedliche Ontologien enthalten jedoch teilweise überlappendes Wissen, so dass die Bestimmung einer Abbildung (Ontologiemapping) zwischen ihnen notwendig ist. Oft ist eine manuelle Mappingerstellung zwischen großen Ontologien kaum möglich, weshalb typischerweise automatische Verfahren zu deren Abgleich (Matching) eingesetzt werden. Aufgrund neuer Forschungserkenntnisse und Nutzeranforderungen verändern sich die Ontologien kontinuierlich weiter. Die Evolution der Ontologien hat wiederum Auswirkungen auf abhängige Daten wie beispielsweise Annotations- und Ontologiemappings, welche entsprechend aktualisiert werden müssen. Im Rahmen dieser Arbeit werden neue Methoden und Algorithmen zum Umgang mit der Evolution ontologie-basierter Mappings entwickelt. Dabei wird die generische Infrastruktur GOMMA zur Verwaltung und Analyse der Evolution von Ontologien und Mappings genutzt und erweitert. Zunächst wurde eine vergleichende Analyse der Evolution von Ontologiemappings für drei Subdomänen der Lebenswissenschaften durchgeführt. Ontologien sowie Mappings unterliegen teilweise starken Änderungen, wobei die Evolutionsintensität von der untersuchten Domäne abhängt. 
Insgesamt zeigt sich ein deutlicher Einfluss von Ontologieänderungen auf Ontologiemappings. Dementsprechend können bestehende Mappings infolge der Weiterentwicklung von Ontologien ungültig werden, so dass sie auf aktuelle Ontologieversionen migriert werden müssen. Dabei sollte eine aufwendige Neubestimmung der Mappings vermieden werden. In dieser Arbeit werden zwei generische Algorithmen zur (semi-) automatischen Adaptierung von Ontologiemappings eingeführt. Ein Ansatz basiert auf der Komposition von Ontologiemappings, wohingegen der andere Ansatz eine individuelle Behandlung von Ontologieänderungen zur Adaptierung der Mappings erlaubt. Beide Verfahren ermöglichen die Wiederverwendung unbeeinflusster, bereits bestätigter Mappingteile und adaptieren nur die von Änderungen betroffenen Bereiche der Mappings. Eine Evaluierung für sehr große, biomedizinische Ontologien und Mappings zeigt, dass beide Verfahren qualitativ hochwertige Ergebnisse produzieren. Ähnlich zu Ontologiemappings werden auch ontologiebasierte Annotationsmappings durch Ontologieänderungen beeinflusst. Die Arbeit stellt einen generischen Ansatz zur Bewertung der Qualität von Annotationsmappings auf Basis ihrer Evolution vor. Verschiedene Qualitätsmaße erlauben die Identifikation glaubwürdiger Annotationen beispielsweise anhand ihrer Stabilität oder Herkunftsinformationen. Eine umfassende Analyse großer Annotationsdatenquellen zeigt zahlreiche Instabilitäten z.B. aufgrund temporärer Annotationslöschungen. Dementsprechend stellt sich die Frage, inwieweit die Datenevolution zu einer Veränderung von abhängigen Analyseergebnissen führen kann. Dazu werden die Auswirkungen der Ontologie- und Annotationsevolution auf sogenannte funktionale Analysen großer biologischer Datensätze untersucht. Eine Evaluierung anhand verschiedener Stabilitätsmaße erlaubt die Bewertung der Änderungsintensität der Ergebnisse und gibt Aufschluss, inwieweit Nutzer mit einer signifikanten Veränderung ihrer Ergebnisse rechnen müssen. 
Darüber hinaus wird GOMMA um effiziente Verfahren für das Matching sehr großer Ontologien erweitert. Diese werden u.a. für den Abgleich neuer Konzepte während der Adaptierung von Ontologiemappings benötigt. Viele der existierenden Match-Systeme skalieren nicht für das Matching besonders großer Ontologien wie sie im Bereich der Lebenswissenschaften auftreten. Ein effizienter, kompositionsbasierter Ansatz gleicht Ontologien indirekt ab, indem existierende Mappings zu Mediatorontologien wiederverwendet und miteinander kombiniert werden. Mediatorontologien enthalten wertvolles Hintergrundwissen, so dass sich die Mappingqualität im Vergleich zu einem direkten Matching verbessern kann. Zudem werden generelle Strategien für das parallele Ontologie-Matching unter Verwendung mehrerer Rechenknoten vorgestellt. Eine größenbasierte Partitionierung der Eingabeontologien verspricht eine gute Lastbalancierung und Skalierbarkeit, da kleinere Teilaufgaben des Matchings parallel verarbeitet werden können. Die Evaluierung im Rahmen der Ontology Alignment Evaluation Initiative (OAEI) vergleicht GOMMA und andere Systeme für das Matching von Ontologien in verschiedenen Domänen. GOMMA kann u.a. durch Anwendung des parallelen und kompositionsbasierten Matchings sehr gute Ergebnisse bezüglich der Effektivität und Effizienz des Matchings, insbesondere für Ontologien aus dem Bereich der Lebenswissenschaften, erreichen. / In the life sciences, there is an increasing number of heterogeneous data sources that need to be integrated and combined in comprehensive analysis tasks. Often ontologies and other structured vocabularies are used to provide a formal representation of knowledge and to facilitate data exchange between different applications. Ontologies are used in different domains like molecular biology or chemistry. One of their most important applications is the annotation of real-world objects like genes or publications. 
Since different ontologies can contain overlapping knowledge, it is necessary to determine mappings between them (ontology mappings). Manual mapping creation can be very time-consuming or even infeasible, so (semi-)automatic ontology matching methods are typically applied. Ontologies are not static but undergo continuous modification due to new research insights and changing user requirements. The evolution of ontologies can have an impact on dependent data such as annotation or ontology mappings. This thesis presents novel methods and algorithms to deal with the evolution of ontology-based mappings. To this end, the generic infrastructure GOMMA is used and extended to manage and analyze the evolution of ontologies and mappings. First, a comparative evolution analysis for ontologies and mappings from three life science domains shows heavy changes in ontologies and mappings as well as an impact of ontology changes on the mappings. Hence, existing ontology mappings can become invalid and need to be migrated to current ontology versions, ideally without an expensive redetermination of the mappings. This thesis introduces two generic algorithms to (semi-)automatically adapt ontology mappings: (1) a composition-based adaptation that relies on the principle of mapping composition, and (2) a diff-based adaptation algorithm that handles change operations individually to update mappings. Both approaches reuse unaffected mapping parts and adapt only the affected parts of the mappings. An evaluation for very large biomedical ontologies and mappings shows that both approaches produce ontology mappings of high quality. Similarly, ontology changes may also affect ontology-based annotation mappings. The thesis introduces a generic evaluation approach to assess the quality of annotation mappings based on their evolution. Different quality measures allow for the identification of reliable annotations, e.g., based on their stability or provenance information. 
A comprehensive analysis of large annotation data sources shows numerous instabilities, e.g., due to the temporary absence of annotations. Such modifications may influence the results of dependent applications such as functional enrichment analyses, which describe experimental data in terms of ontological groupings. The question arises to what degree ontology and annotation changes may affect such analyses. Based on different stability measures, the evaluation assesses the change intensity of application results and gives insight into whether users need to expect significant changes in their analysis results. Moreover, GOMMA is extended with large-scale ontology matching techniques. Such techniques are useful, among other things, for matching new concepts during ontology mapping adaptation. Many existing match systems do not scale to very large ontologies such as those in the life science domain. One efficient composition-based approach computes ontology mappings indirectly by reusing and combining existing mappings to intermediate ontologies. Intermediate ontologies can contain useful background knowledge, so the mapping quality can improve compared to a direct match approach. Moreover, the thesis introduces general strategies for matching ontologies in parallel using several computing nodes. A size-based partitioning of the input ontologies enables good load balancing and scalability, since smaller match tasks can be processed in parallel. The evaluation within the Ontology Alignment Evaluation Initiative (OAEI) compares GOMMA and other systems on matching ontologies from different domains. Using parallel and composition-based matching, GOMMA can achieve very good results with respect to efficiency and effectiveness, especially for ontologies from the life science domain. 
Ontologien Mappings Ontologie-Mapping Evolution Änderungen Mappingevolution Ontologieevolution Adaptierung Migration Ontologie-Matching Komposition Mediatorontologie Lebenswissenschaften Biomedizin Annotationen funktionale Analysen UMLS FMA SNOMED CT Adult Mouse Anatomy Gene Ontology ontology ontologies ontology mapping ontology alignment ontology evolution mapping migration mapping change mapping composition ontology matching ontology change ontology development mapping adaptation mediator ontology biomedical ontologies UMLS FMA SNOMED CT Adult Mouse Anatomy Gene Ontology functional annotations term enrichment analysis ddc:570
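The composition-based matching described in entry 46 computes a mapping between two ontologies indirectly, by joining existing mappings to an intermediate (mediator) ontology. The following is a minimal sketch of that join, assuming each mapping is a plain set of concept-identifier pairs. The function name and all identifiers are hypothetical, and this is not GOMMA's implementation, which the abstract does not detail.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

# Assumption: a mapping is a set of (concept_id, concept_id) pairs.
Pair = Tuple[str, str]


def compose_mappings(a_to_m: Iterable[Pair], m_to_b: Iterable[Pair]) -> Set[Pair]:
    """Derive an A-to-B mapping by joining A-to-M and M-to-B on the mediator concept."""
    # Index the second mapping by mediator concept for constant-time lookup.
    targets_by_mediator = defaultdict(set)
    for mediator, target in m_to_b:
        targets_by_mediator[mediator].add(target)

    # A source concept maps to every target reachable via a shared mediator concept.
    return {
        (source, target)
        for source, mediator in a_to_m
        for target in targets_by_mediator.get(mediator, ())
    }


# Hypothetical identifiers; "M:" stands for the mediator ontology.
a_to_mediator = {("A:heart", "M:0001"), ("A:lung", "M:0002"), ("A:tail", "M:0009")}
mediator_to_b = {("M:0001", "B:Heart"), ("M:0002", "B:Lung"), ("M:0003", "B:Liver")}

composed = compose_mappings(a_to_mediator, mediator_to_b)
```

Concepts with no counterpart in the mediator ontology ("A:tail" and "B:Liver" here) drop out of the composed mapping, so in this sketch a composed result may still need to be complemented by direct matching.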
47	Evolution von ontologiebasierten Mappings in den Lebenswissenschaften Groß, Anika 05 March 2014 (has links) info:eu-repo/classification/ddc/570 ddc:570
