Global ETD Search

11	Πολιτισμικοί αλγόριθμοι : Εφαρμογή στην ανάλυση της ελληνικότητας του παγκόσμιου ιστού Κατσικούλη, Παναγιώτα 12 October 2013 (has links) Οι πολιτισμικοί αλγόριθμοι είναι εξελικτικοί αλγόριθμοι εμπνευσμένοι από την κοινωνική εξέλιξη. Περιλαμβάνουν ένα χώρο πεποιθήσεων, ένα πληθυσμό και ένα πρωτόκολλο επικοινωνίας που περιέχει συναρτήσεις που επιτρέπουν την ανταλλαγή γνώσης μεταξύ του πληθυσμού και του χώρου πεποιθήσεων. Στην παρούσα εργασία οι πολιτισμικοί αλγόριθμοι χρησιμοποιούνται για την ανάλυση της ελληνικότητας του παγκόσμιου ιστού. Είναι γνωστό πως η ελληνική γλώσσα αποτελεί πηγή άντλησης πληθώρας λέξεων για τα λεξιλόγια πολλών γλωσσών. Ο παγκόσμιος ιστός αποτελεί πλέον καθολικό μέσο επικοινωνίας, χώρο διακίνησης τεράστιου όγκου πληροφορίας και δεδομένων και σύγχρονο μέσο οικονομικής, πολιτικής και κοινωνικής δραστηριοποίησης. Με άλλα λόγια, ο παγκόσμιος ιστός αποτελεί σήμερα το χώρο εκείνο όπου η επίδραση του πολιτισμού, μέσω της γλώσσας, είναι εμφανής στα διάφορα κείμενα που φιλοξενούνται σε αυτόν. Η παρούσα διπλωματική επιχειρεί να "μετρήσει" το ποσοστό των λέξεων με ελληνική προέλευση που χρησιμοποιούνται στα κάθε είδους κείμενα που εμφανίζονται στις ιστοσελίδες του παγκόσμιου ιστού. Στόχος της εργασίας είναι η διερεύνηση του κατά πόσον είναι εφικτός ο σχεδιασμός κατάλληλου μοντέλου και αντίστοιχων αλγορίθμων που θα επιτρέψουν να εκτιμηθεί η "ελληνικότητα" του παγκόσμιου ιστού. Η μεθοδολογία προσέγγισης του θέματος περιλαμβάνει το σχεδιασμό και την υλοποίηση ενός πολιτισμικού αλγορίθμου και χρήση του περιβάλλοντος προγραμματισμού Python για σχεδιασμό και υλοποίηση κατάλληλης εφαρμογής και για πειραματικό έλεγχο. / Cultural algorithms are evolutionary algorithms inspired by societal evolution. They involve a belief space, a population space and a communication protocol that provides functions enabling the exchange of knowledge between the population and the belief space. In this thesis, cultural algorithms are used to analyze how Greek the web is. 
It is well known that the Greek language is a source of a plethora of words for the vocabularies of other languages. The World Wide Web is nowadays a universal means of communication, a place where huge amounts of information and data are transmitted, and a modern medium of economic, political and social activity. In other words, the World Wide Web has emerged as a new kind of society. As such, it has become the place where any culture's influence, through its language, is evident in hosted texts. This thesis attempts to "count" the percentage of words of Greek origin used in web-hosted texts of any kind. The main objective is to investigate whether it is possible to design a suitable model and corresponding algorithms for evaluating how Greek the web is. The methodology consists of the design and implementation of a cultural algorithm and the use of the Python programming language to design and implement a suitable application and to evaluate it experimentally. Παγκόσμιος ιστός; Εξόρυξη λέξεων; 006.332; Cultural algorithms; World Wide Web; Word mining
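The three components named in the abstract of record 11 (population, belief space, communication protocol) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the thesis's application: the toy objective `sphere`, the parameter names, and the choice of situational plus normative knowledge in the belief space.

```python
import random


def sphere(x):
    """Toy objective to minimise (hypothetical; not the thesis's fitness)."""
    return sum(v * v for v in x)


def cultural_algorithm(fitness, dim=2, pop_size=20, generations=50,
                       accept_ratio=0.2, bounds=(-5.0, 5.0), seed=0):
    """Minimal cultural algorithm: population + belief space + protocol."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(dim)]
                  for _ in range(pop_size)]
    # Belief space: situational knowledge (best individual seen so far) and
    # normative knowledge (per-dimension interval regarded as promising).
    belief = {"best": min(population, key=fitness)[:],
              "norm": [(lo, hi)] * dim}
    n_accept = max(1, int(accept_ratio * pop_size))

    for _ in range(generations):
        # Acceptance function: only the top individuals update the beliefs.
        population.sort(key=fitness)
        accepted = population[:n_accept]
        if fitness(accepted[0]) < fitness(belief["best"]):
            belief["best"] = accepted[0][:]
        belief["norm"] = [(min(ind[d] for ind in accepted),
                           max(ind[d] for ind in accepted))
                          for d in range(dim)]

        # Influence function: children are sampled around the best known
        # individual, with a step size taken from the normative interval.
        children = []
        for _ in range(pop_size):
            child = []
            for d in range(dim):
                low, high = belief["norm"][d]
                width = max(high - low, 1e-6)
                value = belief["best"][d] + rng.gauss(0.0, width)
                child.append(min(hi, max(lo, value)))  # clamp to bounds
            children.append(child)

        # Survivor selection: keep the fittest of parents and children.
        population = sorted(population + children, key=fitness)[:pop_size]

    return belief["best"], fitness(belief["best"])


if __name__ == "__main__":
    best, score = cultural_algorithm(sphere)
    print(best, score)
```

The acceptance and influence steps play the role of the communication protocol: the first moves knowledge from the population into the belief space, the second moves it back.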
12	Οντολογίες στο απανταχού υπολογίζειν και σε κινητές εφαρμογές έχοντας επίγνωση του περιβάλλοντος / Ontologies in context-aware ubiquitous and mobile computing Χριστοπούλου, Ελένη 14 October 2013 (has links) Σε αυτή τη διδακτορική διατριβή μελετήσαμε τις δυνατότητες αξιοποίησης των οντολογιών στην αναπαράσταση γνώσης σε συστήματα απανταχού και κινητού υπολογίζειν. / In this thesis we studied the use of ontologies for knowledge representation in ubiquitous and mobile computing. Οντολογίες; Επίγνωση πλαισίου; Διάχυτο υπολογίζειν; 006.332; Ontologies; Ubiquitous computing; Context-awareness; Pervasive computing
13	A knowledge representation framework for the design and the evaluation of a product variety / Cadre de modélisation pour la représentation de la connaissance à l’aide de la conception et l’évaluation de variétés de produits Giovannini, Antonio 16 January 2015 (has links) La conception de variété (ou diversité) de produit est un processus essentiel pour atteindre le niveau de flexibilité requis par la personnalisation de masse. Pendant le processus de conception de la variété, les clients et les experts sont impliqués dans la définition de la meilleure solution. Par conséquent, la compréhension des liens entre les connaissances provenant de ces différents domaines, i.e. client, produit, processus est devenue nécessaire. Dans cette thèse, nous nous intéressons en particulier à la formalisation de ces connaissances. En effet, même si plusieurs efforts ont étés accomplis dans le domaine de la représentation de la connaissance, la pensée logiciste (i.e. utilisation de méthode à base de logiques formelles) reste la base de la majeure partie des travaux sur la formalisation de la connaissance. Des réflexions appropriées sur l’utilisation des logiques peuvent montrer les risques d’ambiguïté de la représentation: l’utilisation de la logique conduit souvent à une représentation sujette à plusieurs interprétations, i.e. une représentation ambiguë. Une représentation avec cette caractéristique ne répond pas à l’exigence de bien comprendre les liens entre les différentes connaissances impliquées dans la conception de la variété. Notre travail s’intéresse, donc, au développement d’un cadre de modélisation de la connaissance de conception basé sur l’anti-logicisme. 
Les travaux sur les systèmes développés à partir des principes de cette école de représentation de la connaissance montrent à travers des applications concrètes dans les domaines de la robotique ou des systèmes multi-agents que les comportements intelligents peuvent être obtenus sans une représentation de la connaissance basée sur les logiques. Ce cadre permet de développer une variété de produit-processus à partir d’une clientèle définie au départ. Finalement, un critère pour comparer les différentes alternatives de variété générées est aussi proposé. Une méthode pour instancier le cadre de modélisation sur un logiciel de CAO 3D a été développée. De plus, un prototype pour utiliser les modèles de connaissance avec un solveur mathématique a été conçu et développé. Les propositions ont été testées sur un cas d’étude industriel, i.e. batterie froide d’un appareil de réfrigération. Ce test a permis de discuter les avantages et les limites de nos propositions. / Product variety design is an essential process for achieving the flexibility required by mass customisation. During the product variety design stage, customers and experts are involved in defining the best variety. A deep understanding of the links between knowledge from the customer, product and process domains is therefore needed. This thesis focuses on the formalisation of this knowledge. Although much work exists in the knowledge representation literature, most of it relies on formal logics to build these links. However, careful reflection on the use of logics reveals a risk of ambiguity in the representations, i.e. more than one interpretation of the same represented object is possible. This ambiguity makes the represented knowledge unsuitable for product variety design. In this work, we propose a knowledge representation framework based on anti-logicism. Since examples of anti-logicist systems (e.g. 
multi-agents, robots) have shown intelligent behaviour without a logic-based representation, we use the principles of anti-logicism to build our knowledge representation framework. The framework connects customer requirements to manufacturing process parameters. The core feature of models based on this framework is non-ambiguity: each piece of knowledge that composes the model can be interpreted in only one way. This feature enables collaboration between customers, product engineers and process engineers during the variety design stage. Once the pieces of knowledge from the different domains are integrated into one model, the framework explains how to generate alternative product-process varieties starting from a given customer set. Finally, a criterion for comparing the generated alternatives is proposed. A method for instantiating the framework in a 3D CAD system has been developed. Moreover, a prototype that uses the knowledge model together with a mathematical solver to propose the best variety has been developed. The impact of the framework on the selection process and on the design process of a customisable product (i.e. a water coil) is tested. Testing the instantiation and the prototype shows the advantages and the limits of the proposals. Représentation de la connaissance; Connaissances de conception; Conception de la variété de produit; Famille de produits; Ligne de produits; Knowledge representation; Design knowledge; Product variety design; Product family; Product line; 006.332; 658.403 8
14	Knowledge Discovery Considering Domain Literature and Ontologies : Application to Rare Diseases / Découverte de connaissances considérant la littérature et les ontologies de domaine : application aux maladies rares Hassan, Mohsen 11 July 2017 (has links) De par leur grand nombre et leur sévérité, les maladies rares (MR) constituent un enjeu de santé majeur. Des bases de données de référence, comme Orphanet et Orphadata, répertorient les informations disponibles à propos de ces maladies. Cependant, il est difficile pour ces bases de données de proposer un contenu complet et à jour par rapport à ce qui est disponible dans la littérature. En effet, des millions de publications scientifiques sur ces maladies sont disponibles et leur nombre augmente de façon continue. Par conséquent, il serait très fastidieux d’extraire manuellement et de façon exhaustive des informations sur ces maladies. Cela motive le développement des approches semi-automatiques pour extraire l’information des textes et la représenter dans un format approprié pour son utilisation dans d’autres applications. Cette thèse s’intéresse à l’extraction de connaissances à partir de textes et propose d’utiliser les résultats de l’extraction pour enrichir une ontologie de domaine. Nous avons étudié trois directions de recherche: (1) l’extraction de connaissances à partir de textes, et en particulier l’extraction de relations maladie-phénotype (M-P); (2) l’identification d’entités nommées complexes, en particulier de phénotypes de MR; et (3) l’enrichissement d’une ontologie en considérant les connaissances extraites à partir de texte. Tout d’abord, nous avons fouillé une collection de résumés d’articles scientifiques représentés sous la forme de graphes pour en extraire des connaissances sur les MR. Nous nous sommes concentrés sur la complétion de la description des MR, en extrayant les relations M-P. Cette approche trouve des applications dans la mise à jour des bases de données de MR telles que Orphanet. 
Pour cela, nous avons développé un système appelé SPARE* qui extrait les relations M-P à partir des résumés PubMed, où les phénotypes et les MR sont annotés au préalable par un système de reconnaissance des entités nommées. SPARE* suit une approche hybride qui combine une méthode basée sur des patrons syntaxiques, appelée SPARE, et une méthode d’apprentissage automatique (les machines à vecteurs de support ou SVM). SPARE* bénéficie à la fois de la précision relativement bonne de SPARE et du bon rappel des SVM. Ensuite, SPARE* a été utilisé pour identifier des phénotypes candidats à partir de textes. Pour cela, nous avons sélectionné des patrons syntaxiques qui sont spécifiques aux relations M-P uniquement. Ensuite, ces patrons sont relaxés au niveau de leur contrainte sur le phénotype pour permettre l’identification de phénotypes candidats qui peuvent ne pas être référencés dans les bases de données ou les ontologies. Ces candidats sont vérifiés et validés par une comparaison avec les classes de phénotypes définies dans une ontologie de domaine comme HPO. Cette comparaison repose sur un modèle sémantique et un ensemble de règles de mise en correspondance définies manuellement pour cartographier un phénotype candidat extrait de texte avec une classe de l’ontologie. Nos expériences illustrent la capacité de SPARE* à identifier des phénotypes de MR déjà répertoriés ou complètement inédits. Nous avons appliqué SPARE* à un ensemble de résumés PubMed pour extraire les phénotypes associés à des MR, puis avons mis ces phénotypes en correspondance avec ceux déjà répertoriés dans l’encyclopédie Orphanet et dans Orphadata ; ceci nous a permis d’identifier de nouveaux phénotypes associés à la maladie selon les articles, mais pas encore listés dans Orphanet ou Orphadata. Enfin, nous avons appliqué les structures de patrons pour classer les MR et enrichir une ontologie préexistante. 
Tout d’abord, nous avons utilisé SPARE* pour compléter les descriptions en termes de phénotypes de MR disponibles dans Orphadata. Ensuite, nous proposons de comparer et grouper les MR au regard de leur description phénotypique, et ce en utilisant les structures de patron. [...] / Although individually uncommon, Rare Diseases (RDs) are numerous and generally severe, which makes their study important from a health-care point of view. A few databases, such as Orphanet and Orphadata, provide information about RDs. Despite their laudable efforts, they are incomplete and usually not up to date in comparison with what exists in the literature. Indeed, there are millions of scientific publications about these diseases, and their number is increasing continuously. This makes manual extraction of this information laborious and time-consuming, and thus motivates the development of semi-automatic approaches to extract information from texts and represent it in a format suitable for further applications. This thesis aims at extracting information from texts and using the result of the extraction to enrich existing ontologies of the considered domain. We studied three research directions: (1) extracting relationships from text, i.e., extracting Disease-Phenotype (D-P) relationships; (2) identifying new complex entities, i.e., identifying phenotypes of an RD; and (3) enriching an existing ontology on the basis of the relationships previously extracted, i.e., enriching an RD ontology. First, we mined a collection of abstracts of scientific articles, represented as a collection of graphs, to discover relevant pieces of biomedical knowledge. We focused on the completion of RD descriptions by extracting D-P relationships. This could find applications in automating the update process of RD databases such as Orphanet. 
Accordingly, we developed an automatic approach named SPARE* for extracting D-P relationships from PubMed abstracts, where phenotypes and RDs are annotated by a Named Entity Recognizer. SPARE* is a hybrid approach that combines a pattern-based method, called SPARE, and a machine learning method (SVM). It benefits both from the relatively good precision of SPARE and from the good recall of the SVM. Second, SPARE* was used to identify phenotype candidates from texts. We selected high-quality syntactic patterns that are specific to D-P relationships only. These patterns are then relaxed on the phenotype constraint to enable the extraction of phenotype candidates that are not referenced in databases or ontologies. These candidates are verified and validated by comparison with phenotype classes in a well-known phenotype ontology (e.g., HPO). This comparison relies on a compositional semantic model and a set of manually defined mapping rules for mapping an extracted phenotype candidate to a phenotype term in the ontology. This shows the ability of SPARE* to identify existing and potentially new RD phenotypes. We applied SPARE* to PubMed abstracts to extract RD phenotypes, which we either map to the content of the Orphanet encyclopedia and Orphadata or suggest to experts as novel additions to these two resources. Finally, we applied pattern structures to classify RDs and enrich an existing ontology. First, we used SPARE* to compute the phenotype description of RDs available in Orphadata. We then propose comparing and grouping RDs with regard to their phenotypic descriptions, using pattern structures. Pattern structures make it possible to consider both domain knowledge, consisting of an RD ontology and a phenotype ontology, and D-P relationships from various origins. The lattice generated from these pattern structures suggests a new classification of RDs, which in turn suggests new RD classes that do not exist in the original RD ontology. 
As their number is large, we proposed different methods for selecting a reduced set of interesting RD classes, which we suggest to experts for further analysis. Extraction d’information; Analyse formelle de concepts; Structure de patron; Enrichissement d’ontologie; Natural Language Processing; Information Extraction; Formal Concept Analysis; Pattern Structures; Ontology Enrichment; 006.332; 025.04
15	Apport des ontologies de domaine pour l'extraction de connaissances à partir de données biomédicales / Contribution of domain ontologies for knowledge discovery in biomedical data Personeni, Gabin 09 November 2018 (has links) Le Web sémantique propose un ensemble de standards et d'outils pour la formalisation et l'interopérabilité de connaissances partagées sur le Web, sous la forme d'ontologies. Les ontologies biomédicales et les données associées constituent de nos jours un ensemble de connaissances complexes, hétérogènes et interconnectées, dont l'analyse est porteuse de grands enjeux en santé, par exemple dans le cadre de la pharmacovigilance. On proposera dans cette thèse des méthodes permettant d'utiliser ces ontologies biomédicales pour étendre les possibilités d'un processus de fouille de données, en particulier, permettant de faire cohabiter et d'exploiter les connaissances de plusieurs ontologies biomédicales. Les travaux de cette thèse concernent dans un premier temps une méthode fondée sur les structures de patrons, une extension de l'analyse formelle de concepts pour la découverte de co-occurences de événements indésirables médicamenteux dans des données patients. Cette méthode utilise une ontologie de phénotypes et une ontologie de médicaments pour permettre la comparaison de ces événements complexes, et la découverte d'associations à différents niveaux de généralisation, par exemple, au niveau de médicaments ou de classes de médicaments. Dans un second temps, on utilisera une méthode numérique fondée sur des mesures de similarité sémantique pour la classification de déficiences intellectuelles génétiques. On étudiera deux mesures de similarité utilisant des méthodes de calcul différentes, que l'on utilisera avec différentes combinaisons d'ontologies phénotypiques et géniques. 
En particulier, on quantifiera l'influence que les différentes connaissances de domaine ont sur la capacité de classification de ces mesures, et comment ces connaissances peuvent coopérer au sein de telles méthodes numériques. Une troisième étude utilise les données ouvertes liées ou LOD du Web sémantique et les ontologies associées dans le but de caractériser des gènes responsables de déficiences intellectuelles. On utilise ici la programmation logique inductive, qui s'avère adaptée pour fouiller des données relationnelles comme les LOD, en prenant en compte leurs relations avec les ontologies, et en extraire un modèle prédictif et descriptif des gènes responsables de déficiences intellectuelles. L'ensemble des contributions de cette thèse montre qu'il est possible de faire coopérer avantageusement une ou plusieurs ontologies dans divers processus de fouille de données. / The Semantic Web proposes standards and tools to formalize and share knowledge on the Web, in the form of ontologies. Biomedical ontologies and associated data represent a vast collection of complex, heterogeneous and linked knowledge. The analysis of such knowledge presents great opportunities in healthcare, for instance in pharmacovigilance. This thesis explores several ways to make use of this biomedical knowledge in the data mining step of a knowledge discovery process. In particular, we propose three methods in which several ontologies cooperate to improve data mining results. A first contribution of this thesis describes a method based on pattern structures, an extension of formal concept analysis, to extract associations between adverse drug events from patient data. In this context, a phenotype ontology and a drug ontology cooperate to allow a semantic comparison of these complex adverse events, leading to the discovery of associations between such events at varying degrees of generalization, for instance at the drug or drug class level. 
A second contribution uses a numeric method based on semantic similarity measures to classify different types of genetic intellectual disabilities, characterized by both their phenotypes and the functions of their linked genes. We study two different similarity measures, applied with different combinations of phenotypic and gene function ontologies. In particular, we investigate the influence of each domain of knowledge represented in each ontology on the classification process, and how they can cooperate to improve that process. Finally, a third contribution uses the data component of the Semantic Web, the Linked Open Data (LOD), together with linked ontologies, to characterize genes responsible for intellectual disabilities. We use Inductive Logic Programming (ILP), a method suited to mining relational data such as LOD while exploiting domain knowledge from ontologies through reasoning mechanisms. Here, ILP makes it possible to extract from LOD and ontologies a descriptive and predictive model of genes responsible for intellectual disabilities. These contributions illustrate the possibility of having several ontologies cooperate to improve various data mining processes. Bioontologies; Données ouvertes liées; Programmation logique inductive; Similarité sémantique; Structures de patrons; Web sémantique; Bioontologies; Inductive Logic Programming; Linked Open Data; Pattern structures; Semantic similarity; Semantic Web; 006.332; 006.312
16	Σχεδιασμός, ανάπτυξη και σύνθεση οντολογιών για την υποστήριξη της εκπαίδευσης στην αντικειμενοστρεφή ανάλυση Μπαγιαμπού, Μαρία 25 January 2012 (has links) Τα τελευταία χρόνια γίνονται πολλές έρευνες οι οποίες δείχνουν πως οι Οντολογίες και οι τεχνολογίες βασισμένες σε οντολογίες, βρίσκουν ευρεία εφαρμογή στην εκπαίδευση και αποτελούν έναν από τους πιο σημαντικούς τομείς έρευνας της εκπαιδευτικής τεχνολογίας. Μια οντολογία αποτελεί την τυπική προδιαγραφή κάποιας περιοχής γνώσης (Gruber, 1993). Παρέχει τις βασικές έννοιες του πεδίου γνώσης που περιγράφεται και τις μεταξύ τους σχέσεις, καθώς και την ορολογία με την οποία αναφερόμαστε στις έννοιες και τις σχέσεις αυτές. Δηλαδή, μια οντολογία παρέχει τόσο λεξιλόγια και όσο και σχήματα οργάνωσης της γνώσης, τα οποία μπορούν να αξιοποιηθούν ως κοινά πλαίσια επικοινωνίας μεταξύ ανθρώπων, συστημάτων και οργανισμών, διευκολύνοντας το διαμοιρασμό, την διαλειτουργικότητα και την επαναχρησιμοποίηση πόρων (Uschold & Gruninger, 1996). Οι Οντολογίες συνδέονται στενά με το λεγόμενο Σημασιολογικό Ιστό, που αναφέρεται στη σημασιολογική διασύνδεση των πληροφοριών που υπάρχουν στον Παγκόσμιο Ιστό με τρόπο κατανοητό από μηχανές (Berners Lee et al., 2001). Μια τέτοια διασύνδεση θα έδινε πολύ μεγάλες προοπτικές όσον αφορά στο διαμοιρασμό, ανάκληση και επαναχρησιμοποίηση της πληροφορίας τόσο στην εκπαίδευση όσο σε όλο το φάσμα των δραστηριοτήτων μας. Η εργασία μας συνίσταται στη δημιουργία μιας εκπαιδευτικής εφαρμογής για τη διαχείριση μαθησιακού υλικού και μαθησιακών στόχων σχετικών με το αντικείμενο της Αντικειμενοστρεφούς Ανάλυσης και συγκεκριμένα με το γνωστικό πεδίο των Διαγραμμάτων Περιπτώσεων Χρήσης, η οποία βασίζεται σε οντολογίες. 
Χρησιμοποιούμε οντολογίες για να περιγράψουμε με τυπικό τρόπο τρεις βασικές συνιστώσες της μαθησιακής διαδικασίας: το γνωστικό πεδίο, τα μαθησιακά αντικείμενα και τους μαθησιακούς στόχους, με σκοπό να γίνει δυνατή η αυτόματη επεξεργασία των παραπάνω συνιστωσών από εφαρμογές ηλεκτρονικής μάθησης και να προωθείται η επικοινωνία, η διαλειτουργικότητα και ο διαμοιρασμός πόρων. Ακόμα, ζητούμενο της εφαρμογής μας αποτελεί η ενσωμάτωση σε αυτήν δυνατοτήτων παροχής προσωποποιημένων υπηρεσιών. Αφού κάνουμε μια σύντομη επισκόπηση της βιβλιογραφίας σχετικά με τη χρήση οντολογιών στην Εκπαίδευση αναφερόμαστε στις Οντολογίες που δημιουργήσαμε και στον τρόπο που είναι δυνατόν να χρησιμοποιηθούν για να επιτευχθούν οι προαναφερθέντες στόχοι. Σημειώνουμε ότι στην παρούσα εργασία δεν περιλαμβάνεται η εκπαιδευτική αξιολόγηση του συστήματος (μετά από πιλοτική χρήση), αλλά μόνο η επαλήθευση της λειτουργίας του. / An ontology is a formal specification of a conceptualization (Gruber, 1993). It provides terminology and conceptual schemas for a domain and can be used as a communication framework between humans, software systems and organizations, promoting interoperability and reusability of resources. Our work concerns the creation of an ontology-based educational application for managing educational resources and instructional goals related to the field of Object-Oriented Analysis, and specifically to Use Case Diagrams. As part of our work, we used ontologies to formally describe three basic components of the educational process: the learning material, the knowledge domain and the learning goals. We created three ontologies: the use case diagram ontology (domain ontology), the competency ontology (to model the learning goals) and the learning object ontology (to describe the learning material), which we ultimately combined in one application. 
The inclusion of components such as learning objects and competencies in our application, together with the use of ontologies to describe them formally, can promote interoperability and resource reuse and can be used to provide personalised services. In this paper, we first describe ontologies and their current uses in education according to recent research, and then give a detailed description of our ontologies and our application. Οντολογίες; 006.332; Ontologies; Education; Competency modeling; Learning objects; Use case diagrams; Protégé
17	Αξιολόγηση εργαλείων ευθυγράμμισης οντολογιών / Ontology alignment tools evaluation (survey) Χρηστίδης, Ιωάννης 27 June 2012 (has links) Η ευθυγράμμιση οντολογιών είναι η διαδικασία καθορισμού των αντιστοιχίσεων μεταξύ εννοιών. Ένα σύνολο αντιστοιχίσεων καλείται ευθυγράμμιση. Στα πρόσφατα έτη έχουν προταθεί διάφορα εργαλεία ως έγκυρη λύση στο πρόβλημα της σημασιολογικής ετερογένειας. Αυτά τα εργαλεία ταυτοποιούν κόμβους σε δύο σχήματα, τα οποία συσχετίζονται συντακτικά ή σημασιολογικά. Τα εργαλεία ευθυγράμμισης οντολογιών έχουν γενικά αναπτυχθεί για να λειτουργούν σε σχήματα βάσεων δεδομένων, XML σχήματα, ταξινομίες, τυπικές γλώσσες, μοντέλα σχέσεων οντοτήτων, λεξικά, θησαυρούς, οντολογίες και άλλα πλαίσια ετικετών. Τα παραπάνω συνήθως μετατρέπονται σε μια αναπαράσταση γράφων πριν την αντιστοίχιση. Εν όψει του Σημασιολογικού Ιστού, οι γράφοι μπορούν να αντιπροσωπευθούν από μορφές RDF (Resource Description Framework). Σε αυτό το πλαίσιο, η ευθυγράμμιση οντολογιών αναφέρεται μερικές φορές ως “ταίριασμα οντολογιών”. Το ταίριασμα οντολογιών είναι μια βασική προϋπόθεση για την ενεργοποίηση της διαλειτουργικότητας στο Σημασιολογικό Ιστό, καθώς επίσης και μια χρήσιμη τακτική για κάποιες κλασσικές εργασίες ολοκλήρωσης δεδομένων. Οι αντιστοιχίες μπορούν να χρησιμοποιηθούν σε διάφορες εργασίες, όπως στη συγχώνευση οντολογιών και στη μετάφραση δεδομένων. Κατά συνέπεια, το ταίριασμα των οντολογιών επιτρέπει στη γνώση και τα στοιχεία που εκφράζονται στις αντιστοιχημένες οντολογίες να επικοινωνήσουν. Τα παραπάνω δίνουν μεγάλη αξία στη σωστή λειτουργία και αποδοτικότητα των εργαλείων ευθυγράμμισης οντολογιών. Για το λόγο αυτό είναι σωστό να γίνονται συχνές αξιολογήσεις των εργαλείων και των αποτελεσμάτων τους, κάτω από διαφορετικές συνθήκες και περιπτώσεις χρήσης. Η αξιολόγηση των ευθυγραμμίσεων οντολογιών γίνεται στην πράξη με δύο τρόπους: (i) αξιολογώντας μεμονωμένες αντιστοιχίες και (ii) συγκρίνοντας την ευθυγράμμιση με μια ευθυγράμμιση αναφοράς. 
Η παρούσα εργασία έχει ως σκοπό να δώσει μια ικανοποιητική εικόνα για τις επιδόσεις και την αποδοτικότητα πέντε εργαλείων ευθυγράμμισης οντολογιών. Στα πλαίσια της εργασίας περιγράφονται, συγκρίνονται και αξιολογούνται τα χαρακτηριστικά των εργαλείων, οι μέθοδοι και τα αποτελέσματα ευθυγραμμίσεων, ενώ γίνονται συγκριτικές παρατηρήσεις με τα αποτελέσματα των αντίστοιχων εργαλείων στο OAEI (Ontology Alignment Evaluation Initiative). Γίνεται χρήση και των δύο τρόπων αξιολόγησης ευθυγραμμίσεων, δηλαδή καταμετρούνται και παρατηρούνται οι αντιστοιχίες που παρήχθησαν από κάθε μέθοδο, για κάθε εργαλείο και συγκρίνονται με μια ευθυγράμμιση αναφοράς, η οποία παρήχθηκε χειρωνακτικά. Η σύγκριση των συστημάτων και των αλγορίθμων στην ίδια βάση αποτελεί το μέσο που επιτρέπει στον καθένα να σχηματίσει συμπεράσματα για τις καλύτερες στρατηγικές ταιριάσματος. / Ontology alignment is the process of determining correspondences between concepts. A set of correspondences is called an alignment. In recent years, several tools have been proposed as valid solutions to the problem of semantic heterogeneity. These tools identify nodes in two schemas that are related syntactically or semantically. Ontology alignment tools have generally been developed to operate on database schemas, XML schemas, taxonomies, formal languages, entity-relationship models, dictionaries, thesauri, ontologies and other label frameworks. These are usually converted into a graph representation before matching. In the Semantic Web, graphs can be represented in RDF (Resource Description Framework) formats. In this context, ontology alignment is sometimes referred to as "ontology matching". Ontology matching is a prerequisite for enabling interoperability on the Semantic Web, as well as a useful technique for some classical data integration tasks. The correspondences can be used in various tasks, such as ontology merging and data translation. 
Thus, ontology matching enables the knowledge and data expressed in the matched ontologies to interoperate. This makes the correct operation and efficiency of ontology alignment tools highly important. For this reason, the tools and their results should be evaluated frequently, under different conditions and use cases. In practice, ontology alignments are evaluated in two ways: (i) by evaluating individual correspondences and (ii) by comparing the alignment with a reference alignment. This work aims to give a satisfactory picture of the performance and efficiency of five ontology alignment tools. It describes, compares and evaluates the characteristics of the tools, their methods and their alignment results, and compares these results with those of the same tools in the OAEI (Ontology Alignment Evaluation Initiative). Both ways of evaluating alignments are used: the correspondences produced by each method of each tool are counted and inspected, and then compared with a manually produced reference alignment. Comparing tools and algorithms on the same basis allows readers to draw their own conclusions about the best matching strategies. Ευθυγράμμιση; Οντολογίες; Ταίριασμα οντολογιών; 006.332; Alignment; Ontologies; Ontology matching; Ontology alignment tools; Alignment API; RiMOM; MapPSO; Anchor flood; Aroma
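Comparing a produced alignment with a reference alignment, as described in record 17, is commonly scored with precision, recall and F-measure. The sketch below assumes those measures (the abstract does not name the ones the thesis uses) and assumes each correspondence can be modelled as a pair of entity names. The function name and sample data are hypothetical.

```python
def evaluate_alignment(found, reference):
    """Score a produced alignment against a reference alignment.

    Both arguments are iterables of (source_entity, target_entity) pairs.
    Returns (precision, recall, f_measure).
    """
    found, reference = set(found), set(reference)
    correct = found & reference  # correspondences also in the reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical correspondences between two toy ontologies "a" and "b".
    produced = [("a:Person", "b:Human"), ("a:Car", "b:Auto"),
                ("a:Dog", "b:Cat")]
    manual_reference = [("a:Person", "b:Human"), ("a:Car", "b:Auto"),
                        ("a:House", "b:Home"), ("a:Tree", "b:Plant")]
    print(evaluate_alignment(produced, manual_reference))
```

Treating alignments as sets of pairs ignores relation types and confidence values, which real alignment formats may carry. That simplification is part of the sketch.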
18	Représentation sémantique multilingue, multiculturelle et temporelle des relations interpersonnelles, appliquée à une prothèse de mémoire / A semantic multicultural, multilingual and temporal representation of interpersonal relationships, applied to a memory prosthesis Herradi, Noura 20 December 2018 (has links) Dans ce travail de thèse, nous proposons une base de connaissances, destinée à une prothèse de mémoire « intelligente », appelée CaptainMemo, qui a pour but d’aider les malades d’Alzheimer, à pallier leurs problèmes de dégénérescence mnésique. Cette base de connaissances est basée sur l’ontologie temporelle, multiculturelle et multilingue PersonLink, permettant à la prothèse de mémoire une représentation sémantique rigoureuse, multilingue et temporelle des liens interpersonnels. L’ontologie PersonLink est déréférençable et présente dans le Web de données.Le multilinguisme et la représentation temporelle sont deux grands sujets de recherche en informatique et en Web sémantique en particulier. Le multilinguisme appliqué à la représentation des relations interpersonnelles requiert un traitement spécifique, car il est lié au multiculturalisme. Par ailleurs, le passage d’une culture/langue à une autre s’avère une grande problématique de recherche. En effet, la traduction littérale n’est pas toujours permise, surtout quand il s’agit des relations interpersonnelles, car elles sont culturellement dépendantes. Dans ce contexte, nous proposons une approche permettant la représentation des ontologies dans plusieurs cultures/langues. Cette approche, en se basant sur un algorithme de traduction, permet le passage d’une culture/langue à une autre sans faire de la traduction littérale mais plutôt une traduction culturelle. 
Ainsi, en adoptant cette approche, notre ontologie PersonLink permet une représentation exacte des relations interpersonnelles, qui prend en considération l’aspect culturel pour la définition de chaque relation, et lui attribue le terme adéquat selon la langue liée à la culture dans laquelle elle est représentée. Les relations interpersonnels régissent à des règles et contraintes qui les définissent selon chaque culture, ces contraintes sont représentées sémantiquement dans l’ontologie PersonLink via OWL2. Cependant, il est difficile de prendre en considération ces contraintes lors de l’introduction de la dimension temporelle pour représenter les intervalles de temps de ces relations interpersonnelles, surtout quand ces dernières sont diachroniques et leurs intervalles de temps sont qualitatifs. En effet, les modèles et solutions déjà existantes permettent de faire une représentation temporelle des intervalles de temps (ex 4D-Fluents), et de lier entre ces intervalles de temps (ex Relations d’Allen), mais ne prennent pas en considération les contraintes sémantiques des relations interpersonnelles. Dans ce sens, nous proposons une approche qui permet une représentation sémantique, basée sur les contraintes OWL2, pour la représentation des intervalles de temps qualitatifs. Enfin, pour traiter l’intelligence de la prothèse de mémoire CaptainMemo, nous proposons une approche pour le raisonnement sur les intervalles dans le temps. Dans cette approche nous introduisons un ensemble de règles SWRL pour affirmer des relations d’Allen temporelles inférées, permettant aux raisonneurs, tel que Pellet qui prend en charge les règles DL-Safe, d’être employés pour l'inférence et la vérification de la cohérence sur les relations temporelles entre différents intervalles de temps. 
La table des compositions des relations entre intervalles de temps a ainsi été considérablement réduite, car elle se base sur un ensemble tractable de ces relations, ce qui en résulte un temps de traitement de raisonnement plus réduit. / In this thesis, we propose a knowledge base for a "smart" memory prosthesis, called CaptainMemo, which aims to help Alzheimer's patients overcome their memory impairments. This knowledge base is built on the temporal, multicultural and multilingual PersonLink ontology, which gives the memory prosthesis a rigorous, multilingual and temporal semantic representation of interpersonal relationships. The PersonLink ontology is dereferenceable and available as Linked Data. Multilingualism and temporal representation are two major research topics in computer science, and in the Semantic Web in particular. Multilingualism applied to the representation of interpersonal relationships requires specific treatment because it is linked to multiculturalism. In addition, the transition from one culture/language to another is a major research problem. Indeed, literal translation is not always possible, especially for interpersonal relationships, because they are culturally dependent. In this context, we propose an approach for representing ontologies in several cultures/languages. This approach, based on a translation algorithm, moves from one culture/language to another by making a cultural translation rather than a literal one. Thus, by adopting this approach, our PersonLink ontology allows an exact representation of interpersonal relationships, because it takes the cultural aspect into consideration when defining each relationship and assigns the appropriate term according to the language related to this culture. 
Interpersonal relationships are governed by rules and constraints that define them according to each culture; these constraints are represented semantically in the PersonLink ontology using OWL2. However, it is difficult to take these constraints into account when introducing the temporal dimension to represent the time intervals of these interpersonal relationships, especially when the relationships are diachronic and their time intervals are qualitative. Indeed, existing models and solutions make it possible to represent time intervals (e.g. 4D-Fluents) and to relate these intervals to one another (e.g. Allen relations), but they do not take into account the semantic constraints of interpersonal relationships. In this context, we propose an approach that allows a semantic representation of qualitative time intervals, based on OWL2 constraints. Finally, to provide the intelligence of the CaptainMemo memory prosthesis, we propose an approach for reasoning over time intervals. In this approach we introduce a set of SWRL rules to assert inferred temporal Allen relations, allowing reasoners such as Pellet, which supports DL-Safe rules, to be used for inference and consistency checking over the temporal relationships between different time intervals. The table of compositions of the relations between time intervals has thus been considerably reduced, since it is based on a tractable set of these relations, and the reasoning time becomes shorter as a consequence. Prothèse de mémoire Web Sémantique Ontologie Multiculturalisme Intervalle temporel qualitatif Raisonnement sémantique Memory prosthesis Semantic Web Ontology Multiculturalism Qualitative time interval Semantic reasoning 006.332 616.831
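For readers unfamiliar with the Allen relations mentioned in the abstract above, the following sketch classifies the relation between two intervals. It is only an illustration under simplifying assumptions: the intervals have known numeric endpoints, whereas the thesis reasons over qualitative intervals with SWRL rules, and the function name `allen_relation` is hypothetical.

```python
def allen_relation(a, b):
    """Return the Allen relation that holds from interval a to interval b.

    Each interval is a (start, end) pair with start < end. This handles
    only crisp numeric endpoints; it is not the SWRL-based reasoning
    described in the abstract.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("each interval needs start < end")
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only proper partial overlaps remain at this point.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Hypothetical example: a marriage (1990-2005) and a job (2000-2010).
relation = allen_relation((1990, 2005), (2000, 2010))
```

The thirteen possible results are the thirteen basic relations of Allen's interval algebra; the example call yields "overlaps".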
19	Rapprochement de données pour la reconnaissance d'entités dans les documents océrisés / Data matching for entity recognition in ocred documents Kooli, Nihel 13 September 2016 (has links) Cette thèse traite de la reconnaissance d'entités dans les documents océrisés guidée par une base de données. Une entité peut être, par exemple, une entreprise décrite par son nom, son adresse, son numéro de téléphone, son numéro TVA, etc. ou des méta-données d'un article scientifique tels que son titre, ses auteurs et leurs affiliations, le nom de son journal, etc. Disposant d'un ensemble d'entités structurées sous forme d'enregistrements dans une base de données et d'un document contenant une ou plusieurs de ces entités, nous cherchons à identifier les entités contenues dans le document en utilisant la base de données. Ce travail est motivé par une application industrielle qui vise l'automatisation du traitement des images de documents administratifs arrivant en flux continu. Nous avons abordé ce problème comme un problème de rapprochement entre le contenu du document et celui de la base de données. Les difficultés de cette tâche sont dues à la variabilité de la représentation d'attributs d'entités dans la base et le document et à la présence d'attributs similaires dans des entités différentes. À cela s'ajoutent les redondances d'enregistrements et les erreurs de saisie dans la base de données et l'altération de la structure et du contenu du document, causée par l'OCR. Devant ces problèmes, nous avons opté pour une démarche en deux étapes : la résolution d'entités et la reconnaissance d'entités. La première étape consiste à coupler les enregistrements se référant à une même entité et à les synthétiser dans un modèle entité. Pour ce faire, nous avons proposé une approche supervisée basée sur la combinaison de plusieurs mesures de similarité entre attributs. Ces mesures permettent de tolérer quelques erreurs sur les caractères et de tenir compte des permutations entre termes. 
La deuxième étape vise à rapprocher les entités mentionnées dans un document avec le modèle entité obtenu. Nous avons procédé par deux manières différentes, l'une utilise le rapprochement par le contenu et l'autre intègre le rapprochement par la structure. Pour le rapprochement par le contenu, nous avons proposé deux méthodes : M-EROCS et ERBL. M-EROCS, une amélioration/adaptation d'une méthode de l'état de l'art, consiste à faire correspondre les blocs de l'OCR avec le modèle entité en se basant sur un score qui tolère les erreurs d'OCR et les variabilités d'attributs. ERBL consiste à étiqueter le document par les attributs d'entités et à regrouper ces labels en entités. Pour le rapprochement par les structures, il s'agit d'exploiter les relations structurelles entre les labels d'une entité pour corriger les erreurs d'étiquetage. La méthode proposée, nommée G-ELSE, consiste à utiliser le rapprochement inexact de graphes attribués modélisant des structures locales, avec un modèle structurel appris pour cet objectif. Cette thèse étant effectuée en collaboration avec la société ITESOFT-Yooz, nous avons expérimenté toutes les étapes proposées sur deux corpus administratifs et un troisième corpus extrait du Web / This thesis focuses on database-driven entity recognition in OCRed documents. An entity is a homogeneous group of attributes, such as a company in a business form described by its name, address, contact numbers, etc., or the metadata of a scientific paper: its title, its authors and their affiliations, etc. Given a database whose records describe entities and a document that contains one or more entities from this database, we seek to identify the entities in the document using the database. This work is motivated by an industrial application that aims to automate the processing of document images arriving in a continuous stream. 
We addressed this problem as a matching problem between the contents of the document and those of the database. The difficulties of this task are due to the variability in how entity attributes are represented in the database and in the document, and to the presence of similar attributes in different entities. Added to this are redundant records and typing errors in the database, and the alteration of the structure and content of the document caused by OCR. To deal with these problems, we opted for a two-step approach: entity resolution and entity recognition. The first step links the records referring to the same entity and synthesizes them into an entity model. For this purpose, we proposed a supervised approach based on a combination of several similarity measures between attributes. These measures tolerate some character errors and take word permutations into account. The second step aims to match the entities mentioned in documents with the resulting entity model. We proceeded in two different ways: one uses content matching and the other integrates structure matching. For content matching, we proposed two methods: M-EROCS and ERBL. M-EROCS, an improvement/adaptation of a state-of-the-art method, matches OCR blocks with the entity model based on a score that tolerates OCR errors and attribute variability. ERBL labels the document with the entity attributes and groups these labels into entities. Structure matching exploits the structural relationships between the labels of an entity to correct labeling errors. The proposed method, called G-ELSE, is based on inexact matching of attributed graphs that model local structures, with a structural model learned for this purpose. 
As this thesis was carried out in collaboration with the company ITESOFT-Yooz, we evaluated all the proposed steps on two administrative corpora and a third corpus extracted from the Web. Reconnaissance d'entités Document océrisé Base de données Rapprochement d'entités Résolution d'entités Mesures de similarité Rapprochement de graphes Structure locale Entity recognition OCRed document Database Entity matching Entity resolution Similarity measure Graph matching Local structure 006.332 005.741 025.04
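The abstract above describes similarity measures between attributes that tolerate a few character errors and word permutations. The sketch below shows one simple way to obtain both properties with the Python standard library. It is not the thesis's supervised combination of measures: the function names, the use of `difflib.SequenceMatcher`, and the choice of taking the maximum of two scores are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def char_similarity(a, b):
    """Character-level similarity in [0, 1]; tolerates a few OCR or typing errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_sort_similarity(a, b):
    """Similarity after sorting the words, so that permuted terms still match."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return SequenceMatcher(None, sorted_a, sorted_b).ratio()


def attribute_similarity(a, b):
    """Illustrative combination: keep the better of the two scores.

    The thesis learns how to combine its measures; taking the maximum
    is only a stand-in for that learned combination.
    """
    return max(char_similarity(a, b), token_sort_similarity(a, b))


# Hypothetical attribute values: an OCR error ("1" for "i") and a word permutation.
ocr_score = attribute_similarity("12 rue de la Paix", "12 rue de la Pa1x")
permuted_score = attribute_similarity("ITESOFT Yooz SAS", "SAS Yooz ITESOFT")
```

Both example scores are high, whereas two unrelated strings score low; a threshold or a trained classifier would then decide whether two attribute values refer to the same entity.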
20	Dialogue graphique intelligent, fondé sur une ontologie, pour une prothèse de mémoire / Smart graphical dialogue, based on an ontology, for a memory prosthesis Ghorbel, Fatma 10 July 2018 (has links) Dans le cadre de cette thèse, nous proposons une prothèse de mémoire « intelligente », appelée CAPTAIN MEMO, destinée aux malades d’Alzheimer, pour pallier leurs problèmes mnésiques. Cette prothèse est basée sur l’ontologie temporelle, floue et multilingue appelée MemoFuzzyOnto.Cette prothèse offre des interfaces accessibles à cette classe particulière d’utilisateurs. Nous proposons, pour mettre en œuvre ces interfaces, une méthodologie de conception appelée InterfaceToAlz pour concevoir des interfaces accessibles aux malades d’Alzheimer, et qui offre un guide de 146 bonnes pratiques ergonomiques. De plus, nous proposons un outil de visualisation d’ontologies appelé Memo Graph qui génère un graphe dont la visualisation et la manipulation sont accessibles aux malades d’Alzheimer. Cette proposition est motivée par le fait que CAPTAIN MEMO a besoin de générer et d’éditer le graphe de la famille et de l’entourage du patient, à partir de l’ontologie MemoFuzzyOnto qui structure sa base de connaissances. Memo Graph est fondé sur notre guide de bonnes pratiques ergonomiques et notre approche, appelée Incremental Key-Instances Extraction and Visualisation, qui permet une extraction et une visualisation incrémentale du résumé des assertions ABox de l’ontologie. Il supporte également la visualisation des données ouvertes liées (Linked Data) et le passage à l’échelle. Par ailleurs, nous proposons, dans le cadre de cette thèse, une typologie de l’imperfection des données saisies (principalement due à la discordance mnésique provoquée par la maladie), et une méthodologie pour permettre à CAPTAIN MEMO d’être tolérante à la saisie des données fausses. 
Nous proposons un modèle d’évaluation de la crédibilité et une approche, nommée Data Believability Estimation for Applications to Alzheimer Patients, permettant d’estimer qualitativement et quantitativement la crédibilité de chaque donnée saisie. Enfin, pour que CAPTAIN MEMO soit tolérante à la saisie des intervalles temporels imprécis nous proposons deux approches : l’une basée sur un environnement précis et l’autre basée sur un environnement flou. Dans chacune des deux approches, nous étendons l’approche 4D-fluents pour représenter les intervalles temporels imprécis et les relations temporelles qualitatives, puis nous étendons l’algèbre d’Allen pour prendre en compte les intervalles imprécis dans le cadre de notre ontologie MemoFuzzyOnto. Nos contributions sont implémentées et évaluées. Nous avons évalué l’accessibilité de ses interfaces utilisateurs, le service de CAPTAIN MEMO qui a pour but de stimuler la mémoire du patient, notre approche pour l’estimation quantitative de la crédibilité des données saisies ainsi que la visualisation du graphe générée à l’aide de Memo Graph. Nous avons également évalué la performance de Memo Graph et son utilisabilité par des experts du domaine. / In this thesis, we propose a “smart” memory prosthesis, called CAPTAIN MEMO, to help patients with Alzheimer’s disease cope with their memory problems. It is based on a temporal, fuzzy and multilingual ontology named MemoFuzzyOnto, and it provides user interfaces that are accessible to this group of users. To design these interfaces, we propose a methodology named InterfaceToAlz, which serves as an information base for guiding and evaluating the design of user interfaces for patients with Alzheimer’s disease. It identifies 146 design guidelines. In addition, we propose an ontology visualization tool called Memo Graph, which offers a visualization that is accessible and understandable to patients with Alzheimer’s disease. 
Indeed, CAPTAIN MEMO needs to generate the graph of the patient's family and entourage from personal data structured according to MemoFuzzyOnto. Memo Graph is based on our design guidelines and on our approach, named Incremental Key-Instances Extraction and Visualisation, which extracts and visualizes descriptive instance summaries from a given ontology and generates “summary instance graphs” from the most important data. It supports Linked Data visualization and scales to large datasets. Furthermore, we propose a typology of the imperfection of the data entered (mainly due to the memory discordance caused by the disease), and a methodology that makes CAPTAIN MEMO tolerant to the entry of false data. We propose a believability model and an approach, called Data Believability Estimation for Applications to Alzheimer Patients, to estimate qualitatively and quantitatively the believability of each data item entered. Finally, CAPTAIN MEMO allows imprecise time intervals to be entered. We propose two approaches: a crisp-based approach and a fuzzy-based approach. The first uses only crisp standards and tools and is modeled in OWL 2. The second is based on fuzzy set theory and fuzzy tools and is modeled in Fuzzy-OWL 2. In both approaches, we extend the 4D-fluents model to represent imprecise time intervals and qualitative interval relations. Then, we extend Allen’s interval algebra to compare imprecise time intervals in the context of MemoFuzzyOnto. Our contributions are implemented and evaluated. We evaluated the CAPTAIN MEMO service that aims to stimulate the patient’s memory, the accessibility of its user interfaces, the efficiency of our approach for quantitatively estimating the believability of each data item entered, and the visualization generated with Memo Graph. We also evaluated Memo Graph with domain expert users. 
Prothèse de mémoire Accessibilité aux malades d'Alzheimer Visualisation d’ontologies Imperfection des données Crédibilité des données Memory prosthesis Ontology visualization Data imperfection Data believability Imprecise time interval for linked data 006.332 616.831
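To make the idea of comparing imprecise time intervals more concrete, here is a minimal crisp sketch. It assumes, purely for illustration, that an imprecise interval is given by a range of possible start values and a range of possible end values, and it checks only Allen's "before" relation. The abstract does not specify how the thesis extends Allen's algebra, so the class name, the representation and the three-valued result are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpreciseInterval:
    """An interval whose bounds are only known to lie within ranges (assumed model)."""
    start_min: float
    start_max: float
    end_min: float
    end_max: float

    def __post_init__(self):
        if not (self.start_min <= self.start_max <= self.end_min <= self.end_max):
            raise ValueError("expected start_min <= start_max <= end_min <= end_max")


def before(a, b):
    """Three-valued check of Allen's 'before' for imprecise intervals.

    Returns "certain" if every possible end of a precedes every possible
    start of b, "possible" if some combination does, "impossible" otherwise.
    """
    if a.end_max < b.start_min:
        return "certain"
    if a.end_min < b.start_max:
        return "possible"
    return "impossible"


# Hypothetical patient input: "I lived in Paris from the early 1960s to about 1970"
# and "I moved to Lyon around 1970".
paris = ImpreciseInterval(1960, 1963, 1969, 1971)
lyon = ImpreciseInterval(1970, 1972, 1980, 1985)
result = before(paris, lyon)
```

With these example values the result is "possible": the stay in Paris may have ended before the stay in Lyon began, but the imprecise bounds do not guarantee it.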
