Global ETD Search

41	Mise en contexte des traces pour une analyse en niveaux d'abstraction / Exploiting context for an structuration of execution traces in abstraction layers
Fopa, Léon Constantin. 23 June 2015.
Application analysis and debugging techniques are increasingly strained in modern systems, especially in systems built on embedded multiprocessor components (MPSoC), which make up most of today's everyday devices. Execution traces have become indispensable for analyzing such systems in detail and identifying unexpected behaviors. Although a trace offers the developer a rich source of information, the information relevant to the analysis is buried in the mass of data and is hard to use without a high level of expertise. Dedicated trace analysis tools are therefore necessary. However, existing tools take little or no account of the business aspects specific to the application, or of the developer's business knowledge, to optimize the analysis task. In this thesis, we propose an approach that lets the developer represent, manipulate and query an execution trace using concepts drawn from her own business knowledge. The approach uses an ontology to model and query business concepts in a trace, and an inference engine to reason about these concepts. Specifically, we propose VIDECOM, a domain ontology for the analysis of execution traces of embedded multimedia applications on MPSoC. We then address scaling the use of this ontology to the analysis of very large traces. To that end, we carry out a comparative study of ontology management systems (triplestores) to determine the architecture best suited to very large traces within our VIDECOM ontology. We also propose an inference engine that addresses the challenges of reasoning about business concepts, namely inferring the temporal order between business concepts in the trace and ensuring termination of the process that generates new business knowledge. Finally, we illustrate the practical use of VIDECOM in the SoC-Trace project for the analysis of real execution traces on MPSoC.
Keywords: Trace analysis; Semantic Web; Inference; Ontology; RDF. DDC 004
42	Analyse statique de requête pour le Web sémantique / Static Analysis of Semantic Web Queries
Chekol, Melisachew Wudage. 19 December 2012.
Query containment is a well-studied problem spanning several decades of research. It is generally defined as the problem of determining whether the result of one query is included in the result of another query for every dataset. It has major applications in query optimization and knowledge base verification. The main objective of this thesis is to provide sound and complete procedures for determining containment of SPARQL queries under expressive description logic axioms, and to support the theoretical results with experiments. To date, query containment has been tested using several techniques: containment mappings (graph homomorphism), canonical databases, automata-theoretic techniques, and reduction to the validity problem in logic. In this thesis, we use the last of these techniques to address containment using an expressive logic, the μ-calculus. RDF graphs are encoded as transition systems, and queries and schema axioms are encoded as μ-calculus formulae, so that query containment reduces to testing the validity of a formula in the logic. The thesis focuses on identifying fragments of SPARQL (and PSPARQL) and description logic schema languages for which containment is decidable, and on providing theoretically and experimentally proven procedures for checking containment in those decidable fragments. Last but not least, this thesis proposes a benchmark for containment solvers, which is used to test and compare current state-of-the-art containment solvers.
Keywords: Containment; SPARQL; PSPARQL; Static analysis; RDF; OWL
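To make the containment problem of entry 42 concrete, here is a minimal Python sketch. It is not the μ-calculus procedure of the thesis: it uses the classical containment-mapping test that the abstract lists among earlier techniques, restricted to basic graph patterns under set semantics and without schema axioms. The query representation, the function names and the example queries are assumptions made for illustration.

```python
def is_var(term):
    """SPARQL-style variables start with '?'."""
    return term.startswith("?")


def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (every answer of q1 is an answer of q2).

    A query is a pair (head, body): head is a tuple of answer variables,
    body is a list of (subject, predicate, object) triple patterns.
    q1 is contained in q2 iff some mapping of q2's variables to q1's terms
    sends q2's head onto q1's head and every pattern of q2 into q1's body.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False

    # Answer variables of q2 must map, position by position, to those of q1.
    seed = {}
    for v2, v1 in zip(head2, head1):
        if seed.setdefault(v2, v1) != v1:
            return False

    def unify(p, t, mapping):
        # A variable of q2 may bind to any term of q1, but consistently;
        # a constant of q2 must match the same constant in q1.
        if is_var(p):
            return mapping.setdefault(p, t) == t
        return p == t

    def extend(mapping, remaining):
        if not remaining:
            return True
        pattern, rest = remaining[0], remaining[1:]
        for triple in body1:  # q1's body, its variables treated as constants
            candidate = dict(mapping)
            if all(unify(p, t, candidate) for p, t in zip(pattern, triple)):
                if extend(candidate, rest):
                    return True
        return False

    return extend(seed, list(body2))


# Hypothetical example queries.
students_enrolled = (("?x",), [("?x", "rdf:type", ":Student"),
                               ("?x", ":enrolledIn", "?c")])
anyone_enrolled = (("?y",), [("?y", ":enrolledIn", "?z")])
```

With these two queries, the more restrictive one (students enrolled in some course) is contained in the more general one (anyone enrolled in something), but not the other way round. The search is exponential in the worst case, which is expected since containment of conjunctive queries is NP-complete.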
43	Toward Automatic Fact-Checking of Statistic Claims / Vers une vérification automatique des affirmations statistiques
Cao, Tien Duc. 26 September 2019.
The thesis aims to explore models and algorithms for knowledge extraction and for interconnecting heterogeneous databases, applied to the management of content such as journalists frequently encounter in their daily work. The work takes place within the ANR project ContentCheck (2016-2019), which provides the funding and within which we also collaborate with "Les Décodeurs", a team of journalists specialized in fact-checking at the newspaper Le Monde. The scientific approach of the thesis breaks down as follows: 1. Identify the content management technologies and domains (text, data, knowledge) that recur in journalists' work, or for which a strong need is felt. For example, it is already clear that journalists routinely use a few in-house databases that they have built themselves; they also have internal newsroom tools for keyword search; however, they would like to increase their capacity for semantic indexing. Among these problems, identify those for which technical (computing) solutions are known and, where applicable, implemented in existing systems. 2. Tackle the open research problems for which satisfactory answers are lacking, related to modeling and efficient algorithms for textual content, semantic content and data in a journalistic context. /
Digital content is increasingly produced in a variety of media such as news and social network sites, personal websites, blogs, etc. In particular, a large and dynamic part of such content relates to media-worthy events, whether of general interest (e.g., the war in Syria) or of specialized interest to a sub-community of users (e.g., sports events or genetically modified organisms). While such content is primarily meant for human readers, interest is growing in its automatic analysis, understanding and exploitation. Within the ANR project ContentCheck, we are interested in developing textual and semantic tools for analyzing content shared through digital media. The proposed PhD project takes place within this contract and will be developed through interactions with our partner, Le Monde. The PhD project aims to develop algorithms and tools for: (1) classifying and annotating mixed content (from articles, structured databases, social media, etc.) based on an existing set of topics (or ontology); (2) extracting information and relations from a text that may contain a statement to be fact-checked, with a particular focus on capturing the time dimension; a sample statement is, for instance, "VAT on iron in France was the highest in Europe in 2015"; (3) building structured queries from the extracted information and relations, to be evaluated against reference databases used as trusted information against which facts can be checked.
Keywords: Fact-checking; RDF; Natural Language Processing. DDC 621.39
44	Zjednodušení přístupu k propojeným datům pomocí tabulkových pohledů / Simplifying access to linked data using tabular views
Jareš, Antonín. January 2021.
The goal of this thesis is to design and implement a front-end application that allows users to create and manage custom views for arbitrary linked data endpoints. Such views will be executable against a predefined SPARQL endpoint, and users will be able to retrieve and download the requested data in CSV format. Users will also be able to share these views and store them in Solid Pods. Experienced SPARQL users will be able to customize the query manually. To achieve these goals, the system uses freely available technologies: HTML, JavaScript (namely the React framework) and CSS.
45	Sdílení dat mezi informačními systémy založené na ontologiích / Ontology-Based Data Sharing among Information Systems
Hák, Lukáš. Unknown date.
This thesis describes ontology-based data sharing between information systems. The first chapter introduces the term ontology and the terminology used. The thesis then analyses the basic methods and ontological languages used and partially describes the Semantic Web. The third chapter lists the utilities and plugins used for working with ontologies. The remaining chapters describe the ontologies created, which are useful for car selling, in particular ontologies of cars, sellers and addresses. The end of the thesis explains a proposed tool for converting existing XML into advertisement records in the OWL language.
46	Efficient Source Selection For SPARQL Endpoint Query Federation
Saleem, Muhammad. 13 May 2016.
The Web of Data has grown enormously in recent years. Currently, it comprises a large compendium of linked and distributed datasets from multiple domains. Due to the decentralised architecture of the Web of Data, several of these datasets contain complementary data. Running complex queries on this compendium thus often requires accessing data from different data sources within one query. The abundance of datasets and the need to run complex queries have motivated a considerable body of work on SPARQL query federation systems, the dedicated means of accessing data distributed over the Web of Data. This thesis addresses two key areas of federated SPARQL query processing: (1) efficient source selection, and (2) comprehensive SPARQL benchmarks to test and rank federated SPARQL engines as well as triple stores. Efficient Source Selection: Efficient source selection is one of the most important optimization steps in federated SPARQL query processing. An overestimation of query-relevant data sources increases the network traffic, results in irrelevant intermediate results, and can significantly affect the overall query processing time. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. Similarly, little attention has been paid to the effect of duplicated data on federated querying. This thesis presents HiBISCuS and TBSS, novel hypergraph-based source selection approaches, and DAW, a duplicate-aware source selection approach to federated querying over the Web of Data. Each of these approaches can be combined directly with existing SPARQL query federation engines to achieve the same recall while querying fewer data sources. 
We combined the three source selection approaches (HiBISCuS, DAW, and TBSS) with query rewriting to form a complete SPARQL query federation engine named Quetsal. Furthermore, we present TopFed, a federated query processing engine tailored to The Cancer Genome Atlas (TCGA) that exploits the data distribution to perform intelligent source selection while querying over large TCGA SPARQL endpoints. Finally, we address the issue of rights management and privacy while accessing sensitive resources. To this end, we present SAFE: a global source selection approach that enables decentralised, policy-aware access to sensitive clinical information represented as distributed RDF Data Cubes. Comprehensive SPARQL Benchmarks: Benchmarking is indispensable when assessing technologies with respect to their suitability for given tasks. While several benchmarks and benchmark generation frameworks have been developed to evaluate federated SPARQL engines and triple stores, they mostly provide a one-size-fits-all solution to the benchmarking problem. This approach is, however, unsuitable for evaluating the performance of a triple store for a given application with particular requirements. The fitness of current SPARQL query federation approaches for real applications is difficult to evaluate with current benchmarks, as these are either synthetic or too small in size and complexity. Furthermore, state-of-the-art federated SPARQL benchmarks mostly focus on a single performance criterion, i.e., the overall query runtime. Thus, they cannot provide a fine-grained evaluation of the systems. We address these drawbacks by presenting FEASIBLE, an automatic approach for generating benchmarks out of the query history of applications (i.e., query logs), and LargeRDFBench, a billion-triple benchmark for SPARQL query federation that encompasses real data as well as real queries pertaining to real biomedical use cases. 
Our evaluation results show that HiBISCuS, TBSS, TopFed, DAW, and SAFE can all significantly reduce the total number of sources selected and thus improve the overall query performance. In particular, TBSS is the first source selection approach to remain under 5% overall overestimation of relevant sources. Quetsal reduces the number of sources selected (without losing recall), the source selection time and the overall query runtime compared with state-of-the-art federation engines. The LargeRDFBench evaluation results suggest that the performance of current SPARQL query federation systems on simple queries does not reflect the systems' performance on more complex queries. Moreover, current federation systems seem unable to deal with many of the challenges that await them in the age of Big Data. Finally, the evaluation results for FEASIBLE show that it generates better sample queries than the state of the art. In addition, the better query selection and the larger set of query types used lead to triple store rankings that partly differ from the rankings generated by previous works.
Keywords: SPARQL, RDF, Federation, Benchmarks. ddc:000
47	Towards versioning of arbitrary RDF data
Frommhold, Marvin; Navarro Piris, Rubén; Arndt, Natanael; Tramp, Sebastian; Petersen, Niklas; Martin, Michael. 23 June 2017.
Coherent and consistent tracking of provenance data, and in particular of update history information, is a crucial building block for any serious information system architecture. Version Control Systems can be part of such an architecture, enabling users to query and manipulate versioning information as well as content revisions. In this paper, we introduce an RDF versioning approach as a foundation for a full-featured RDF Version Control System. We argue that such a system needs support for all concepts of the RDF specification, including RDF datasets and blank nodes. Furthermore, we place special emphasis on protection against unperceived history manipulation by hashing the resulting patches. In addition to the conceptual analysis and an RDF vocabulary for representing versioning information, we present a mature implementation that captures versioning information for changes to arbitrary RDF datasets.
ddc:000
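The idea in entry 47 of protecting a version history by hashing patches can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the patch layout (lists of added and removed triples), the use of SHA-256, the chaining of each hash to its parent and all function names are hypothetical, and the sketch ignores blank nodes and named graphs, which the paper says a real system must support.

```python
import hashlib


def patch_hash(parent_hash, added, removed):
    """Hash a patch together with the hash of the patch it builds on."""
    h = hashlib.sha256()
    h.update(parent_hash.encode("utf-8"))
    # Sort the triples so the hash does not depend on the order in which
    # the changes were recorded.
    for sign, triples in (("+", added), ("-", removed)):
        for triple in sorted(triples):
            h.update((sign + " " + " ".join(triple) + " .\n").encode("utf-8"))
    return h.hexdigest()


def build_history(patches):
    """Turn a sequence of (added, removed) pairs into a hash-chained history."""
    history, parent = [], ""
    for added, removed in patches:
        digest = patch_hash(parent, added, removed)
        history.append({"parent": parent, "added": added,
                        "removed": removed, "hash": digest})
        parent = digest
    return history


def verify_history(history):
    """Recompute every hash; any edit to an earlier patch breaks the chain."""
    parent = ""
    for entry in history:
        if entry["parent"] != parent:
            return False
        if patch_hash(parent, entry["added"], entry["removed"]) != entry["hash"]:
            return False
        parent = entry["hash"]
    return True
```

Because each hash covers the parent hash, changing an old patch after the fact invalidates that entry and every later one, which is what makes silent history manipulation detectable in this sketch.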
48	Analyse der RDF-Produktion in Vietnam / Analyze of RDF-production in Vietnam
Schulenburg, Sven. 02 August 2010.
A simplified production of refuse-derived fuel (RDF) was carried out, together with a waste characterization of municipal solid waste (MSW) from the Hanoi area. Three experiments were run, two with active aeration and one without. A high water content was found in all RDF samples, which has a negative influence on the lower heating value and on the savings effect. Concentrating the heating value in the coarse fraction (>40 mm) was not completely possible, nor was a complete transfer of the mineral content to the fine fraction (<10 mm). In most cases the RDF meets the fuel criteria for the various limit values (heavy metals, chloride and sulfur). An economic benefit appears possible through additional income from using RDF instead of coal (lignite), and from avoiding landfill gas and selling emission rights via the Clean Development Mechanism (CDM). More detailed investigations seem necessary to confirm these results.
Contents:
I. Index; II. Abbreviations; III. List of Tables; IV. Figures; V. Acknowledgements; VI. Summary
1. Introduction
2. Materials and methods: 2.1 Waste Composition Analyze; 2.2 Sample analyze (2.2.1 Water Content; 2.2.2 Size reduction; 2.2.3 Carbon content; 2.2.4 Chloride and Sulfur; 2.2.5 Heavy metals; 2.2.6 fossil and biogenic carbon; 2.2.7 Ash content / Los of Ignition); 2.3 biological Stabilization (2.3.1 Active Aeration; 2.3.2 Passive aeration); 2.4 Clean Development Mechanism (2.4.1 Kyoto Protocol; 2.4.2 International emission trading; 2.4.3 Clean Development Mechanism; 2.4.4 Avoidance potential of emissions from waste through RDF production); 2.5 Economic calculation; 2.6 Comparison to the usage of primary energy sources
3. Results: 3.1 Waste characterization; 3.2 Mass Balance of RDF Production; 3.3 Water content (3.3.1 Waste from Characterization; 3.3.2 RDF); 3.4 Heating value (3.4.1 Waste from Characterization; 3.4.2 RDF); 3.5 Heavy metals (3.5.1 Waste from Characterization; 3.5.2 RDF); 3.6 Chloride and Sulfur content (3.6.1 Waste from Characterization; 3.6.2 RDF); 3.7 Total carbon content; 3.8 Biogenic / fossil carbon content / Ash; 3.9 Methane avoidance potential; 3.10 CO2e emission through RDF usage; 3.11 Economic comparison; 3.12 Comparison to coal
4. Discussion: 4.1 1st Thesis; 4.2 2nd Thesis; 4.3 3rd Thesis (4.3.1 Mechanical requirements; 4.3.2 Caloric requirements; 4.3.3 Chemical requirements); 4.4 4th Thesis (4.4.1 Environmental benefit; 4.4.2 Economical benefit)
5. Conclusion
References; Annex I - Tables; Affidavit – Eidesstattliche Erklärung
Keywords: RDF, Vietnam, CDM, MSW. ddc:620
49	Querying big RDF data : semantic heterogeneity and rule-based inconsistency / Interrogation de gros volumes données : hétérogénéité sémantique et incohérence à la base des règles
Huang, Xin. 30 November 2016.
The Semantic Web is the vision of the next generation of the Web proposed by Tim Berners-Lee in 2001. With the rapid development of Semantic Web technologies, large amounts of RDF data already exist as linked open data, and their volume is growing rapidly. Traditional Semantic Web querying and reasoning tools are designed to run in a centralized, stand-alone environment; processing large-scale data with them inevitably runs into memory limits and performance bottlenecks. Large volumes of heterogeneous data are collected from different data sources by different organizations. These sources often contain inconsistencies and uncertainties, whose detection and resolution become even harder at big data scale. This thesis presents approaches and algorithms for better exploiting data in the context of big data and the Semantic Web. First, we developed an inference-based entity resolution approach and a linking mechanism for the case where the same entity appears in multiple RDF resources described with different semantics and URI identifiers. We also developed a MapReduce-based SPARQL query rewriting engine over big RDF data that infers, during query evaluation, the implicit data described intensionally by inference rules. The rewriting approach also handles transitive closure and cyclic rules, in order to support richer rule languages such as RDFS and OWL. Several optimizations are proposed to improve the efficiency of the algorithms by reducing the number of MapReduce jobs. The second contribution concerns inconsistency processing in big data. We extend the approach of the first contribution to take inconsistencies in the data into account. This includes: (1) rule-based inconsistency detection, with the rules evaluated by our query rewriting engine; (2) query evaluation that computes consistent results under one of three semantics defined for this purpose. The third contribution concerns reasoning and querying over large-scale uncertain RDF data. We propose a MapReduce-based approach for inferring new data in the presence of uncertainty. Unlike possible-worlds semantics, we propose an algorithm that generates intensional SPARQL query plans over probabilistic RDF graphs for computing and estimating the probabilities of query results.
Keywords: RDF; Query; Reasoning; MapReduce; Uncertainty; Inconsistency. DDC 005.7
50	Exploiting Alignments in Linked Data for Compression and Query Answering
Joshi, Amit Krishna. 06 June 2017.
No description available.
Keywords: Computer Science; Linked Data; RDF Compression; Ontology Alignment; Linked Data Querying; Synthetic RDF Generator; SPARQL
