Global ETD Search

81	QUERYING GRAPH STRUCTURED RDF DATA Qiao, Shi 27 January 2016 (has links) No description available. Computer Science
82	NEAR NEIGHBOR EXPLORATIONS FOR KEYWORD-BASED SEMANTIC SEARCHES USING RDF SUMMARY GRAPH Ayvaz, Serkan 23 November 2015 (has links) No description available. Computer Science semantic web summary graph RDF graph semantic search
83	Improving RDF data with data mining Abedjan, Ziawasch January 2014 (has links) Linked Open Data (LOD) comprises many, often large, public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, is desirable and ideally provided by an ontology. However, in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for mining RDF data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation, a user-based approach where a user selects the entity to be amended with new facts and a data-driven approach where an algorithm discovers entities that have to be amended with missing facts. 
As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies. Here, we present an association rule based approach to reconcile ontology and data. Interlacing different mining configurations, we infer an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology. / Linked Open Data (LOD) umfasst viele und oft sehr große öffentlichen Datensätze und Wissensbanken, die hauptsächlich in der RDF Triplestruktur bestehend aus Subjekt, Prädikat und Objekt vorkommen. Dabei repräsentiert jedes Triple einen Fakt. Unglücklicherweise erfordert die Heterogenität der verfügbaren öffentlichen Daten signifikante Integrationsschritte bevor die Daten in Anwendungen genutzt werden können. Meta-Daten wie ontologische Strukturen und Bereichsdefinitionen von Prädikaten sind zwar wünschenswert und idealerweise durch eine Wissensbank verfügbar. Jedoch sind Wissensbanken im Kontext von LOD oft unvollständig oder einfach nicht verfügbar. Deshalb ist es nützlich automatisch Meta-Informationen, wie ontologische Abhängigkeiten, Bereichs-und Domänendefinitionen und thematische Assoziationen von Ressourcen generieren zu können. Eine neue und vielversprechende Technik um solche Daten zu untersuchen basiert auf das entdecken von Assoziationsregeln, welche ursprünglich für Verkaufsanalysen in transaktionalen Datenbanken angewendet wurde. Wir haben eine Adaptierung dieser Technik auf RDF Daten entworfen und stellen das Konzept der Mining Konfigurationen vor, welches uns befähigt in RDF Daten auf unterschiedlichen Weisen Muster zu erkennen. 
Verschiedene Konfigurationen erlauben uns Schema- und Wertbeziehungen zu erkennen, die für interessante Anwendungen genutzt werden können. In dem Sinne, stellen wir assoziationsbasierte Verfahren für eine Prädikatvorschlagsverfahren, Datenvervollständigung, Ontologieverbesserung und Anfrageerleichterung vor. Das Vorschlagen von Prädikaten behandelt das Problem der inkonsistenten Verwendung von Ontologien, indem einem Benutzer, der einen neuen Fakt einem Rdf-Datensatz hinzufügen will, eine sortierte Liste von passenden Prädikaten vorgeschlagen wird. Eine Kombinierung von verschiedenen Konfigurationen erweitert dieses Verfahren sodass automatisch komplett neue Fakten für eine Wissensbank generiert werden. Hierbei stellen wir zwei Verfahren vor, einen nutzergesteuertenVerfahren, bei dem ein Nutzer die Entität aussucht die erweitert werden soll und einen datengesteuerten Ansatz, bei dem ein Algorithmus selbst die Entitäten aussucht, die mit fehlenden Fakten erweitert werden. Da Wissensbanken stetig wachsen und sich verändern, ist ein anderer Ansatz um die Verwendung von RDF Daten zu erleichtern die Verbesserung von Ontologien. Hierbei präsentieren wir ein Assoziationsregeln-basiertes Verfahren, der Daten und zugrundeliegende Ontologien zusammenführt. Durch die Verflechtung von unterschiedlichen Konfigurationen leiten wir einen neuen Algorithmus her, der gleichbedeutende Prädikate entdeckt. Diese Prädikate können benutzt werden um Ergebnisse einer Anfrage zu erweitern oder einen Nutzer während einer Anfrage zu unterstützen. Für jeden unserer vorgestellten Anwendungen präsentieren wir eine große Auswahl an Experimenten auf Realweltdatensätzen. Die Experimente und Evaluierungen zeigen den Mehrwert von Assoziationsregeln-Generierung für die Integration und Nutzbarkeit von RDF Daten und bestätigen die Angemessenheit unserer konfigurationsbasierten Methodologie um solche Regeln herzuleiten. 
Assoziationsregeln RDF LOD Mustererkennung Synonyme association rule mining RDF LOD knowledge discovery synonym discovery Data processing Computer science
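As a rough illustration of the association rule mining idea described in entry 83, the sketch below treats each subject as a transaction whose items are its predicates, then computes support and confidence for a rule between two predicates. The toy triples, the `ex:` names, the function names, and the choice of this particular subject-as-transaction setup are assumptions made for illustration; they are not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not from the thesis.
TRIPLES = [
    ("ex:Berlin", "ex:country", "ex:Germany"),
    ("ex:Berlin", "ex:population", "3600000"),
    ("ex:Berlin", "ex:mayor", "ex:SomeMayor"),
    ("ex:Paris", "ex:country", "ex:France"),
    ("ex:Paris", "ex:population", "2100000"),
    ("ex:Paris", "ex:mayor", "ex:OtherMayor"),
    ("ex:Potsdam", "ex:country", "ex:Germany"),
    ("ex:Potsdam", "ex:population", "180000"),
    ("ex:Elbe", "ex:country", "ex:Germany"),
]


def predicate_transactions(triples):
    """Group predicates by subject: one transaction (set of predicates) per subject."""
    transactions = defaultdict(set)
    for subject, predicate, _obj in triples:
        transactions[subject].add(predicate)
    return transactions


def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    total = len(transactions)
    with_antecedent = sum(1 for items in transactions.values() if antecedent in items)
    with_both = sum(
        1 for items in transactions.values()
        if antecedent in items and consequent in items
    )
    support = with_both / total if total else 0.0
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


transactions = predicate_transactions(TRIPLES)
support, confidence = rule_stats(transactions, "ex:population", "ex:country")
print(f"population -> country: support={support:.2f} confidence={confidence:.2f}")
```

A high-confidence rule of this kind could, for instance, be used to suggest `ex:country` to a user who has just added `ex:population` to an entity, which is the flavour of predicate auto-completion the abstract mentions.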
84	Statistical Extraction of Multilingual Natural Language Patterns for RDF Predicates: Algorithms and Applications Gerber, Daniel 29 August 2016 (has links) (PDF) The Data Web has undergone a tremendous growth period. It currently consists of more than 3300 publicly available knowledge bases describing millions of resources from various domains, such as life sciences, government or geography, with over 89 billion facts. In the same way, the Document Web grew to the state where approximately 4.55 billion websites exist, and on average 300 million photos are uploaded to Facebook and 3.5 billion Google searches are performed every day. However, there is a gap between the Document Web and the Data Web: knowledge bases available on the Data Web are most commonly extracted from structured or semi-structured sources, but the majority of information available on the Web is contained in unstructured sources such as news articles, blog posts, photos, forum discussions, etc. As a result, data on the Data Web not only misses a significant fragment of information but also suffers from a lack of actuality, since typical extraction methods are time-consuming and can only be carried out periodically. Furthermore, provenance information is rarely taken into consideration and therefore gets lost in the transformation process. In addition, users are accustomed to entering keyword queries to satisfy their information needs. With the availability of machine-readable knowledge bases, lay users could be empowered to issue more specific questions and get more precise answers. In this thesis, we address the problem of Relation Extraction, one of the key challenges in closing the gap between the Document Web and the Data Web, by four means. First, we present a distant supervision approach that allows finding multilingual natural language representations of formal relations already contained in the Data Web. 
We use these natural language representations to find sentences on the Document Web that contain unseen instances of this relation between two entities. Second, we address the problem of data actuality by presenting a real-time data stream RDF extraction framework and utilize this framework to extract RDF from RSS news feeds. Third, we present a novel fact validation algorithm, based on natural language representations, that is able not only to verify or falsify a given triple, but also to find trustworthy sources for it on the Web and to estimate a time scope in which the triple holds true. The features used by this algorithm to determine whether a website is indeed trustworthy are used as provenance information and thereby help to create metadata for facts in the Data Web. Finally, we present a question answering system that uses the natural language representations to map natural language questions to formal SPARQL queries, allowing lay users to make use of the large amounts of data available on the Data Web to satisfy their information needs. RDF Semantic Web Relationsextraktion Muster Data Web RDF Semantic Web Relation Extraction Pattern Data Web ddc:500
85	Ontological Reasoning with Taxonomies in RDF Database Hoferek, Ondřej January 2013 (has links) As the technologies for realising the idea of the Semantic Web have evolved rapidly during the past few years, it is now possible to use them in a variety of applications. Because they are designed to process and analyze semantic information found in the data, they are particularly suitable for enhancing the relevance of document retrieval. In this work, we discuss the possibilities of identifying a suitable subset of the expressive capabilities of the SPARQL query language and create a component that encapsulates the technical details of its usage.
86	Selective disclosure and inference leakage problem in the Linked Data / Exposition sélective et problème de fuite d’inférence dans le Linked Data Sayah, Tarek 08 September 2016 (has links) L'émergence du Web sémantique a mené à une adoption rapide du format RDF (Resource Description Framework) pour décrire les données et les liens entre elles. Ce modèle de graphe est adapté à la représentation des liens sémantiques entre les objets du Web qui sont identifiés par des IRI. Les applications qui publient et échangent des données RDF potentiellement sensibles augmentent dans de nombreux domaines : bio-informatique, e-gouvernement, mouvements open-data. La problématique du contrôle des accès aux contenus RDF et de l'exposition sélective de l'information en fonction des privilèges des requérants devient de plus en plus importante. Notre principal objectif est d'encourager les entreprises et les organisations à publier leurs données RDF dans l'espace global des données liées. En effet, les données publiées peuvent être sensibles, et par conséquent, les fournisseurs de données peuvent être réticents à publier leurs informations, à moins qu'ils ne soient certains que les droits d'accès à leurs données par les différents requérants sont appliqués correctement. D'où l'importance de la sécurisation des contenus RDF est de l'exposition sélective de l'information pour différentes classes d'utilisateurs. Dans cette thèse, nous nous sommes intéressés à la conception d'un contrôle d'accès pertinents pour les données RDF. De nouvelles problématiques sont posées par l'introduction des mécanismes de déduction pour les données RDF (e.g., RDF/S, OWL), notamment le problème de fuite d'inférence. En effet, quand un propriétaire souhaite interdire l'accès à une information, il faut également qu'il soit sûr que les données diffusées ne pourront pas permettre de déduire des informations secrètes par l'intermédiaire des mécanismes d'inférence sur des données RDF. 
Dans cette thèse, nous proposons un modèle de contrôle d'accès à grains fins pour les données RDF. Nous illustrons l'expressivité du modèle de contrôle d'accès avec plusieurs stratégies de résolution de conflits, y compris la Most Specific Takes Precedence. Nous proposons un algorithme de vérification statique et nous montrons qu'il est possible de vérifier à l'avance si une politique présente un problème de fuite d'inférence. De plus, nous montrons comment utiliser la réponse de l'algorithme à des fins de diagnostics. Pour traiter les privilèges des sujets, nous définissons la syntaxe et la sémantique d'un langage inspiré de XACML, basé sur les attributs des sujets pour permettre la définition de politiques de contrôle d'accès beaucoup plus fines. Enfin, nous proposons une approche d'annotation de données pour appliquer notre modèle de contrôle d'accès, et nous montrons que notre implémentation entraîne un surcoût raisonnable durant l'exécution / The emergence of the Semantic Web has led to a rapid adoption of the RDF (Resource Description Framework) to describe data and the links between them. The RDF graph model is tailored for the representation of semantic relations between Web objects that are identified by IRIs (Internationalized Resource Identifiers). The applications that publish and exchange potentially sensitive RDF data are increasing in many areas: bioinformatics, e-government, open data movements. The problem of controlling access to RDF content and of selectively exposing information based on the privileges of the requester is becoming increasingly important. Our main objective is to encourage businesses and organizations worldwide to publish their RDF data into the linked data global space. Indeed, the published data may be sensitive, and consequently, data providers may be reluctant to release their information unless they are certain that the desired access rights of different accessing entities to their data are enforced properly. 
Hence the issue of securing RDF content and ensuring the selective disclosure of information to different classes of users is becoming all the more important. In this thesis, we focused on the design of a relevant access control for RDF data. The problem of providing access controls to RDF data has attracted considerable attention from both the security and the database communities in recent years. New issues are raised by the introduction of deduction mechanisms for RDF data (e.g., RDF/S, OWL), including the inference leakage problem. Indeed, when an owner wishes to prohibit access to information, she or he must also ensure that the information meant to be secret cannot be inferred through inference mechanisms on RDF data. In this PhD thesis we propose a fine-grained access control model for RDF data. We illustrate the expressiveness of the access control model with several conflict resolution strategies, including most specific takes precedence. To tackle the inference leakage problem, we propose a static verification algorithm and show that it is possible to check in advance whether such a problem will arise. Moreover, we show how to use the answer of the algorithm for diagnosis purposes. To handle the subjects' privileges, we define the syntax and semantics of an XACML-inspired language based on the subjects' attributes to allow much finer access control policies. Finally, we propose a data-annotation approach to enforce our access control model, and show that our solution incurs reasonable overhead with respect to the optimal solution, which consists in materializing the user's accessible subgraph. RDF Autorisation Raisonnement sémantique Fuite d'inférence Application Linked Data RDF Authorization Semantic Reasoning Inference Leakage Enforcement Linked Data 025.4
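To make the inference leakage problem in entry 86 concrete, the following minimal sketch computes a small RDFS closure over the triples a user may see and reports any denied triple that becomes derivable. It only applies two standard RDFS entailment rules (rdfs9 and rdfs11) to data; it is not the static policy verification algorithm of the thesis, and the example triples, `ex:` names, and function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 repeatedly until no new triple is derived."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        subclass_pairs = [(s, o) for s, p, o in closure if p == SUBCLASS_OF]
        for sub, sup in subclass_pairs:
            for s, p, o in closure:
                # rdfs9: (x type sub), (sub subClassOf sup) => (x type sup)
                if p == RDF_TYPE and o == sub:
                    derived.add((s, RDF_TYPE, sup))
                # rdfs11: (s subClassOf sub), (sub subClassOf sup) => (s subClassOf sup)
                if p == SUBCLASS_OF and o == sub:
                    derived.add((s, SUBCLASS_OF, sup))
        if not derived <= closure:
            closure |= derived
            changed = True
    return closure


def leaks(accessible, denied):
    """Return the denied triples that can be inferred from the accessible ones."""
    return set(denied) & rdfs_closure(accessible)


# Hypothetical example: the type of ex:alice is hidden at one level only.
ACCESSIBLE = [
    ("ex:alice", RDF_TYPE, "ex:OncologyPatient"),
    ("ex:OncologyPatient", SUBCLASS_OF, "ex:CancerPatient"),
    ("ex:CancerPatient", SUBCLASS_OF, "ex:Patient"),
]
DENIED = [
    ("ex:alice", RDF_TYPE, "ex:CancerPatient"),
    ("ex:bob", RDF_TYPE, "ex:CancerPatient"),
]

leaked = leaks(ACCESSIBLE, DENIED)
print(leaked)
```

Here the denied fact about `ex:alice` is reported as leaked because it follows from two accessible triples, while the fact about `ex:bob` is not derivable and stays protected.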
87	What's in a query : analyzing, predicting, and managing linked data access Lorey, Johannes January 2014 (has links) The term Linked Data refers to connected information sources comprising structured data about a wide range of topics and for a multitude of applications. In recent years, the conceptual and technical foundations of Linked Data have been formalized and refined. To this end, well-known technologies have been established, such as the Resource Description Framework (RDF) as a Linked Data model or the SPARQL Protocol and RDF Query Language (SPARQL) for retrieving this information. Whereas most research has been conducted in the area of generating and publishing Linked Data, this thesis presents novel approaches for improved management. In particular, we illustrate new methods for analyzing and processing SPARQL queries. Here, we present two algorithms suitable for identifying structural relationships between these queries. Both algorithms are applied to a large number of real-world requests to evaluate the performance of the approaches and the quality of their results. Based on this, we introduce different strategies enabling optimized access to Linked Data sources. We demonstrate how the presented approach facilitates effective utilization of SPARQL endpoints by prefetching results relevant for multiple subsequent requests. Furthermore, we contribute a set of metrics for determining technical characteristics of such knowledge bases. To this end, we devise practical heuristics and validate them through thorough analysis of real-world data sources. We discuss the findings and evaluate their impact on utilizing the endpoints. Moreover, we detail the adoption of a scalable infrastructure for improving Linked Data discovery and consumption. As we outline in an exemplary use case, this platform is suitable both for processing and provisioning the corresponding information. 
/ Unter dem Begriff Linked Data werden untereinander vernetzte Datenbestände verstanden, die große Mengen an strukturierten Informationen für verschiedene Anwendungsgebiete enthalten. In den letzten Jahren wurden die konzeptionellen und technischen Grundlagen für die Veröffentlichung von Linked Data gelegt und verfeinert. Zu diesem Zweck wurden eine Reihe von Technologien eingeführt, darunter das Resource Description Framework (RDF) als Datenmodell für Linked Data und das SPARQL Protocol and RDF Query Language (SPARQL) zum Abfragen dieser Informationen. Während bisher hauptsächlich die Erzeugung und Bereitstellung von Linked Data Forschungsgegenstand war, präsentiert die vorliegende Arbeit neuartige Verfahren zur besseren Nutzbarmachung. Insbesondere werden dafür Methoden zur Analyse und Verarbeitung von SPARQL-Anfragen entwickelt. Zunächst werden daher zwei Algorithmen vorgestellt, die die strukturelle Ähnlichkeit solcher Anfragen bestimmen. Beide Algorithmen werden auf eine große Anzahl von authentischen Anfragen angewandt, um sowohl die Güte der Ansätze als auch die ihrer Resultate zu untersuchen. Darauf aufbauend werden verschiedene Strategien erläutert, mittels derer optimiert auf Quellen von Linked Data zugegriffen werden kann. Es wird gezeigt, wie die dabei entwickelte Methode zur effektiven Verwendung von SPARQL-Endpunkten beiträgt, indem relevante Ergebnisse für mehrere nachfolgende Anfragen vorgeladen werden. Weiterhin werden in dieser Arbeit eine Reihe von Metriken eingeführt, die eine Einschätzung der technischen Eigenschaften solcher Endpunkte erlauben. Hierfür werden praxisrelevante Heuristiken entwickelt, die anschließend ausführlich mit Hilfe von konkreten Datenquellen analysiert werden. Die dabei gewonnenen Erkenntnisse werden erörtert und in Hinblick auf die Verwendung der Endpunkte interpretiert. Des Weiteren wird der Einsatz einer skalierbaren Plattform vorgestellt, die die Entdeckung und Nutzung von Beständen an Linked Data erleichtert. 
Diese Plattform dient dabei sowohl zur Verarbeitung als auch zur Verfügbarstellung der zugehörigen Information, wie in einem exemplarischen Anwendungsfall erläutert wird. Vernetzte Daten SPARQL RDF Anfragepaare Informationsvorhaltung linked data SPARQL RDF query matching prefetching Data processing Computer science
88	TopFed: TCGA tailored federated query processing and linking to LOD Saleem, Muhammad, Padmanabhuni, Shanmukha S., Ngonga Ngomo, Axel-Cyrille, Iqbal, Aftab, Almeida, Jonas S., Decker, Stefan, Deus, Helena F. 12 January 2015 (has links) (PDF) Methods: We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Framework (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. Having the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these SPARQL endpoints by proposing a TCGA tailored federated SPARQL query processing engine named TopFed. Results: We compare TopFed with a well-established federation engine, FedX, in terms of source selection and query execution time by using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall), with a query execution time equal to one third of that of FedX. Conclusion: With TopFed, we aim to offer biomedical scientists a single point of access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA, as the amount and diversity of data exceed the ability of local resources to handle its retrieval and parsing. gleichzeitige Suchanfrage SPARQL TCGA RDF Federated queries SPARQL TCGA Cancer genome atlas RDF ddc:610 ddc:004
89	RDF na interoperabilidade entre domínios na web. SANTOS, Domingos Sávio Apolônio. 24 September 2018 (has links) Previous issue date: 2002-07-30 / Este trabalho trata da aplicabilidade da tecnologia Resource Description Framework – RDF na interoperabilidade entre diferentes domínios. O tema é desenvolvido com uma fundamentação teórica e com um estudo de caso, através do qual são desenvolvidas aplicações para a Web, interoperáveis via RDF. O objetivo específico é traçar uma estratégia para aplicar RDF, exemplificando-a através do estudo de caso. Neste estudo, promove-se a interoperabilidade entre dois domínios (Anúncios Classificados e Serviços de Cartórios), aplicando Resource Description Framework em serviços na Web. Algumas características deste trabalho são: o processo de desenvolvimento das aplicações é baseado no RUP – Rational Unified Process; os esquemas RDF dos domínios são criados durante a fase de elaboração de cada aplicação a partir do modelo de classes do domínio inicialmente representado através da UML; as aplicações Web para cada domínio são desenvolvidas com características das chamadas Semantic Web Applications, utilizando as especificações RDF para mostrar a interoperabilidade entre elas. Por fim, o trabalho é concluído com algumas análises e comentários acerca dos resultados alcançados com o estudo de caso, e são feitas algumas sugestões para subsidiar trabalhos futuros na área. 
/ This work is a study of the applicability of the Resource Description Framework (RDF) to the interoperability of data between different domains. First, the theoretical context of the theme is given, followed by a case study on the development of Web applications that are interoperable via RDF. The specific objective is to present a strategy for applying RDF, demonstrating it through a case study. In this work, the interoperability of two domains (Classified Ads and Notarial Services) is demonstrated through web applications. Some features of this work are as follows. The development process of the solution is based on RUP (Rational Unified Process). The RDF schemas of the domains are created during the elaboration phase of each application from the domain class model initially represented in UML. The applications for each domain are built with features of Semantic Web Applications, applying RDF specifications, to promote the interoperability between them. In conclusion, some final analyses and comments are made about the results of the case study, and suggestions for future research are presented. Ciência da Computação SGML HTML XML XSLT Interoperabilidade na Web Tecnologia RDF Domínios na web Web Interoperability RDF Technology Web domains
90	Detection of glass in RDF using NIR spectroscopy Hedlund, Philip January 2018 (has links) The purpose of this study was to investigate the possibility of using near-infrared (NIR) spectroscopy to detect glass in refuse-derived fuel (RDF), and to examine what on-line data on glass content could be used for in terms of boiler operation and performance determination. Samples were prepared from dried RDF (to prevent mass loss due to moisture and spectroscopic disturbance) and increasing concentrations of colored soda-lime glass, for a total of 100 samples. Glass was randomly scattered among the RDF by shaking the added glass and RDF in a bucket to generate samples representative of real-life conditions. NIR spectra were acquired between 12000 and 4000 cm⁻¹, at a resolution of 8 cm⁻¹ and averaged over 32 scans. Boiler performance was determined in accordance with Swedish standards for acceptance testing, and heat loss due to glass was treated as slag. The resulting performance calculations showed boiler efficiency via the indirect method matching the efficiency calculated via the direct method (deviating by at most 2 %), which validates the summarized losses (including those due to glass). The heat loss due to glass was calculated to be 0.068 MW per % glass, which equated to an average of 0.16 MW for 2.37 % glass. Total heat loss amounted to an average of 11.53 MW. The developed models were not satisfactory in their quality of regression prediction. Although some showed, after pre-processing, a good development of explained variance with an increasing number of factors, they still had a “Not Applicable” coefficient of determination for regression prediction. The poor quality of the models can be explained by poor glass detection (poor representation) by the spectroscopic instrument, due to a combination of glass being randomly scattered in the background material and sometimes covered by RDF, and the NIR light beam hitting only a small area. 
By increasing the number of samples to between 300 and 500, the effect of the random scatter of glass can be mitigated and acceptable models could be obtained. / FUDIPO NIR spectroscopy RDF glass CFB boiler performance losses NIR spektroskopi RDF glas CFB panna prestanda förluster Energy Engineering Energiteknik
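The heat-loss figures quoted in entry 90 can be checked with a few lines of arithmetic. The sketch below multiplies the reported loss per percent of glass by the reported average glass content and relates the result to the reported total heat loss. The linear scaling and the constant and function names are assumptions for illustration; only the three numeric values come from the abstract.

```python
# Values quoted in the abstract (decimal commas written as decimal points).
HEAT_LOSS_PER_PERCENT_GLASS_MW = 0.068
AVERAGE_GLASS_CONTENT_PERCENT = 2.37
TOTAL_HEAT_LOSS_MW = 11.53


def glass_heat_loss_mw(glass_percent, loss_per_percent=HEAT_LOSS_PER_PERCENT_GLASS_MW):
    """Heat loss attributed to glass, assuming it scales linearly with glass content."""
    return glass_percent * loss_per_percent


loss = glass_heat_loss_mw(AVERAGE_GLASS_CONTENT_PERCENT)
share_of_total = loss / TOTAL_HEAT_LOSS_MW
print(f"glass heat loss: {loss:.2f} MW ({share_of_total:.1%} of total heat loss)")
```

Under these assumptions the product rounds to the 0.16 MW stated in the abstract, which is a small fraction of the 11.53 MW total.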
