51

Exploiting Alignments in Linked Data for Compression and Query Answering

Joshi, Amit Krishna 06 June 2017
No description available.
52

A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data

Perry, Matthew Steven 02 September 2008
No description available.
53

La recommandation des jeux de données basée sur le profilage pour le liage des données RDF / Profile-based Dataset Recommendation for RDF Data Linking

Ben Ellefi, Mohamed 01 December 2016
With the emergence of the Web of Data, most notably Linked Open Data (LOD), an abundance of data has become available on the web. However, LOD datasets and their inherent subgraphs vary heavily with respect to their size, topic and domain coverage, schemas, and the dynamicity of their data (and, respectively, of their schemas and metadata) over time. To this extent, identifying suitable datasets that meet specific criteria has become an increasingly important yet challenging task in support of issues such as entity retrieval, semantic search, and data linking. Particularly with respect to the interlinking issue, the current topology of the LOD cloud underlines the need for practical and efficient means to recommend suitable datasets: currently, only well-known reference graphs such as DBpedia (the most obvious target), YAGO, or Freebase show a high number of in-links, while there exists a long tail of potentially suitable yet under-recognized datasets. This problem is due to the Semantic Web tradition in dealing with "finding candidate datasets to link to", where data publishers are used to identifying target datasets for interlinking.
While an understanding of the nature of the content of specific datasets is a crucial prerequisite for the mentioned issues, we adopt in this dissertation the notion of "dataset profile": a set of features that describe a dataset and allow the comparison of different datasets with regard to their represented characteristics. Our first research direction was to implement a collaborative filtering-like dataset recommendation approach, which exploits both existing dataset topic profiles and traditional dataset connectivity measures, in order to link LOD datasets into a global dataset-topic graph. This approach relies on the LOD graph in order to learn the connectivity behaviour between LOD datasets. However, experiments have shown that the current topology of the LOD cloud is far from complete enough to be considered a ground truth and, consequently, learning data.
Facing the limits of the current topology of LOD (as learning data), our research led us to break away from the topic-profile representation and the "learn to rank" approach and to adopt a new approach for candidate dataset identification, in which the recommendation is based on the overlap of intensional profiles between different datasets. By intensional profile, we understand the formal representation of a set of schema concept labels that best describe a dataset and that can potentially be enriched by retrieving the corresponding textual descriptions. This representation provides richer contextual and semantic information and allows similarities between profiles to be computed efficiently and inexpensively. We identify schema overlap with the help of a semantico-frequential concept similarity measure and a ranking criterion based on the tf*idf cosine similarity. The experiments, conducted over all available linked datasets on the LOD cloud, show that our method achieves an average precision of up to 53% for a recall of 100%. Furthermore, our method returns the mappings between the schema concepts across datasets, a particularly useful input for the data linking step.
In order to ensure high-quality, representative dataset schema profiles, we introduce Datavore, a tool oriented towards metadata designers that provides ranked lists of vocabulary terms to reuse in the data modeling process, together with additional metadata and cross-term relations. The tool relies on the Linked Open Vocabulary (LOV) ecosystem for acquiring vocabularies and metadata and is made available for the community.
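As a rough illustration of the ranking criterion mentioned in the abstract above, the following Python sketch ranks candidate datasets by tf*idf cosine similarity between label profiles. The dataset names, the labels, and the smoothed idf weighting are assumptions made for illustration only; the thesis's semantico-frequential concept similarity measure is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical intensional profiles: schema concept labels per dataset.
profiles = {
    "source":  ["person", "author", "publication", "publication"],
    "biblio":  ["author", "publication", "journal"],
    "geodata": ["city", "country", "river"],
}

def tfidf_vectors(docs):
    """Return one {term: tf*idf weight} dict per profile."""
    n = len(docs)
    # Document frequency: in how many profiles each label occurs.
    df = Counter(term for labels in docs.values() for term in set(labels))
    vectors = {}
    for name, labels in docs.items():
        tf = Counter(labels)
        # Smoothed idf (an assumption) keeps widely shared labels above zero.
        vectors[name] = {
            t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(source, docs):
    """Rank every other profile by similarity to the source profile."""
    vectors = tfidf_vectors(docs)
    scores = {
        name: cosine(vectors[source], vec)
        for name, vec in vectors.items()
        if name != source
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_candidates("source", profiles)
```

In this toy input, the profile that shares labels with the source ranks first, and a profile with no shared labels scores zero.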
54

Scalable Preservation, Reconstruction, and Querying of Databases in terms of Semantic Web Representations

Stefanova, Silvia January 2013
This thesis addresses how Semantic Web representations, in particular RDF, can enable flexible and scalable preservation, recreation, and querying of databases. An approach has been developed for selective, scalable, long-term archival of relational databases (RDBs) as RDF, implemented in the SAQ (Semantic Archive and Query) system. The archival of user-specified parts of an RDB is specified using an extension of SPARQL, A-SPARQL. SAQ automatically generates an RDF view of the RDB, the RD-view. The result of an archival query is a set of RDF triples stored in: i) a data archive file containing the preserved RDB content, and ii) a schema archive file containing sufficient metadata to reconstruct the archived database. To achieve scalable data preservation and recreation, SAQ uses special query rewriting optimizations for the archival queries. It was experimentally shown that they improve query execution and archival time compared with naïve processing. The performance of SAQ was compared with that of other systems supporting SPARQL queries to views of existing RDBs. When an archived RDB is to be recreated, the reloader module of SAQ first reads the schema archive file and executes a schema reconstruction algorithm to automatically construct the RDB schema. The RDB thus created is populated by reading the data archive and converting the read data into relational attribute values. For scalable recreation of RDF archived data, we have developed the Triple Bulk Load (TBL) approach, where the relational data is reconstructed by using the bulk load facility of the RDBMS. Our experiments show that the TBL approach is substantially faster than the naïve Insert Attribute Value (IAV) approach, despite the added sorting and post-processing. To view and query semi-structured Topic Maps data as RDF, the prototype system TM-Viewer was implemented. 
A declarative RDF view of Topic Maps, the TM-view, is automatically generated by the TM-Viewer using a conceptual schema developed for the Topic Maps data model. To achieve efficient processing of SPARQL queries to the TM-view, query rewrite transformations were developed and evaluated. It was shown that they significantly improve the query execution time. / eSSENCE
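The archive-and-recreate idea in the abstract above can be pictured with a minimal Python sketch: table rows are turned into triples, and rows are later rebuilt by sorting the triples by subject and grouping them. The URI scheme, the table, and the function names are hypothetical; this is not SAQ's RD-view or its Triple Bulk Load implementation.

```python
from itertools import groupby

BASE = "http://example.org/"  # hypothetical namespace

def rows_to_triples(table, key, rows):
    """Emit one (subject, predicate, value) triple per column of each row."""
    triples = []
    for row in rows:
        subject = f"{BASE}{table}/{row[key]}"
        for column, value in row.items():
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples

def triples_to_rows(triples):
    """Sort triples by subject, then rebuild one row per subject."""
    rows = []
    ordered = sorted(triples, key=lambda t: t[0])
    for _, group in groupby(ordered, key=lambda t: t[0]):
        # The column name is the fragment after '#' in the predicate URI.
        rows.append({pred.rsplit("#", 1)[1]: value for _, pred, value in group})
    return rows

employees = [
    {"id": 1, "name": "Ada", "dept": "R&D"},
    {"id": 2, "name": "Bo", "dept": "Sales"},
]
archive = rows_to_triples("employee", "id", employees)
restored = triples_to_rows(archive)
```

Grouping sorted triples yields whole rows at once, which is the kind of input a bulk loader expects, as opposed to inserting one attribute value at a time.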
55

Semantic Web Queries over Scientific Data

Andrejev, Andrej January 2016
Semantic Web and Linked Open Data provide a potential platform for interoperability of scientific data, offering a flexible model for providing machine-readable and queryable metadata. However, RDF and SPARQL gained limited adoption within the scientific community, mainly due to the lack of support for managing massive numeric data, along with certain other important features – such as extensibility with user-defined functions, query modularity, and integration with existing environments and workflows. We present the design, implementation and evaluation of Scientific SPARQL – a language for querying data and metadata combined, represented using the RDF graph model extended with numeric multidimensional arrays as node values – RDF with Arrays. The techniques used to store RDF with Arrays in a scalable way and process Scientific SPARQL queries and updates are implemented in our prototype software – Scientific SPARQL Database Manager, SSDM, and its integrations with data storage systems and computational frameworks. This includes scalable storage solutions for numeric multidimensional arrays and an efficient implementation of array operations. The arrays can be physically stored in a variety of external storage systems, including files, relational databases, and specialized array data stores, using our Array Storage Extensibility Interface. Whenever possible SSDM accumulates array operations and accesses array contents in a lazy fashion. In scientific applications numeric computations are often used for filtering or post-processing the retrieved data, which can be expressed in a functional way. Scientific SPARQL allows expressing common query sub-tasks with functions defined as parameterized queries. This becomes especially useful along with functional language abstractions such as lexical closures and second-order functions, e.g. array mappers. Existing computational libraries can be interfaced and invoked from Scientific SPARQL queries as foreign functions. 
Cost estimates and alternative evaluation directions may be specified, aiding the construction of better execution plans. Costly array processing, e.g. filtering and aggregation, is thus performed on the server, reducing the amount of communication. Furthermore, common supported operations are delegated to the array storage back-ends, according to their capabilities. Both the expressivity and the performance of Scientific SPARQL are evaluated on a real-world example, and further performance tests are run using our mini-benchmark for array queries.
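The lazy accumulation of array operations described in the abstract above can be sketched in Python as follows. The class names and the one-dimensional, list-backed storage are assumptions for illustration (non-negative indices and positive steps only); this is not SSDM's Array Storage Extensibility Interface.

```python
class ListBackend:
    """Hypothetical storage back-end that counts how many elements it serves."""

    def __init__(self, data):
        self.data = data
        self.elements_read = 0

    def read(self, start, stop, step):
        chunk = self.data[start:stop:step]
        self.elements_read += len(chunk)
        return chunk


class LazyArray:
    """One-dimensional array proxy that records slicing instead of reading data."""

    def __init__(self, backend, start, stop, step=1):
        self.backend = backend
        self.start, self.stop, self.step = start, stop, step

    def slice(self, start, stop, step=1):
        # Compose the new slice with the pending one; no data is touched yet.
        new_start = self.start + start * self.step
        new_stop = min(self.stop, self.start + stop * self.step)
        return LazyArray(self.backend, new_start, new_stop, self.step * step)

    def materialize(self):
        # Only now is the back-end asked for the elements actually needed.
        return self.backend.read(self.start, self.stop, self.step)


backend = ListBackend(list(range(1000)))
view = LazyArray(backend, 0, 1000).slice(100, 200).slice(0, 100, 10)
values = view.materialize()
```

Here two slicing steps are folded into a single read, so the back-end serves only the ten requested elements rather than the whole array.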
56

Semantic Assistance for Data Utilization and Curation

Becker, Brian J 06 August 2013
We propose that most data stores for large organizations are ill-designed for the future, due to the limited searchability of their databases. The Semantic Web has been an emerging technology since it was first proposed by Berners-Lee. New vocabularies have emerged, such as the FOAF, Dublin Core, and PROV-O ontologies. Combined, these vocabularies can relate people, places, things, and events. Technologies developed for the Semantic Web, namely the standardized vocabularies for expressing metadata, will make data easier to utilize. We gathered use cases for various data sources, from human resources to big enterprise. Most of our use cases reflect real-world data. We developed a software package for transforming data into these semantic vocabularies, and developed a method of querying via graphical constructs. Development and testing showed the approach to be useful. We conclude that data can be preserved or revived through the use of the metadata techniques for the Semantic Web.
57

Scalable Discovery and Analytics on Web Linked Data

Abdelaziz, Ibrahim 07 1900
The Resource Description Framework (RDF) provides a simple way of expressing facts across the web, leading to Web linked data. Several distributed and federated RDF systems have emerged to handle the massive amounts of RDF data available nowadays. Distributed systems are optimized to query massive datasets that appear as a single graph, while federated systems are designed to query hundreds of decentralized and interlinked graphs. This thesis starts with a comprehensive experimental study of the state-of-the-art RDF systems. It identifies a set of research problems for improving the state of the art, including: supporting the emerging RDF analytics required by many modern applications, querying linked data at scale, and enabling discovery on linked data. Addressing these problems is the focus of this thesis. First, we propose Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to seamlessly combine generic graph algorithms with SPARQL queries. Spartex implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. We demonstrate that Spartex scales to datasets with billions of edges, and is at least as fast as the state-of-the-art specialized RDF engines. For analytical tasks, Spartex is an order of magnitude faster than existing alternatives. To address the scalability limitation of federated RDF engines, we propose Lusail, a scalable system for querying geo-distributed RDF graphs. Lusail follows a two-tier strategy: (i) locality-aware decomposition of the query into subqueries to maximize the computations at the endpoints and minimize intermediary results, and (ii) selectivity-aware execution to reduce network latency and increase parallelism. Our experiments on billions of triples show that Lusail outperforms existing systems by orders of magnitude in scalability and response time. 
Finally, enabling discovery on linked data is challenging due to the prior knowledge required to formulate SPARQL queries. To address these challenges, we develop novel techniques to (i) predict semantically equivalent SPARQL queries from a set of keywords by leveraging word embeddings, and (ii) generate fine-grained and non-blocking query plans to get fast and early results.
58

Podpora sémantiky v CMS Drupal / Semantic support in CMS Drupal

Ivančo, Daniel January 2012
The aim of this diploma thesis is to map the semantic features of CMS Drupal version 7. The first part of the work gives a theoretical description of Semantic Web issues and of CMS Drupal. The second, practical part maps in detail all the Semantic Web features supported by CMS Drupal. These features are mapped from two points of view: implementation and functionality. The main contribution of this work is the method used to map these features, which is based on modifying and reviewing the code of Drupal plugins in order to expose or demonstrate features that are not necessarily completely documented or functional. Furthermore, all of these features are demonstrated on examples created as part of this thesis. Finally, the last part of the work compares the mapped features with those of similar CMS systems.
59

Digitale Archivsysteme - Erfahrungen und Perspektiven / Digital Archive Systems - Experiences and Perspectives

Hübner, Uwe, Thümer, Ingrid, Ziegler, Christoph 09 June 2000
In 1995, work began on building the Multimedia Online Archiv Chemnitz (MONARCH). Current developments include the use of RDF for metadata and of digital signatures. Long-term durability is secured technically through migration concepts. XML will be added to the existing document formats; DTD alternatives are under discussion. The application aspects considered include the handling of aggregated documents and dealing with the problem of plagiarism.
60

Contaminant Hydrogeology Knowledge Base (CHKb) of Georgia, USA

Sarajlic, Semir 18 December 2013
Hydrogeologists collect data through studies that rely on a diverse and growing set of instruments that measure, for example, geochemical constituents of surface water and groundwater. Databases store and publish the collected data on the Web, and the volume of data is increasing quickly, which makes accessing the data problematic and time-consuming for individuals. One way to overcome this problem is to develop an ontology that formally and explicitly represents the domain knowledge (e.g., contaminant hydrogeology). Using OWL and RDF, a contaminant hydrogeology ontology (CHO) is developed to manage hydrological spatial data for Georgia, USA. CHO is a conceptual computer model of the contaminant hydrogeology domain in which concepts (e.g., contaminant, aquifer) and their relationships (e.g., pollutes) are formally and explicitly defined. A cyberinfrastructure for exposing CHO and the datasets (i.e., CHKb) as Linked Data on the Web is developed. The cyberinfrastructure covers storing, managing, querying, and visualizing CHKb, which can be accessed at the URL cho.gsu.edu.
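To make the idea of formally defined concepts and relationships concrete, here is a small Python sketch that stores a few triples and queries them. The concept and relation names echo the abstract's examples, but the instance data (arsenic, aquifer_A), the plain-tuple representation, and the simple type check are all assumptions for illustration; they do not reproduce CHO or OWL semantics.

```python
# Hypothetical triples; only the concept and relation names come from the abstract.
triples = {
    ("Contaminant", "type", "Class"),
    ("Aquifer", "type", "Class"),
    ("pollutes", "domain", "Contaminant"),
    ("pollutes", "range", "Aquifer"),
    ("arsenic", "type", "Contaminant"),
    ("aquifer_A", "type", "Aquifer"),
    ("arsenic", "pollutes", "aquifer_A"),
}

def match(pattern, store):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )

def well_typed(store, prop):
    """Check that every use of prop agrees with its declared domain and range."""
    domain = match((prop, "domain", None), store)[0][2]
    range_ = match((prop, "range", None), store)[0][2]
    return all(
        (s, "type", domain) in store and (o, "type", range_) in store
        for s, _, o in match((None, prop, None), store)
    )

# Which contaminants pollute aquifer_A?
polluters = [s for s, _, _ in match((None, "pollutes", "aquifer_A"), triples)]
```

Because the relationship is declared explicitly, a question such as "what pollutes this aquifer?" becomes a pattern match over the triples rather than a search through free text.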
