171

Ontology-Driven Self-Organization of Politically Engaged Social Groups

Belák, Václav January 2009 (has links)
This thesis deals with the use of knowledge technologies to support the self-organization of people with shared political goals. It first provides a theoretical background for the development of a social-semantic system intended to support self-organization, and then applies this background to the development of a core ontology and algorithms supporting the self-organization of people. It also presents the design and implementation of a proof-of-concept social-semantic web application built to test the research. The application stores all data in an RDF store and represents them using the core ontology. Descriptions of content are disambiguated using the WordNet thesaurus. Emerging politically engaged groups can establish themselves as local political initiatives, NGOs, or even new political parties. The system may therefore help people participate more easily in solving the issues that affect them.
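As a rough illustration of the pipeline this abstract describes, the sketch below stores a tagged description in an RDF store and resolves a tag against WordNet. It is a minimal sketch under stated assumptions, not the thesis implementation: the ontology namespace, property names, and the rdflib/NLTK stack are all placeholders.

```python
# Minimal sketch: store a user-contributed description in an RDF store
# and disambiguate a tag via WordNet. The CORE namespace and properties
# are hypothetical stand-ins for the thesis's core ontology.
from rdflib import Graph, Literal, Namespace, URIRef
from nltk.corpus import wordnet  # first run: nltk.download("wordnet")

CORE = Namespace("http://example.org/core#")  # hypothetical namespace

g = Graph()
issue = URIRef("http://example.org/issues/noise-pollution")
g.add((issue, CORE.description, Literal("Night-time noise near the station")))

# Disambiguate the tag "noise" by taking a WordNet sense; a real system
# would pick the sense from context rather than the first one.
sense = wordnet.synsets("noise")[0]
g.add((issue, CORE.senseId, Literal(sense.name())))  # e.g. 'noise.n.01'

print(g.serialize(format="turtle"))
```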
172

Podpora sémantiky v CMS Drupal / Support of Semantics in CMS Drupal

Kubaliak, Lukáš January 2011 (has links)
The work concerns the support of semantics in well-known content management systems. It describes the possible uses of these technologies and their public accessibility. We find that today's technologies and methods are still in the process of public adoption. Regarding semantic support in CMS Drupal, we developed a tool that extends its support for semantic formats. This tool allows CMS Drupal to export its information in the Topic Maps format, using an XTM file.
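To make the export target concrete, the following sketch emits a minimal XTM (XML Topic Maps) file for one topic, roughly the kind of output the described Drupal tool produces. The element structure follows XTM 2.0 conventions and the topic data is invented; this is not the module's actual code.

```python
# Minimal sketch: serialize one topic as an XTM 2.0 file, the format the
# Drupal extension exports. The topic content is invented example data.
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/"
ET.register_namespace("", XTM_NS)

# One topic representing a Drupal node; id and name are invented.
topic_map = ET.Element(f"{{{XTM_NS}}}topicMap", version="2.0")
topic = ET.SubElement(topic_map, f"{{{XTM_NS}}}topic", id="drupal-article-42")
name = ET.SubElement(topic, f"{{{XTM_NS}}}name")
ET.SubElement(name, f"{{{XTM_NS}}}value").text = "Semantics in Drupal"

ET.ElementTree(topic_map).write("export.xtm", xml_declaration=True,
                                encoding="utf-8")
```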
173

Hromadná extrakce dat veřejné správy do RDF / Bulk extraction of public administration data to RDF

Pomykacz, Michal January 2013 (has links)
The purpose of this work was to extract data from various formats (HTML, XML, XLS) and transform them for further processing. Czech public contracts and the related code lists and classifications were used as data sources. The main goal was to implement periodic data extraction, RDF transformation, and publication of the output as Linked Data through a SPARQL endpoint. Extraction modules had to be designed and implemented for the UnifiedViews tool, which was used for the periodic extractions. The theoretical section of this thesis explains the principles of Linked Data and the key tools used for data extraction and manipulation. The practical section deals with the design and implementation of the extractors; the part describing their implementation shows methods for parsing data in the various dataset formats and transforming it to RDF. The conclusion assesses the success of each extractor implementation, along with thoughts on its usability in the real world.
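A minimal sketch of the transformation step described above: one extracted contract record is turned into RDF triples with rdflib. The EX vocabulary and the record are invented placeholders; the real pipeline ran as UnifiedViews extraction modules.

```python
# Minimal sketch: turn one extracted public-contract record into RDF.
# The EX vocabulary and the record itself are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

EX = Namespace("http://example.org/contracts#")

record = {"id": "2013-00123", "title": "Road maintenance", "price": "250000"}

g = Graph()
contract = URIRef(f"http://example.org/contracts/{record['id']}")
g.add((contract, EX.title, Literal(record["title"])))
g.add((contract, EX.price, Literal(record["price"], datatype=XSD.decimal)))

# The serialized output could then be loaded into a triple store behind
# a SPARQL endpoint for Linked Data publishing.
print(g.serialize(format="nt"))
```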
174

RDF binario para una publicación, intercambio y consumo escalable en la web de datos / Binary RDF for Scalable Publication, Exchange and Consumption in the Web of Data

Fernández García, Javier David January 2014 (has links)
Doctor of Sciences, specialization in Computing / The current data deluge is flooding the Web with large volumes of data represented in RDF, giving rise to the so-called Web of Data. Open, interlinked data about bioinformatics, geography, or social networks, among other domains, are now published as part of projects as active as Linked Open Data. Several research areas have emerged from this deluge: RDF indexing and querying (typically via the SPARQL language), reasoning, publication schemes, ontology alignment, RDF visualization, and so on. The Semantic Web topics related to RDF are, in fact, trending topics at almost any computer science conference. However, three important facts can be discerned from the current state of the art: (i) applications and research have been built on RDF data, but no work has yet provided an understanding of the essence of this data model; (ii) classical RDF representations remain influenced by the traditional document-based view of the Web, resulting in verbose, redundant, and still human-centered syntaxes; which leads to (iii) poor and diffuse publication, complex and inefficient processing, and a lack of the scalability needed to develop the Web of Data to its full extent. In this thesis we first propose an in-depth study of the challenges involved in gaining a global understanding of the real structure of RDF datasets. Such a study can lead to better dataset designs and to better, more efficient RDF data structures, indexes, and compressors. We then present our binary RDF representation, HDT, which addresses the efficient representation of large volumes of RDF data through structures optimized for storage and network transmission. HDT effectively represents an RDF dataset by splitting it into three components: the Header, the Dictionary, and the Triples structure. Next, we focus on providing efficient structures for both the dictionary and the triples, since they are part of HDT but also of most applications over large volumes of RDF data. To this end, we study and propose new techniques for compressed yet highly functional RDF dictionaries and triple indexes. Finally, we propose a compact configuration for browsing and querying datasets encoded in HDT. This structure preserves the compact nature of the representation while allowing direct access to any piece of data.
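The Dictionary/Triples split at the core of HDT can be illustrated in a few lines of Python: terms are mapped to integer IDs once, and triples are then stored as compact ID tuples. This is a conceptual sketch only; real HDT adds the Header component, bit-sequence encodings, and index structures.

```python
# Conceptual sketch of HDT's Dictionary + Triples split: every RDF term
# is assigned an integer ID, and triples become small integer tuples.
triples = [
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Alice", "foaf:name", '"Alice"'),
]

dictionary: dict[str, int] = {}          # term -> ID
def term_id(term: str) -> int:
    return dictionary.setdefault(term, len(dictionary) + 1)

encoded = [(term_id(s), term_id(p), term_id(o)) for s, p, o in triples]
inverse = {i: t for t, i in dictionary.items()}  # ID -> term, for decoding

print(encoded)                                        # e.g. [(1, 2, 3), (1, 4, 5)]
print([tuple(inverse[i] for i in t) for t in encoded])  # round-trip check
```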
175

Modeling space-time activities and places for a smart space — a semantic approach

Fan, Junchuan 01 August 2017 (has links)
The rapid advancement of information and communication technologies (ICT) has dramatically changed the way people conduct daily activities. One of the reasons for such advances is the pervasiveness of location-aware devices and people's ability to publish and receive information about their surrounding environment. The organization, integration, and analysis of this crowdsensed geographic information is an important task for GIScience research, especially for better understanding place characteristics as well as human activities and movement dynamics in different spaces. In this dissertation research, a semantic modeling and analytic framework based on semantic web technologies is designed to handle information related to human space-time activities (e.g., information about human activities, movement, and surrounding places) for a smart space. A domain ontology for space-time activities and places is developed that captures the essential entities in a spatial domain and the relationships among them. Based on the developed domain ontology, a Resource Description Framework (RDF) data model is proposed that integrates the spatial, temporal, and semantic dimensions of space-time activities and places. Three different types of scheduled space-time activities (SXTF, SFTX, SXTX) and their potential spatiotemporal interactions are formalized with OWL and SWRL rules. Using a university campus as an example spatial domain, an RDF knowledge base is created that integrates scheduled course activities and tweet activities in the campus area. Human movement dynamics for the campus area are analyzed from spatial, temporal, and people's perspectives using a semantic query approach. The ontological knowledge in the RDF knowledge base is further fused with place-affordance knowledge learned by training a deep learning model on place review data. The integration of place-affordance knowledge with people's intended activities allows the semantic analytic framework to make more personalized location recommendations for people's daily activities.
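As an illustration of integrating spatial, temporal, and semantic dimensions in a single RDF data model, the sketch below describes one scheduled campus activity. The STA vocabulary is a hypothetical stand-in for the dissertation's domain ontology, and none of its OWL/SWRL rules are reproduced here.

```python
# Minimal sketch: one scheduled activity with spatial, temporal, and
# semantic properties in RDF. STA is a hypothetical vocabulary.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

STA = Namespace("http://example.org/sta#")

g = Graph()
lecture = URIRef("http://example.org/activities/gis-lecture-1")
g.add((lecture, RDF.type, STA.ScheduledActivity))
g.add((lecture, STA.takesPlaceAt, URIRef("http://example.org/places/hall-101")))
g.add((lecture, STA.startsAt,
       Literal("2017-04-03T09:00:00", datatype=XSD.dateTime)))
g.add((lecture, STA.activityType, Literal("lecture")))

# A semantic query over such a knowledge base can then ask, e.g., which
# activities co-occur at a place within a given time window.
print(g.serialize(format="turtle"))
```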
176

Identifying, Relating, Consisting and Querying Large Heterogeneous RDF Sources

Valdestilhas, Andre 12 January 2021 (has links)
The Linked Data concept relies on a collection of best practices for publishing and linking structured web-based data. The number of available datasets has grown significantly over the last decades; these datasets are interconnected and now form the well-known Web of Data, an extensive collection of concise and detailed interlinked datasets from multiple domains. Linking entries across heterogeneous data sources such as databases or knowledge bases thus becomes an increasing challenge, even though connections between datasets play a leading role in significant activities such as cross-ontology question answering, large-scale inference, and data integration. In Linked Data, Linksets are the established means of expressing links between datasets. The heterogeneity of the datasets is reflected in their structure, which makes it a hard task to find relations among them, i.e., to identify how similar they are. Linked Data therefore involves Datasets and Linksets, and those Linksets need to be maintained. This lack of information leads to the issues addressed in this thesis: how to identify and query datasets in a huge heterogeneous collection of RDF (Resource Description Framework) datasets. To address this, we need to assure consistency and to know how the datasets are related and how similar they are. To deal with the need for identifying LOD (Linked Open Data) datasets, we created an approach called WIMU: a regularly updated database index of more than 660K datasets from LODStats and LOD Laundromat, offered as an efficient, low-cost, and scalable web service that shows which dataset most likely defines a URI, together with various statistics of the indexed datasets. To integrate and query LOD datasets, we provide a hybrid SPARQL query-processing engine that can retrieve results from 559 active SPARQL endpoints (with a total of 163.23 billion triples) and 668,166 datasets (with a total of 58.49 billion triples) from LODStats and LOD Laundromat. To assure the consistency of the semantic web link repositories where these LOD datasets are located, we created an approach for mitigating the identifier-heterogeneity problem, implemented a prototype in which users can evaluate existing links and suggest new links to be rated, and designed a time-efficient algorithm for detecting erroneous links in large-scale link repositories without computing all the closures required by the property axiom. To know how the datasets are related and how similar they are, we provide a string-similarity algorithm called Most Frequent K Characters (see the sketch below), based on two nested filters, (1) a first frequency filter and (2) a hash-intersection filter, which allow candidates to be discarded before the actual similarity value is calculated, giving a considerable performance gain and allowing us to build a LOD Dataset Relation Index that provides information about how similar all the datasets in the LOD cloud are, including statistics about their current state. The work in this thesis showed that to identify and query LOD datasets, we need to know how those datasets are related, while assuring consistency.
Our analysis demonstrated that most of the datasets are disconnected from the others and need to pass through a consistency and linking process before they can be integrated, which then provides a way to query a large number of datasets simultaneously. This is a considerable step towards fully queryable LOD datasets, and the work in this thesis is an essential step towards identifying, relating, and querying datasets on the Web of Data.

Table of contents:
1 Introduction and Motivation
  1.1 The need for identifying and querying LOD datasets
  1.2 The need for consistency of semantic web linked repositories
  1.3 The need for relation and integration of LOD datasets
  1.4 Research questions and contributions
  1.5 Methodology and contributions
  1.6 General use cases
    1.6.1 The Heloise project
  1.7 Chapter overview
2 Preliminaries
  2.1 Semantic Web
    2.1.1 URIs and URLs
    2.1.2 Linked Data
    2.1.3 Resource Description Framework
    2.1.4 Ontologies
  2.2 RDF graph
  2.3 Transitive property
  2.4 Equivalence
  2.5 Linkset
  2.6 RDF graph partitioning
  2.7 Basic graph pattern
  2.8 RDF dataset
  2.9 SPARQL
  2.10 Federated queries
3 State of the Art
  3.1 Identifying datasets in large heterogeneous RDF sources
  3.2 Relating large amounts of RDF datasets
    3.2.1 Obtaining similar resources using string similarity
  3.3 Consistency on large amounts of RDF sources
    3.3.1 Heterogeneity in DBpedia identifiers
    3.3.2 Detection of erroneous links in large-scale RDF datasets
  3.4 Querying large heterogeneous RDF datasets
4 Relation Among Large Amounts of RDF Sources
  4.1 Identifying datasets in large heterogeneous RDF sources
    4.1.1 The WIMU approach
    4.1.2 The approach
    4.1.3 Use cases
    4.1.4 Evaluation: statistics about the datasets
  4.2 Relating RDF sources
    4.2.1 The ReLOD approach
    4.2.2 The approach
    4.2.3 Evaluation
  4.3 Relating similar resources using string similarity
    4.3.1 The MFKC approach
    4.3.2 Approach
    4.3.3 Correctness and completeness
    4.3.4 Evaluation
5 Consistency in Large Amounts of RDF Sources
  5.1 Consistency in heterogeneous DBpedia identifiers
    5.1.1 The DBpediaSameAs approach
    5.1.2 Representation of the idea
    5.1.3 The work-flow
    5.1.4 Methodology
    5.1.5 Evaluation
    5.1.6 Normalization on DBpedia URIs
    5.1.7 Rate the links
    5.1.8 Results
    5.1.9 Discussion
  5.2 Consistency in large-scale RDF sources: detection of erroneous links
    5.2.1 The CEDAL approach
    5.2.2 Method
    5.2.3 Error types and quality measure for linkset repositories
    5.2.4 Evaluation
    5.2.5 Experimental setup
  5.3 Detecting erroneous link candidates in educational link repositories
    5.3.1 The CEDAL education approach
    5.3.2 Research questions
    5.3.3 Our contributions
    5.3.4 Evaluation
6 Querying Large Amounts of Heterogeneous RDF Datasets
  6.1 Introduction
  6.2 Definitions
  6.3 The WimuQ
  7.1 Identifying datasets in large heterogeneous RDF sources
  7.2 Relating large amounts of RDF datasets
  7.3 Obtaining similar resources using string similarity
  7.4 Heterogeneity in DBpedia identifiers
  7.5 Detection of erroneous links in large-scale RDF datasets
  7.7 Querying large heterogeneous RDF datasets
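To make the Most Frequent K Characters (MFKC) similarity mentioned in the abstract concrete, here is a small sketch of the measure together with the early-discard filtering it enables. It is an illustrative reconstruction from the description above, not the thesis code; the scoring and normalization choices are assumptions.

```python
# Sketch of an MFKC-style similarity: compare only the k most frequent
# characters of each string, and skip full scoring when a cheap filter
# already rules the pair out. Scoring details are assumptions.
from collections import Counter

def top_k(s: str, k: int) -> dict[str, int]:
    """Keep only the k most frequent characters with their counts."""
    return dict(Counter(s).most_common(k))

def mfkc_similarity(a: str, b: str, k: int = 2) -> float:
    fa, fb = top_k(a, k), top_k(b, k)
    shared = set(fa) & set(fb)      # cheap hash-intersection filter
    if not shared:                  # discard the pair before full scoring
        return 0.0
    overlap = sum(min(fa[c], fb[c]) for c in shared)
    return overlap / max(len(a), len(b))  # normalization is an assumption

print(mfkc_similarity("research", "seeking"))  # 0.25: only 'e' is shared
```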
177

Quit diff: calculating the delta between RDF datasets under version control

Arndt, Natanael, Radtke, Norman 23 June 2017 (has links)
Distributed actors working on a common RDF dataset regularly face the problem of comparing the status of one graph with another, or more generally of synchronizing copies of a dataset. A versioning system helps to synchronize the copies of a dataset; combined with a difference-calculation system, it also becomes possible to compare versions in a log and to determine in which version a certain statement was introduced or removed. In this demo we present Quit Diff, a tool for comparing versions of a Git-versioned quad store that is also applicable to simple unversioned RDF datasets. We follow an approach that abstracts from differences on the syntactical level to differences on the level of the RDF data model, while leaving further semantic interpretation on the schema and instance level to specialized applications. Quit Diff can generate patches in various output formats and can be integrated directly into the distributed version control system Git, which provides a foundation for a comprehensive co-evolution workflow on RDF datasets.
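A data-model-level diff, as opposed to a textual one, can be sketched with rdflib's comparison utilities: triples are compared as parsed statements, so purely syntactic reordering produces an empty delta. This only illustrates the concept; Quit Diff itself additionally integrates with Git and emits patches in several formats.

```python
# Sketch: diff two RDF graphs at the data-model level, ignoring syntax.
from rdflib import Graph
from rdflib.compare import to_isomorphic, graph_diff

old = Graph().parse(data='<urn:a> <urn:p> "1" . <urn:a> <urn:q> "2" .',
                    format="turtle")
new = Graph().parse(data='<urn:a> <urn:q> "2" . <urn:a> <urn:p> "3" .',
                    format="turtle")

# graph_diff returns (triples in both, only in first, only in second).
in_both, removed, added = graph_diff(to_isomorphic(old), to_isomorphic(new))
print("removed:", list(removed))  # <urn:a> <urn:p> "1"
print("added:  ", list(added))    # <urn:a> <urn:p> "3"
```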
178

Aspekte der Kommunikation und Datenintegration in semantischen Daten-Wikis / Aspects of Communication and Data Integration in Semantic Data Wikis

Frischmuth, Philipp 20 October 2017 (has links)
The Semantic Web, an extension of the original World Wide Web by a semantic layer, can greatly simplify the integration of information from different data sources. With RDF and the SPARQL query language, standards have been established that enable a uniform representation of structured information and make it queryable. With Linked Data, this information is made available over a uniform protocol, and a web of data emerges in place of a web of documents. This thesis examines and analyzes aspects of data integration based on such semantic technologies. Building on this, a system is specified and implemented that realizes the results of these investigations in a concrete application. OntoWiki, a semantic data wiki, serves as the basis for the implementation.
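The integration idea, a uniform RDF representation queried with SPARQL across merged sources, can be sketched generically as follows. This is not OntoWiki code; the data and the query are invented.

```python
# Sketch: integrate two RDF sources by merging their graphs, then query
# the combined data with SPARQL. Data and vocabulary are invented.
from rdflib import Graph

source_a = '@prefix ex: <urn:ex:> . ex:alice ex:worksAt ex:uni-leipzig .'
source_b = '@prefix ex: <urn:ex:> . ex:uni-leipzig ex:locatedIn ex:leipzig .'

merged = Graph()
merged.parse(data=source_a, format="turtle")
merged.parse(data=source_b, format="turtle")

# A join across both sources: where does Alice's employer reside?
q = """PREFIX ex: <urn:ex:>
       SELECT ?city WHERE { ex:alice ex:worksAt ?org .
                            ?org ex:locatedIn ?city . }"""
for row in merged.query(q):
    print(row.city)  # urn:ex:leipzig
```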
179

EAGLE - learning of link specifications using genetic programming

Lyko, Klaus 13 February 2018 (has links)
On the way to the Linked Data Web, efficient and semi-automatic approaches for generating links between several data sources are needed. Many common Link Discovery frameworks require a user to specify a link specification before starting the linking process. While time-efficient approaches for executing those link specifications have been developed over the last years, the discovery of accurate link specifications remains a non-trivial problem. In this thesis, we present EAGLE, a machine-learning approach to link specifications. The overall goal behind EAGLE is to limit the labeling effort for the user while generating highly accurate link specifications. To achieve this goal, we rely on the algorithms implemented in the LIMES framework and enhance them with both batch and active learning mechanisms based on genetic programming techniques. We compare batch and active learning and evaluate our approach on several real-world datasets from different domains. We show that we can discover link specifications with F-measures comparable to other approaches while relying on a smaller number of labeled instances and requiring significantly less execution time.
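The sketch below is a toy version of the genetic-programming idea: a population of candidate link specifications, reduced here to a single similarity threshold, is evolved against labeled example pairs. EAGLE's real search space covers full specification trees over LIMES similarity measures, so everything below is a simplified assumption.

```python
# Toy sketch of evolving a link specification by genetic programming.
# A "specification" here is just a threshold on a string similarity.
import random
from difflib import SequenceMatcher

labeled = [  # (source, target, is_link) training pairs, invented
    ("Leipzig Univ.", "Leipzig University", True),
    ("Berlin", "Dresden", False),
    ("J. Smith", "John Smith", True),
]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def fitness(threshold: float) -> int:
    """Number of training pairs classified correctly by this threshold."""
    return sum((similarity(a, b) >= threshold) == y for a, b, y in labeled)

population = [random.random() for _ in range(20)]
for _ in range(30):  # generations: select the fittest, mutate survivors
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]
    mutants = [min(1.0, max(0.0, t + random.gauss(0, 0.1))) for t in survivors]
    population = survivors + mutants

best = max(population, key=fitness)
print(f"best threshold {best:.2f}, fitness {fitness(best)}/{len(labeled)}")
```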
180

Generierung von SPARQL Anfragen / Generation of SPARQL Queries

Cherix, Didier 19 February 2018 (has links)
The Semantic Web is one of the greatest current challenges in computer science. Ontologies are used to assign a semantic value to data. Ontologies define and manage concepts; concepts describe objects, have properties and, more importantly here, relations to one another. These concepts and relations are characterized by means of a specification (OWL, for example). Instances are assigned to these concepts: for example, the instance "Albert Einstein" is connected with the concept "physicist". To find out what connects "Albert Einstein" with the city of Berlin, query languages are available, the best known being SPARQL. Without prior knowledge of an ontology's structure, it is not possible to formulate precise queries. The only way to find out what connects two instances is to use a SPARQL query with placeholders, i.e., to perform a query at the instance level. Solving an instance-level query without knowing in advance how the instances may be linked requires considerable effort: all relations involving the first instance must be followed, and the process continued from the instances reached in this way until the right instance is found. The instance level, also called the A-Box, is the level of the actual elements; it contains, for example, "Berlin is the capital of Germany". First, "Berlin" must be recognized as an instance of the right concept, in this case as an instance of the concept "city". Tracing an instance back to a specific concept is considered solved in this work.
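A minimal sketch of the placeholder-query idea the abstract describes: generate SPARQL queries that probe for a connecting path of increasing length between two instances. The DBpedia IRIs are used for illustration only, and the query shape is an assumption.

```python
# Sketch: generate SPARQL queries with placeholder variables that look
# for a connecting path of length 1..max_len between two instances.
def connection_queries(a: str, b: str, max_len: int = 2) -> list[str]:
    queries = []
    for n in range(1, max_len + 1):
        nodes = [f"<{a}>"] + [f"?n{i}" for i in range(1, n)] + [f"<{b}>"]
        pattern = " ".join(f"{nodes[i]} ?p{i} {nodes[i + 1]} ."
                           for i in range(n))
        queries.append(f"SELECT * WHERE {{ {pattern} }} LIMIT 10")
    return queries

# Example: what connects Albert Einstein with Berlin?
for q in connection_queries("http://dbpedia.org/resource/Albert_Einstein",
                            "http://dbpedia.org/resource/Berlin"):
    print(q)
```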
