61

Linked Data Quality Assessment and its Application to Societal Progress Measurement

Zaveri, Amrapali 17 April 2015 (has links)
In recent years, the Linked Data (LD) paradigm has emerged as a simple mechanism for employing the Web as a medium for data and knowledge integration in which both documents and data are linked. Moreover, the semantics and structure of the underlying data are kept intact, making this the Semantic Web. LD essentially entails a set of best practices for publishing and connecting structured data on the Web, which allows information to be published and exchanged in an interoperable and reusable fashion. Many different communities on the Internet, such as geographic, media, life sciences and government, have already adopted these LD principles. This is confirmed by the dramatically growing Linked Data Web, in which more than 50 billion facts are currently represented. With the emergence of the Web of Linked Data, several use cases have become possible thanks to the rich and disparate data integrated into one global information space. Linked Data, in these cases, not only assists in building mashups by interlinking heterogeneous and dispersed data from multiple sources but also empowers the uncovering of meaningful and impactful relationships. These discoveries have paved the way for scientists to explore the existing data and uncover meaningful outcomes that they might not have been aware of previously. In all these use cases utilizing LD, one crippling problem is the underlying data quality. Incomplete, inconsistent or inaccurate data gravely affects the end results, making them unreliable. Data quality is commonly conceived as fitness for use, be it for a certain application or use case. Datasets that contain quality problems may still be useful for certain applications, depending on the use case at hand. Thus, LD consumption has to deal with the problem of getting the data into a state in which it can be exploited for real use cases. Insufficient data quality can be caused by the LD publication process or can be intrinsic to the data source itself. A key challenge is to assess the quality of datasets published on the Web and to make this quality information explicit. Assessing data quality is a particular challenge in LD because the underlying data stems from a set of multiple, autonomous and evolving data sources. Moreover, the dynamic nature of LD makes quality assessment crucial for measuring how accurately the data represents the real world. On the document Web, data quality can only be indirectly or vaguely defined, but there is a need for more concrete and measurable data quality metrics for LD. Such data quality metrics include correctness of facts with respect to the real world, adequacy of semantic representation, quality of interlinks, interoperability, timeliness or consistency with regard to implicit information. Even though data quality is an important concept in LD, few methodologies have been proposed to assess the quality of these datasets. Thus, in this thesis, we first unify 18 data quality dimensions and provide a total of 69 metrics for the assessment of LD. The first methodology employs LD experts for the assessment. This assessment is performed with the help of the TripleCheckMate tool, which was developed specifically to assist LD experts in assessing the quality of a dataset, in this case DBpedia. The second methodology is a semi-automatic process, in which the first phase involves the detection of common quality problems through the automatic creation of an extended schema for DBpedia. The second phase involves the manual verification of the generated schema axioms. Thereafter, we employ the wisdom of the crowd, i.e. workers on online crowdsourcing platforms such as Amazon Mechanical Turk (MTurk), to assess the quality of DBpedia. We then compare the two approaches (the previous assessment by LD experts and the assessment by MTurk workers in this study) in order to measure the feasibility of each type of user-driven data quality assessment methodology. Additionally, we evaluate another semi-automated methodology for LD quality assessment, which also involves human judgement. In this semi-automated methodology, selected metrics are formally defined and implemented as part of a tool, namely R2RLint. The user is provided not only with the results of the assessment but also with the specific entities that cause the errors, which helps users understand the quality issues and fix them. Finally, we consider a domain-specific use case that consumes LD and depends on data quality. In particular, we identify four LD sources, assess their quality using the R2RLint tool and then utilize them in building the Health Economic Research (HER) Observatory. We show the advantages of this semi-automated assessment over the other types of quality assessment methodologies discussed earlier. The Observatory aims at evaluating the impact of research development on the economic and healthcare performance of each country per year. We illustrate the usefulness of LD in this use case and the importance of quality assessment for any data analysis.
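As a brief illustration of what an automated quality metric of this kind can look like, the following Python sketch computes a simple label-completeness score over a small RDF graph with rdflib. The metric, namespace and sample data are illustrative assumptions and are not taken from TripleCheckMate or R2RLint.

```python
# Minimal sketch of an automated Linked Data quality check over a small local
# RDF graph; the metric and sample data are illustrative assumptions.
from rdflib import Graph, RDF, RDFS

g = Graph()
g.parse(data="""
    @prefix ex: <http://example.org/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    ex:Berlin a ex:City ; rdfs:label "Berlin" .
    ex:Leipzig a ex:City .
""", format="turtle")

# Completeness-style metric: fraction of typed resources that carry an rdfs:label.
typed = set(g.subjects(RDF.type, None))
labeled = {s for s in typed if (s, RDFS.label, None) in g}
completeness = len(labeled) / len(typed) if typed else 1.0
print(f"label completeness: {completeness:.2f}")              # 0.50 for the sample data
print("unlabeled resources:", [str(s) for s in typed - labeled])
```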
62

Data Fusion in Spatial Data Infrastructures

Wiemann, Stefan 28 October 2016 (has links)
Over the past decade, public awareness and availability of spatial data on the Web, as well as methods for their creation and use, have steadily increased. Besides the establishment of governmental Spatial Data Infrastructures (SDIs), numerous volunteered and commercial initiatives have had a major impact on that development. Nevertheless, data isolation still poses a major challenge. Whereas the majority of approaches focus on data provision, means to dynamically link and combine spatial data from distributed, often heterogeneous data sources in an ad hoc manner are still very limited. Such capabilities are, however, essential to support and enhance information retrieval for comprehensive spatial decision making. To facilitate spatial data fusion in current SDIs, this thesis has two main objectives. First, it focuses on the conceptualization of a service-based fusion process to functionally extend current SDIs and to allow for the combination of spatial data from different spatial data services. It mainly addresses the decomposition of the fusion process into well-defined and reusable functional building blocks and their implementation as services, which can be used to dynamically compose meaningful application-specific processing workflows. Moreover, geoprocessing patterns, i.e. service chains that are commonly used to solve certain fusion subtasks, are designed to simplify and automate workflow composition. Second, the thesis deals with the determination, description and exploitation of spatial data relations, which play a decisive role in spatial data fusion. The approach adopted is based on the Linked Data paradigm and therefore bridges SDI and Semantic Web developments. Whereas the original spatial data remains within SDI structures, relations between those sources can be used to infer spatial information by means of Semantic Web standards and software tools. A number of use cases were developed, implemented and evaluated to underpin the proposed concepts. Particular emphasis was put on the use of established open standards to realize an interoperable, transparent and extensible spatial data fusion process and to support the formalized description of spatial data relations. The developed software, which is based on a modular architecture, is available online as open source. It allows for the development and seamless integration of new functionality as well as the use of external data and processing services during workflow composition on the Web.
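The sketch below illustrates, under simplifying assumptions, how one reusable fusion building block might derive a relation between features served by two different spatial data services and publish it as Linked Data; the URIs, predicate and distance threshold are hypothetical and do not come from the thesis software.

```python
# Hedged sketch of one fusion building block: deriving and publishing a relation
# between two features from different (hypothetical) spatial data services.
from math import radians, sin, cos, asin, sqrt
from rdflib import Graph, URIRef, Namespace

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two WGS84 points in kilometres."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

REL = Namespace("http://example.org/relations#")           # hypothetical vocabulary
feature_a = URIRef("http://sdi-a.example.org/features/1")   # e.g. served by provider A
feature_b = URIRef("http://sdi-b.example.org/features/42")  # e.g. served by provider B

g = Graph()
if haversine_km(13.40, 52.52, 13.41, 52.52) < 1.0:          # features closer than 1 km
    g.add((feature_a, REL.nearTo, feature_b))               # publish the relation as Linked Data

print(g.serialize(format="turtle"))
```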
63

Automating Geospatial RDF Dataset Integration and Enrichment / Automatische geografische RDF Datensatzintegration und Anreicherung

Sherif, Mohamed Ahmed Mohamed 12 December 2016 (has links) (PDF)
Over the last years, the Linked Open Data (LOD) cloud has evolved from a mere 12 to more than 10,000 knowledge bases. These knowledge bases come from diverse domains including, but not limited to, publications, life sciences, social networking, government, media and linguistics. Moreover, the LOD cloud also contains a large number of cross-domain knowledge bases such as DBpedia and Yago2. These knowledge bases are commonly managed in a decentralized fashion and contain partly overlapping information. This architectural choice has led to knowledge pertaining to the same domain being published by independent entities in the LOD cloud. For example, information on drugs can be found in Diseasome as well as DBpedia and Drugbank. Furthermore, certain knowledge bases such as DBLP have been published by several bodies, which in turn has led to duplicated content in the LOD cloud. In addition, large amounts of geo-spatial information have been made available with the growth of the heterogeneous Web of Data. The concurrent publication of knowledge bases containing related information promises to become a phenomenon of increasing importance as the number of independent data providers grows. Enabling the joint use of the knowledge bases published by these providers for tasks such as federated queries, cross-ontology question answering and data integration is most commonly tackled by creating links between the resources described within these knowledge bases. Within this thesis, we spur the transition from isolated knowledge bases to enriched Linked Data sets where information can be easily integrated and processed. To achieve this goal, we provide concepts, approaches and use cases that facilitate the integration and enrichment of information with other data types that are already present on the Linked Data Web, with a focus on geo-spatial data. The first challenge that motivates our work is the lack of measures that use geographic data for linking geo-spatial knowledge bases. This is partly due to geo-spatial resources being described by means of vector geometries. In particular, discrepancies in granularity and error measurements across knowledge bases render the selection of appropriate distance measures for geo-spatial resources difficult. We address this challenge by evaluating the existing literature for point-set measures that can be used to compute the similarity of vector geometries. We then present ten measures derived from the literature and evaluate them on samples of three real knowledge bases. The second challenge we address in this thesis is the lack of automatic Link Discovery (LD) approaches capable of dealing with geo-spatial knowledge bases with missing and erroneous data. To this end, we present Colibri, an unsupervised approach that allows discovering links between knowledge bases while improving the quality of the instance data in these knowledge bases. A Colibri iteration begins by generating links between knowledge bases. The approach then makes use of these links to detect resources with probably erroneous or missing information, which is finally corrected or added. The third challenge we address is the lack of scalable LD approaches for tackling big geo-spatial knowledge bases. Thus, we present Deterministic Particle-Swarm Optimization (DPSO), a novel load-balancing technique for LD on parallel hardware based on particle-swarm optimization. We combine this approach with the Orchid algorithm for geo-spatial linking and evaluate it on real and artificial data sets. The lack of approaches for automatically updating the links of an evolving knowledge base is our fourth challenge. This challenge is addressed in this thesis by the Wombat algorithm. Wombat is a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. Wombat is based on generalisation via an upward refinement operator to traverse the space of Link Specifications (LS). We study the theoretical characteristics of Wombat and evaluate it on different benchmark data sets. The last challenge addressed herein is the lack of automatic approaches for geo-spatial knowledge base enrichment. Thus, we propose Deer, a supervised learning approach based on a refinement operator for enriching Resource Description Framework (RDF) data sets. We show how exemplary descriptions of enriched resources can be used to generate accurate enrichment pipelines. We evaluate our approach against manually defined enrichment pipelines and show that it can learn accurate pipelines even when provided with a small number of training examples. Each of the proposed approaches is implemented and evaluated against state-of-the-art approaches on real and/or artificial data sets. Moreover, all approaches have been peer-reviewed and published in conference or journal papers. Throughout this thesis, we detail the ideas, implementation and evaluation of each of the approaches, discuss each approach and present lessons learned. Finally, we conclude the thesis by presenting a set of possible future extensions and use cases for each of the proposed approaches.
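As a small illustration of the kind of point-set measure discussed above, the following Python sketch computes the discrete Hausdorff distance between the vertex sets of two simplified geometries and uses an arbitrary threshold to flag a link candidate; it is a textbook measure, not the thesis' actual implementation.

```python
# Hedged sketch of a point-set distance between two vector geometries: the
# discrete Hausdorff distance over polygon vertices. Sample geometries and the
# linking threshold are illustrative assumptions.
from math import hypot

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)

def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two simplified footprints of the same building from different (hypothetical) knowledge bases.
geom_kb1 = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
geom_kb2 = [(0.1, 0.0), (0.1, 1.1), (1.0, 1.0), (1.1, 0.0)]

if hausdorff(geom_kb1, geom_kb2) < 0.2:   # illustrative linking threshold
    print("candidate owl:sameAs link between the two resources")
```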
64

Konzeption eines RDF-Vokabulars für die Darstellung von COUNTER-Nutzungsstatistiken: innerhalb des Electronic Resource Management Systems der Universitätsbibliothek Leipzig

Domin, Annika 04 July 2014 (has links)
This master's thesis documents the creation of an RDF-based vocabulary for representing usage statistics of electronic resources that are produced according to the COUNTER standard. The concrete application of this vocabulary is the Electronic Resource Management System (ERMS) currently being developed by Leipzig University Library within the collaborative AMSL project. The system is based on Linked Data, is intended to model the changed management processes for electronic resources, and at the same time is to remain vendor-independent and flexible. The COUNTER vocabulary, however, is also meant to be usable beyond this application. The thesis is divided into two parts, foundations and modelling. The first part establishes the library-related need for ERM systems and then narrows the focus to the subarea of usage statistics and the COUNTER standardisation. Subsequently, the technical foundations of the modelling are introduced in order to make the thesis accessible to readers who are not familiar with Linked Data. The modelling part follows, beginning with a requirements analysis and an analysis of the XML schema underlying the COUNTER files. The vocabulary is then modelled using RDFS and OWL. Building on considerations regarding the transfer of XML statistics to RDF and the assignment of URIs, real sample files are subsequently converted manually and successfully verified in a short test. The thesis concludes with a summary and an outlook on the further use of the results. The resulting RDF vocabulary is available for reuse on GitHub at: https://github.com/a-nnika/counter.vocab
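To make the modelling idea concrete, the following Python sketch expresses a single COUNTER-style usage figure as RDF with rdflib; the class and property names are hypothetical placeholders and are not the actual terms defined in counter.vocab.

```python
# Hedged sketch: expressing one COUNTER-style usage figure as RDF. The
# vocabulary terms below are illustrative placeholders, not counter.vocab terms.
from rdflib import Graph, Literal, Namespace, RDF, XSD

CNT = Namespace("http://example.org/counter#")   # hypothetical vocabulary namespace
EX = Namespace("http://example.org/data/")

g = Graph()
report_item = EX["jr1-2014-01-journalX"]
g.add((report_item, RDF.type, CNT.UsageReportItem))
g.add((report_item, CNT.journalTitle, Literal("Journal X")))
g.add((report_item, CNT.period, Literal("2014-01", datatype=XSD.gYearMonth)))
g.add((report_item, CNT.fullTextRequests, Literal(42, datatype=XSD.integer)))

print(g.serialize(format="turtle"))
```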
65

A semantic framework for social search / Un cadre de développement sémantique pour la recherche sociale

Stan, Johann 09 November 2011 (has links)
This thesis presents a system for extracting the interactions shared on social networks and building a dynamic expertise profile for each member of the network. The main difficulty in this part is the analysis of these interactions, which are often very short and have little grammatical or linguistic structure. The approach we put in place links the important terms of these messages to concepts in a semantic knowledge base of the Linked Data type. This connection makes it possible to enrich the semantic field of the messages by exploiting the semantic neighbourhood of the concept in the knowledge base. Our first contribution in this context is an algorithm that performs this linking with higher precision than the state of the art by considering the user's profile, as well as the messages shared in the community the user belongs to, as an additional source of context. The second step of the analysis performs the semantic expansion of the concept by exploiting the links in the knowledge base. Our algorithm uses a heuristic based on computing the similarity between concept descriptions in order to keep only those concepts most relevant to the user's profile. Together, these two algorithms yield a set of concepts that characterise the user's areas of expertise. To measure the user's degree of expertise for each concept in the profile, we apply the standard vector space model and associate with each concept a measure composed of three elements: (i) tf-idf, (ii) the average sentiment the user expresses towards the concept and (iii) the average entropy of the shared messages containing the concept. Combining the three measures yields a single weight associated with each concept of the profile. This vector profile model makes it possible to find the top-k profiles most relevant to a query. To propagate these weights to the concepts obtained through semantic expansion, we apply a constrained spreading activation algorithm specifically adapted to the structure of a semantic graph. The application built to demonstrate the effectiveness of our approach and to illustrate the recommendation strategy is an online system named "The Tagging Beak" (http://www.tbeak.com). We developed a Q&A (question-answer) recommendation strategy in which users can ask questions in natural language and the system recommends people to contact, or to connect with, in order to be notified of new messages relevant to the topic of the question. / In recent years, online collaborative environments, e.g. social content sites such as Twitter or Facebook, have significantly changed the way people share information and interact with peers. These platforms have become the primary common environment for people to communicate about their activity and their information needs and to maintain and create social ties. Status updates, or microposts, have emerged as a convenient way for people to share content frequently without a long investment of time. Some social platforms even limit the length of a "post". A post generally consists of a single sentence (e.g. a news item or a question); it can include a picture, a hyperlink, tags or other descriptive data (metadata). Contrary to traditional documents, posts are informal (with no controlled vocabulary) and do not have a well-established structure. Social platforms can become so popular, with a huge number of users and posts, that it becomes difficult to find relevant information in the flow of notifications. Therefore, organizing this huge quantity of social information is one of the major challenges of such collaborative environments. Traditional information retrieval techniques are not well suited for querying such a corpus because of the short size of the shared content, the uncontrolled vocabulary used by authors, and because these techniques do not take into consideration the ties between people. Such techniques also tend to find the documents that best match a query, which may not be sufficient in the context of social platforms, where the creation of new connections has a motivating impact and where the platform tries to sustain ongoing participation. A new information retrieval paradigm, social search, has been introduced as a potential solution to this problem. This solution consists of different strategies to leverage user-generated content for information seeking, such as the recommendation of people. However, existing strategies have limitations in the user profile construction process and in the routing of queries to the right people identified as experts. More concretely, the majority of user profiles in such systems are keyword-based, which is not suited to the small size and the informal character of posts. Second, expertise is measured only with statistical scoring mechanisms, which do not take into account the fact that people on social platforms will not simply consume the results of the query but will aim to engage in a conversation with the expert. A particular focus is also needed on privacy management, where traditional methods initially designed for databases are still used without taking the social ties between people into account. In this thesis we propose and evaluate an original framework for the organization and retrieval of information in social platforms. Instead of retrieving content that best matches a user query, we retrieve people who have expertise in, and are most motivated to engage in conversations on, its topics. We propose to dynamically build profiles for users based on their interactions in the social platform. The construction of such profiles requires the capture of interactions (microposts), their analysis, and the extraction and understanding of their topics. In order to build a more meaningful profile, we leverage Semantic Web technologies, and more specifically Linked Data, for the transformation of micropost topics into semantic concepts. Our thesis contributes to several fields related to the organization, management and retrieval of information in collaborative environments and to the fields of social computing and human-computer interaction.
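The following Python sketch illustrates the kind of profile weighting and ranking described above: tf-idf, average sentiment and average entropy are combined into a per-concept weight, and profiles are ranked against a query by cosine similarity. The way the three components are combined (a simple product) and the sample numbers are assumptions made for illustration.

```python
# Hedged sketch of the per-concept expertise weight described above; the
# product used to combine the three components is an illustrative assumption.
from math import log, sqrt

def tf_idf(tf, df, n_docs):
    return tf * log(n_docs / (1 + df))

def concept_weight(tf, df, n_docs, avg_sentiment, avg_entropy):
    # avg_sentiment and avg_entropy assumed normalised to [0, 1]
    return tf_idf(tf, df, n_docs) * avg_sentiment * avg_entropy

def cosine(profile_a, profile_b):
    """Cosine similarity between two {concept: weight} profiles."""
    common = set(profile_a) & set(profile_b)
    dot = sum(profile_a[c] * profile_b[c] for c in common)
    norm = sqrt(sum(w * w for w in profile_a.values())) * sqrt(sum(w * w for w in profile_b.values()))
    return dot / norm if norm else 0.0

expert = {"dbpedia:Semantic_Web": concept_weight(12, 40, 1000, 0.8, 0.7),
          "dbpedia:Linked_data": concept_weight(7, 25, 1000, 0.6, 0.9)}
query = {"dbpedia:Linked_data": 1.0}
print(f"match score: {cosine(expert, query):.3f}")
```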
66

Metadatenbasierte Kontextualisierung architektonischer 3D-Modelle

Blümel, Ina 18 December 2013 (has links)
Digital 3D models in architecture have, over the last five decades, gradually replaced analogue paper-based drawings as well as physical scale models in their role of supporting planning, execution and documentation. The main challenges for integrating 3D models into digital libraries and archives are the usually only rudimentary annotation with metadata provided by the authors and the fact that much of the information is only implicitly contained in the models. These deficits have recently led to a strong interest in content-based indexing, either by networked user groups or by automated methods that enable, for example, the automatic categorization of 3D models according to given schemes based on shape or structural features. The partially automated recognition of model-inherent semantics increases the number of discrete and semantically distinguishable entities. 3D models, as content on the World Wide Web, can be interlinked with each other as well as with other textual and non-textual objects, and can thus be part of aggregated documents. These aggregations, the model context and the inherent entities require instruments of organization in order to offer the user added value when searching for information, especially when information about a model and its context is searched for in a text-based manner. In this work, a metadata model is developed for the targeted structuring of information obtained from 3D architectural models. By means of this structuring, the model can be linked with further information. The application of established ontologies and the use of URIs make the information not only explicit but also carry semantic information about the relation itself, so that interoperability with other available data is ensured in line with the basic principles of the Linked Data approach.
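A minimal rdflib sketch of such metadata-based contextualisation is given below: a 3D model and one of its inherent entities are described with established vocabulary terms and linked to an external URI. The vocabulary namespace, properties and URIs are illustrative assumptions rather than the metadata model developed in the thesis.

```python
# Hedged sketch: describing a 3D architectural model and a model-inherent
# entity, and linking the entity to an external URI. Vocabulary and URIs are
# illustrative placeholders.
from rdflib import Graph, URIRef, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF, RDFS

ARCH = Namespace("http://example.org/arch#")       # hypothetical vocabulary
model = URIRef("http://example.org/models/library-hall")
window = URIRef("http://example.org/models/library-hall#window-12")

g = Graph()
g.add((model, RDF.type, ARCH.BuildingModel))
g.add((model, DCTERMS.creator, Literal("Jane Architect")))
g.add((model, ARCH.containsComponent, window))               # model-inherent entity
g.add((window, RDF.type, ARCH.Window))
g.add((window, RDFS.seeAlso, URIRef("http://dbpedia.org/resource/Window")))  # external context

print(g.serialize(format="turtle"))
```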
67

Querying a Web of Linked Data

Hartig, Olaf 28 July 2014 (has links)
During recent years, a set of best practices for publishing and connecting structured data on the World Wide Web (WWW) has emerged. These best practices are referred to as the Linked Data principles, and the resulting form of Web data is called Linked Data. The increasing adoption of these principles has led to the creation of a globally distributed space of Linked Data that covers various domains such as government, libraries, life sciences and media. Approaches that conceive this data space as a huge distributed database and enable the execution of declarative queries over this database hold enormous potential; they allow users to benefit from a virtually unbounded set of up-to-date data. As a consequence, several research groups have started to study such approaches. However, the main focus of existing work is to address the practical challenges that arise in this context; research on the foundations of such approaches is largely missing. This dissertation closes this gap.
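The following Python sketch conveys the intuition behind querying this data space by traversing links at run time: a seed URI is dereferenced, its triples are collected, and selected links are followed up to a small bound. It is a simplified illustration of link traversal over Linked Data, not the formal query execution model developed in the dissertation; the seed URI and predicate are merely examples.

```python
# Hedged sketch of link-traversal-style data collection over Linked Data.
from rdflib import Graph, URIRef

def traverse(seed, predicate, max_uris=5):
    """Collect triples reachable from `seed`, following `predicate` links."""
    data, frontier, seen = Graph(), [seed], set()
    while frontier and len(seen) < max_uris:
        uri = frontier.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        try:
            data.parse(uri)                      # dereference the URI (content negotiation)
        except Exception:
            continue                             # skip unreachable or non-RDF documents
        frontier.extend(o for o in data.objects(uri, predicate) if isinstance(o, URIRef))
    return data

g = traverse(URIRef("http://dbpedia.org/resource/Leipzig"),
             URIRef("http://dbpedia.org/ontology/country"))
print(len(g), "triples discovered")
```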
68

Linked Enterprise Data als semantischer, integrierter Informationsraum für die industrielle Datenhaltung / Linked Enterprise Data as semantic and integrated information space for industrial data

Graube, Markus 01 June 2018 (has links) (PDF)
Increasing collaboration in production networks and increased flexibility in planning and production processes are the necessary responses to the increased demands on industry regarding agility and the introduction of value-added services. This requires a stronger digitalisation of all processes and a deeper connection to the information resources of partners. However, today's information systems are not able to meet the requirements of such an integrated, distributed information space. A promising candidate is Linked Data, which originates from the Semantic Web area. Based on this approach, Linked Enterprise Data was developed, which extends the existing tools and processes so that an information space emerges that is usable and flexible for industry. The core idea is to raise information from specialised legacy tools to a semantic level, link it directly on the data level, even across organizational boundaries, and make it securely available for queries. Industrial requirements are met by providing the revision control tool R43ples, the integration with OPC UA via OPCUA2LD, the connection to industrial systems (for example, COMOS), a means of model transformation with SPARQL, and fine-grained information protection for a SPARQL endpoint.
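As an illustration of model transformation with SPARQL, the sketch below uses an rdflib CONSTRUCT query to lift plant data from one vocabulary into another. The vocabularies, property names and sample data are hypothetical and are not taken from the thesis, COMOS or OPC UA.

```python
# Hedged sketch of a SPARQL-based model transformation: a CONSTRUCT query lifts
# data from one (hypothetical) vocabulary into another.
from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix src: <http://example.org/source#> .
    src:Pump01 src:tagName "P-101" ; src:locatedIn src:UnitA .
""", format="turtle")

transformed = g.query("""
    PREFIX src: <http://example.org/source#>
    PREFIX tgt: <http://example.org/target#>
    CONSTRUCT {
        ?equipment a tgt:Equipment ;
                   tgt:label ?tag ;
                   tgt:partOf ?unit .
    }
    WHERE {
        ?equipment src:tagName ?tag ;
                   src:locatedIn ?unit .
    }
""").graph

print(transformed.serialize(format="turtle"))
```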
69

[en] RDXEL: A TOOLKIT FOR RDF STATISTICAL DATA MANIPULATION THROUGH SPREADSHEETS / [pt] RDXEL: UM CONJUNTO DE FERRAMENTAS PARA MANIPULAÇÃO DE DADOS ESTATÍSTICOS EM RDF POR MEIO DE PLANILHAS

MARCIA LUCAS PESCE 03 May 2016 (has links)
[en] Statistical data represent one of the most important sources of information for humans and organizations alike. However, accessing, querying and correlating statistical data demand a great deal of effort, especially in situations that involve different organizations. Therefore, solutions that facilitate the manipulation and integration of large statistical databases add value to this scenario. In this dissertation we propose a framework that allows statistical data to be efficiently processed and represented as RDF triples. Based on the DataCube Vocabulary, the W3C standard for this kind of triplification, the proposed solution makes it easy to query, analyze and reuse statistical data in RDF format. The reverse process, RDF to Excel, is also supported, so as to offer a solution for the integration and consumption of RDF data from spreadsheets.
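A hedged sketch of the spreadsheet-to-RDF direction is shown below: tabular rows are mapped to observations of the W3C RDF Data Cube vocabulary using rdflib. The dimension and measure properties and the sample data are illustrative placeholders, not RDXEL's actual mapping rules.

```python
# Hedged sketch: mapping tabular statistical rows to RDF Data Cube observations.
from rdflib import Graph, Literal, Namespace, RDF, XSD

QB = Namespace("http://purl.org/linked-data/cube#")
EX = Namespace("http://example.org/stats/")      # hypothetical dataset namespace

rows = [("BR", 2014, 203.5), ("AR", 2014, 42.7)]  # e.g. country, year, population (millions)

g = Graph()
dataset = EX["population"]
g.add((dataset, RDF.type, QB.DataSet))
for i, (country, year, value) in enumerate(rows):
    obs = EX[f"population/obs{i}"]
    g.add((obs, RDF.type, QB.Observation))
    g.add((obs, QB.dataSet, dataset))
    g.add((obs, EX.refArea, Literal(country)))                      # illustrative dimension
    g.add((obs, EX.refPeriod, Literal(year, datatype=XSD.gYear)))   # illustrative dimension
    g.add((obs, EX.population, Literal(value, datatype=XSD.decimal)))  # illustrative measure

print(g.serialize(format="turtle"))
```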
70

Methodology for Conflict Detection and Resolution in Semantic Revision Control Systems

Hensel, Stephan, Graube, Markus, Urbas, Leon 11 November 2016 (has links) (PDF)
Revision control mechanisms are a crucial part of information systems for keeping track of changes. They are one of the key requirements for the industrial application of technologies like Linked Data, which provides the possibility to integrate data from different systems and domains in a semantic information space. A corresponding semantic revision control system must have the same functionality as established systems (e.g. Git or Subversion). There is also a need for branching to enable parallel work on the same data or concurrent access to it, which directly introduces the requirement of supporting merges. This paper presents an approach that makes it possible to merge branches and to detect inconsistencies before creating the merged revision. We use a structural analysis of triple differences as the smallest comparison unit between the branches. The detected differences can be aggregated into high-level changes, which is an essential step towards semantic merging. We implemented our approach as a prototypical extension of the revision control system R43ples to show proof of concept.
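The sketch below illustrates the general idea of a triple-level difference analysis between two branches and a simple conflict heuristic (both branches assign different objects to the same subject-predicate pair). It is a deliberately simplified illustration, not the merge algorithm implemented in R43ples.

```python
# Hedged sketch of triple-level diffing between two branches of an RDF graph
# relative to their common ancestor, with a simple conflict heuristic.
def diff(base, branch):
    """Return (added, removed) triple sets of a branch with respect to base."""
    return branch - base, base - branch

def conflicts(base, branch_a, branch_b):
    added_a, removed_a = diff(base, branch_a)
    added_b, removed_b = diff(base, branch_b)
    changed_a = {(s, p) for s, p, _ in added_a | removed_a}
    changed_b = {(s, p) for s, p, _ in added_b | removed_b}
    both = changed_a & changed_b
    # Conflict if the two branches end up with different objects for the same (s, p).
    return {(s, p) for s, p in both
            if {o for x, y, o in branch_a if (x, y) == (s, p)}
            != {o for x, y, o in branch_b if (x, y) == (s, p)}}

base     = {("ex:Pump01", "ex:status", "ok")}
branch_a = {("ex:Pump01", "ex:status", "maintenance")}
branch_b = {("ex:Pump01", "ex:status", "failed")}
print(conflicts(base, branch_a, branch_b))   # {('ex:Pump01', 'ex:status')}
```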
