  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
311

Research Ontology Data Models for Data and Metadata Exchange Repository

Kamenieva, Iryna January 2009
Research in data mining and machine learning requires access to a variety of input data sets, and researchers now build databases of such sets. Examples include the UCI Machine Learning Repository, the Data Envelopment Analysis Dataset Repository, the XMLData Repository, and the Frequent Itemset Mining Dataset Repository. Alongside these statistical repositories, researchers can draw on a whole range of storage systems, from simple file stores to specialized repositories, when solving applied tasks, studying their own algorithms, and addressing scientific problems. It might seem that the only difficulty for the user is searching such scattered stores of information and understanding their structure. However, a detailed study of these repositories reveals deeper problems in the use of data: the structure of the data files is rigid and completely mismatched with SDMX (Statistical Data and Metadata Exchange), a standard and structure used by many European organizations; data cannot be prepared in advance for a concrete applied task; and there is no history of how the data have been used for particular scientific and applied tasks. Today there are many data mining (DM) methods, as well as large quantities of data stored in various repositories. The repositories contain no DM methods and, moreover, methods are not linked to application areas. An essential problem is linking the subject domain (problem domain), the DM methods, and the datasets suitable for a given method. This work therefore considers the problem of building ontological models of DM methods, describing the interaction between methods and the corresponding data from repositories, and intelligent agents that allow the user of a statistical repository to choose the method and data appropriate to the task being solved.
This work proposes a system structure and implements an intelligent search agent that operates on the ontological model of DM methods and takes the user's personal queries into account. An agent-oriented approach was selected for implementing the intelligent data and metadata exchange repository. The model uses a service-oriented architecture. The implementation uses the cross-platform programming language Java, the multi-agent platform Jadex, the database server Oracle Spatial 10g, and Protégé version 3.4 as the development environment for ontological models.
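The link between problem domain, DM method and dataset described in this abstract can be illustrated with a very small lookup. The sketch below is written in Python for brevity, not in the Java/Jadex stack the thesis names, and every class, method name and dataset name in it is a hypothetical placeholder rather than part of the author's ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """A data mining method and the task types it addresses."""
    name: str
    tasks: frozenset


@dataclass(frozen=True)
class Dataset:
    """A repository dataset annotated with its domain and suitable tasks."""
    name: str
    domain: str
    tasks: frozenset


# Hypothetical toy "ontology": all entries are illustrative assumptions.
METHODS = [
    Method("decision tree", frozenset({"classification"})),
    Method("k-means", frozenset({"clustering"})),
    Method("Apriori", frozenset({"association rules"})),
]
DATASETS = [
    Dataset("retail-baskets", "commerce", frozenset({"association rules"})),
    Dataset("patient-records", "medicine", frozenset({"classification", "clustering"})),
]


def recommend(task, domain):
    """Return (method, dataset) name pairs that match the user's task and domain."""
    pairs = []
    for method in METHODS:
        if task not in method.tasks:
            continue
        for dataset in DATASETS:
            if dataset.domain == domain and task in dataset.tasks:
                pairs.append((method.name, dataset.name))
    return pairs
```

Calling `recommend("classification", "medicine")` pairs the decision tree with the patient-records dataset; a real agent would query an ontology store instead of in-memory lists.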
312

Rozhraní pro aspektové vyhledávání v indexu Wikipedie / Interfaces for Faceted Search in Indexed Wikipedia

Cilip, Peter January 2018
The main aim of this thesis is to study existing faceted search systems and to design a new system for faceted search over an index of Wikipedia. The thesis surveys existing faceted search solutions. Our own system, the output of this thesis, was designed to avoid the mistakes and shortcomings of those solutions. The system is described in terms of its design and implementation. The thesis delivers an application interface and a graphical interface. The application interface can be integrated into an existing information system, where it can serve as a multidimensional filter. The graphical interface demonstrates how the application interface can be used in a real system. The system was built with a focus on usefulness and simplicity, for use in existing information systems.
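The idea of a faceted search interface acting as a multidimensional filter can be sketched in a few lines. This is only an illustration under assumptions: the in-memory records, the facet names `type` and `country`, and both function names are hypothetical and do not come from the thesis.

```python
from collections import Counter

# Hypothetical mini-index; a real system would query an indexed Wikipedia dump.
ARTICLES = [
    {"title": "Brno", "type": "city", "country": "Czech Republic"},
    {"title": "Prague", "type": "city", "country": "Czech Republic"},
    {"title": "Leipzig", "type": "city", "country": "Germany"},
    {"title": "Vltava", "type": "river", "country": "Czech Republic"},
]


def filter_by_facets(items, selected):
    """Keep the items whose fields equal every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)
```

Selecting `{"type": "city", "country": "Czech Republic"}` narrows the list to Brno and Prague, and `facet_counts` supplies the per-value numbers a graphical interface would show next to each facet.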
313

Digitální knihovna / Digital Library

Krbeček, Daniel January 2008
The thesis presents basic information about the digitization of image documents and gives a brief list of standards commonly used in the Czech Republic. Institutions such as libraries, scientific departments and universities can use these standards to describe digitized documents. The thesis specifically addresses the preservation of, and access to, B. P. Moll's large map collection held in the Moravian Library in Brno. It analyses step by step the characteristics of the stored documents, the way they are interlinked, and their data representation. For storage and manipulation it surveys open-source digital libraries and selects the Fedora repository, then addresses how to implement an object model in this digital library. The functional outputs are a web presentation of the map collection and a performance test of displaying large-scale maps with the Flash-based Zoomify viewer. The web presentation uses the repository services as much as possible and thus allows searching through the bibliographic records of the presented documents. The thesis concludes by summarizing the results and outlining future directions for presenting and popularizing the map collection.
314

Studentenkonferenz Informatik Leipzig 2011: Leipzig, Deutschland, 12. Dezember 2011, Tagungsband

Auer, Sören, Riechert, Thomas, Schmidt, Johannes 18 April 2012
The Studentenkonferenz Informatik Leipzig 2011 (Leipzig Student Conference on Computer Science 2011) offers an opportunity to foster students' identification with computer science as a field of study and their enthusiasm for IT topics in general. For the conference, students submitted short articles about course projects, theses, or computer science projects carried out in their free time. Other students, doctoral candidates and research staff of Leipzig's universities reviewed and discussed the submitted work. Interesting and well-prepared submissions were accepted for presentation at the conference. This book contains the revised contributions of the student authors. A student conference hardly differs from any other scientific conference. However, the range of topics can be wider because of the breadth of subjects represented, and scientific innovation is not always the primary criterion when the work is assessed. A student conference helps make the creative potential of students more visible and helps inspire students for computer science and research. It also strengthens exchange between different disciplines within computer science and, in particular, promotes mutual understanding between teaching staff and students. This year the Studentenkonferenz Informatik Leipzig (SKIL 2011) was organized for the second time at the Institut für Angewandte Informatik (InfAI) e.V. SKIL 2011 was initiated and largely organized by the research groups Agile Knowledge Engineering and Semantic Web (AKSW) and Service Science and Technology (SeSaT) at Universität Leipzig. The conference took place on 02 December 2011 in Leipzig.
Contents:
• TriplePlace: A flexible triple store for Android with six indices (Natanael Arndt)
• Entwicklung von IR-Algorithmen zur automatischen Bewertung von Krankenversicherungstarifen (Stefan Veit)
• CoVi - a JAVA application to explore Human Disease Networks (Klaus Lyko, Victor Christen and Anastasia Chyhir)
• Realisierung eines RDF-Interfaces für die Neue Deutsche Biographie (Martin Brümmer)
• Methoden zur Aufwandsschätzung von Softwareprojekten und deren Zuverlässigkeit (Florian Pilz)
• Volumendifferenzmessung an medizinischen Oberflächenbilddaten (Henry Borasch)
• rdf2wp - Publikation von Daten als RDF mittels Wordpressblog (Johannes Frey)
• Simulating the Spread of Epidemics in Real-world Trading Networks using OpenCL (Martin Clauß)
• Entwicklung eines Managementsystems für bibliographische Einträge auf Basis von WordPress am Beispiel der Lutherbibliographie (Thomas Schöne)
• Entwicklung eines Programms zur dynamischen Nutzung freier Ressourcen von Workstations (Michael Schmidt)
• Autorenverzeichnis (index of authors)
315

Comparative study of open source and dot NET environments for ontology development.

Mahoro, Leki Jovial 05 1900
M. Tech. (Department of Information & Communication Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology. / Many studies have evaluated and compared existing open-source Semantic Web platforms for ontology development. However, none of these studies has included dot NET-based Semantic Web platforms in its empirical investigation. This study conducted a comparative analysis of open-source and dot NET-based Semantic Web platforms for ontology development. Two popular dot NET-based Semantic Web platforms, namely SemWeb.NET and dotNetRDF, were analyzed and compared against open-source environments including the Jena Application Programming Interface (API), Protégé, and RDF4J, also known as the Sesame Software Development Kit (SDK). Various metrics, such as storage mode, query support, consistency checking and interoperability with other tools, were used to compare the two categories of platforms. Five ontologies of different sizes were used in the experiments. The experimental results showed that the open-source platforms provide more facilities for creating, storing and processing ontologies than the dot NET-based tools. Furthermore, the experiments revealed that the open-source platforms Protégé and RDF4J and the dotNetRDF platform provide both a graphical user interface (GUI) and a command line interface for ontology processing, whereas the open-source Jena and SemWeb.NET are command line platforms. Moreover, the results showed that the open-source platforms can process multiple ontology file formats, including the Resource Description Framework (RDF) and Web Ontology Language (OWL) formats, whereas the dot NET-based tools only process RDF ontologies. Finally, the experimental results indicate that the dot NET-based platforms have limited memory capacity, as they failed to load and query large ontologies that the open-source environments handled.
316

Transformace webových aplikací na webové služby / Transformation of Web Applications into Web Services

Zámečník, Miroslav January 2008
The present-day web is moving toward automating user behavior in web applications. Adding semantics and creating web service interfaces are the main approaches to achieving this convenience for users. Nevertheless, this direction brings problems that can make publishing and implementing web documents more difficult. Web services can connect heterogeneous systems because they are based on the XML markup language, a common ground where all applications can meet without loss of platform independence. Automatically transforming a web application into a web service could be considerably more efficient than creating a web service from scratch. However, for some applications this step is almost infeasible without knowledge of their inner structure. In most cases, the transformation will be done semi-automatically with the help of human decisions.
317

Semi-Automatic Mapping of Structured Data to Visual Variables / Halbautomatische Abbildung von strukturierten Daten auf Visuelle Variablen

Polowinski, Jan 09 April 2013
While semantic web data is machine-understandable and well suited for advanced filtering, in its raw representation it is not easily understood by humans. Therefore, visualization is needed. A core challenge when visualizing structured but heterogeneous data turned out to be a flexible mapping to Visual Variables. This work presents a highly flexible, semi-automatic solution that gives maximum support to the visualization process by reducing the mapping possibilities from which the user has to choose to a useful subset. The basis for this is knowledge about metrics and the structure of the data on the one hand, and about available visualization structures, platforms and common graphical facts on the other hand, provided by a novel basic visualization ontology. A declarative, platform-independent mapping vocabulary and a framework were developed, using current standards from the semantic web and the Model-Driven Architecture (MDA).
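Reducing the mapping possibilities to a useful subset can be pictured as filtering candidate visual variables by what is known about the data and the target platform. The sketch below is an assumption-laden illustration: the rule table, the crude scale-of-measurement guess and all names are hypothetical and are not taken from the visualization ontology or mapping vocabulary of the thesis.

```python
# Illustrative rule table (an assumption, loosely following common
# visualization guidance); the thesis derives such knowledge from an ontology.
SUITABLE_VARIABLES = {
    "nominal": ["color hue", "shape", "position"],
    "ordinal": ["position", "lightness", "size"],
    "quantitative": ["position", "size", "lightness"],
}


def infer_scale(values):
    """Guess the scale of measurement: all numbers -> quantitative, else nominal."""
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "nominal"


def suggest_mappings(columns, platform_supports):
    """For each data property, list suitable visual variables the platform offers."""
    suggestions = {}
    for name, values in columns.items():
        scale = infer_scale(values)
        suggestions[name] = [
            variable for variable in SUITABLE_VARIABLES[scale]
            if variable in platform_supports
        ]
    return suggestions
```

The user would then pick from the short list per property, which is the semi-automatic part: the system proposes, the human decides.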
318

Integrating Natural Language Processing (NLP) and Language Resources Using Linked Data

Hellmann, Sebastian 09 January 2014
This thesis is a compendium of scientific works and engineering specifications that have been contributed to a large community of stakeholders to be copied, adapted, mixed, built upon and exploited in any way possible to achieve a common goal: Integrating Natural Language Processing (NLP) and Language Resources Using Linked Data.
The explosion of information technology in the last two decades has led to substantial growth in the quantity, diversity and complexity of web-accessible linguistic data. These resources become even more useful when linked with each other, and the last few years have seen the emergence of numerous approaches in various disciplines concerned with linguistic resources and NLP tools. It is the challenge of our time to store, interlink and exploit this wealth of data accumulated in more than half a century of computational linguistics, of empirical, corpus-based study of language, and of computational lexicography in all its heterogeneity. The vision of the Giant Global Graph (GGG) was conceived by Tim Berners-Lee, aiming at connecting all data on the Web and allowing the discovery of new relations between this openly accessible data. This vision has been pursued by the Linked Open Data (LOD) community, where the cloud of published datasets comprises 295 data repositories and more than 30 billion RDF triples (as of September 2011). RDF is based on globally unique and accessible URIs and was specifically designed to establish links between such URIs (or resources). This is captured in the Linked Data paradigm, which postulates four rules: (1) referred entities should be designated by URIs, (2) these URIs should be resolvable over HTTP, (3) data should be represented by means of standards such as RDF, and (4) a resource should include links to other resources.
Although it is difficult to precisely identify the reasons for the success of the LOD effort, advocates generally argue that open licenses as well as open access are key enablers for the growth of such a network, as they provide a strong incentive for collaboration and contribution by third parties. In his keynote at BNCOD 2011, Chris Bizer argued that with RDF the overall data integration effort can be “split between data publishers, third parties, and the data consumer”, a claim that can be substantiated by observing the evolution of many large data sets constituting the LOD cloud. As written in the acknowledgement section, parts of this thesis have received extensive feedback from other scientists, practitioners and industry in many different ways. The main contributions of this thesis are summarized here:
Part I – Introduction and Background. During his keynote at the Language Resource and Evaluation Conference in 2012, Sören Auer stressed the decentralized, collaborative, interlinked and interoperable nature of the Web of Data. The keynote provides strong evidence that Semantic Web technologies such as Linked Data are on their way to becoming mainstream for the representation of language resources. The jointly written companion publication for the keynote was later extended as a book chapter in The People’s Web Meets NLP and serves as the basis for “Introduction” and “Background”, outlining some stages of the Linked Data publication and refinement chain. Both chapters stress the importance of open licenses and open access as an enabler for collaboration and the ability to interlink data on the Web as a key feature of RDF, and provide a discussion of scalability issues and decentralization. Furthermore, we elaborate on how conceptual interoperability can be achieved by (1) re-using vocabularies, (2) agile ontology development, (3) meetings to refine and adapt ontologies and (4) tool support to enrich ontologies and match schemata.
Part II – Language Resources as Linked Data. “Linked Data in Linguistics” and “NLP & DBpedia, an Upward Knowledge Acquisition Spiral” summarize the results of the Linked Data in Linguistics (LDL) Workshop in 2012 and the NLP & DBpedia Workshop in 2013 and give a preview of the MLOD special issue. In total, five proceedings – three published at CEUR (OKCon 2011, WoLE 2012, NLP & DBpedia 2013), one Springer book (Linked Data in Linguistics, LDL 2012) and one journal special issue (Multilingual Linked Open Data, MLOD, to appear) – have been (co-)edited to create incentives for scientists to convert and publish Linked Data and thus to contribute open and/or linguistic data to the LOD cloud. Based on the disseminated call for papers, 152 authors contributed one or more accepted submissions to our venues and 120 reviewers were involved in peer-reviewing. “DBpedia as a Multilingual Language Resource” and “Leveraging the Crowdsourcing of Lexical Resources for Bootstrapping a Linguistic Linked Data Cloud” contain this thesis’ contribution to the DBpedia Project in order to further increase the size and inter-linkage of the LOD Cloud with lexical-semantic resources. Our contribution comprises extracted data from Wiktionary (an online, collaborative dictionary similar to Wikipedia) in more than four languages (now six) as well as language-specific versions of DBpedia, including a quality assessment of inter-language links between Wikipedia editions and internationalized content negotiation rules for Linked Data. In particular, this work created the foundation for a DBpedia Internationalisation Committee with members from over 15 different languages with the common goal to push DBpedia as a free and open multilingual language resource.
Part III – The NLP Interchange Format (NIF). “NIF 2.0 Core Specification”, “NIF 2.0 Resources and Architecture” and “Evaluation and Related Work” constitute one of the main contributions of this thesis.
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. The core specification describes which URI schemes and RDF vocabularies must be used for (parts of) natural language texts and annotations in order to create an RDF/OWL-based interoperability layer, with NIF built upon Unicode code points in Normal Form C. Classes and properties of the NIF Core Ontology are described to formally define the relations between text, substrings and their URI schemes. The evaluation of NIF is also included. In a questionnaire, we asked questions to 13 developers using NIF. UIMA, GATE and Stanbol are extensible NLP frameworks, and NIF was not yet able to provide off-the-shelf NLP domain ontologies for all possible domains, but only for the plugins used in this study. After inspecting the software, the developers agreed, however, that NIF is adequate to provide a generic RDF output based on NIF using literal objects for annotations. All developers were able to map the internal data structure to NIF URIs to serialize RDF output (Adequacy). The development effort in hours (ranging between 3 and 40 hours) as well as the number of code lines (ranging between 110 and 445) suggest that the implementation of NIF wrappers is easy and fast for an average developer. Furthermore, the evaluation contains a comparison to other formats and an evaluation of the available URI schemes for web annotation. In order to collect input from the wide group of stakeholders, a total of 16 presentations were given with extensive discussions and feedback, which has led to a constant improvement of NIF from 2010 until 2013.
After the release of NIF (Version 1.0) in November 2011, a total of 32 vocabulary employments and implementations for different NLP tools and converters were reported (8 by the (co-)authors, including the Wiki-link corpus, 13 by people participating in our survey and 11 more of which we have heard). Several roll-out meetings and tutorials were held (e.g. in Leipzig and Prague in 2013) and are planned (e.g. at LREC 2014).
Part IV – The NLP Interchange Format in Use. “Use Cases and Applications for NIF” and “Publication of Corpora using NIF” describe 8 concrete instances where NIF has been successfully used. One major contribution is the usage of NIF as the recommended RDF mapping in the Internationalization Tag Set (ITS) 2.0 W3C standard and the conversion algorithms from ITS to NIF and back. One outcome of the discussions in the standardization meetings and telephone conferences for ITS 2.0 was the conclusion that there was no alternative RDF format or vocabulary other than NIF with the required features to fulfill the working group charter. Five further uses of NIF are described for the Ontology of Linguistic Annotations (OLiA), the RDFaCE tool, the Tiger Corpus Navigator, the OntosFeeder and visualisations of NIF using the RelFinder tool. These 8 instances provide an implemented proof-of-concept of the features of NIF. The publication of corpora starts with the conversion and hosting of the huge Google Wikilinks corpus with 40 million annotations for 3 million web sites. The resulting RDF dump contains 477 million triples in a 5.6 GB compressed dump file in Turtle syntax. It is also described how NIF can be used to publish extracted facts from news feeds in the RDFLiveNews tool as Linked Data.
Part V – Conclusions. The final part provides lessons learned for NIF, conclusions and an outlook on future work. Most of the contributions are already summarized above.
One particular aspect worth mentioning is the increasing number of NIF-formatted corpora for Named Entity Recognition (NER) that have come into existence after the publication of the main NIF paper, Integrating NLP using Linked Data, at ISWC 2013. These include the corpora converted by Steinmetz, Knuth and Sack for the NLP & DBpedia workshop and an OpenNLP-based CoNLL converter by Brümmer. Furthermore, we are aware of three LREC 2014 submissions that leverage NIF: NIF4OGGD - NLP Interchange Format for Open German Governmental Data; N^3 – A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format; and Global Intelligent Content: Active Curation of Language Resources using Linked Data, as well as an early implementation of a GATE-based NER/NEL evaluation framework by Dojchinovski and Kliegr. Further funding for the maintenance, interlinking and publication of Linguistic Linked Data as well as support and improvements of NIF is available via the expiring LOD2 EU project, as well as the CSA EU project called LIDER, which started in November 2013. Based on the evidence of successful adoption presented in this thesis, we can expect a decent to high chance of reaching critical mass of Linked Data technology as well as the NIF standard in the field of Natural Language Processing and Language Resources.
Contents:
I Introduction and Background
  1 Introduction
    1.1 Natural Language Processing
    1.2 Open licenses, open access and collaboration
    1.3 Linked Data in Linguistics
    1.4 NLP for and by the Semantic Web – the NLP Interchange Format (NIF)
    1.5 Requirements for NLP Integration
    1.6 Overview and Contributions
  2 Background
    2.1 The Working Group on Open Data in Linguistics (OWLG)
      2.1.1 The Open Knowledge Foundation
      2.1.2 Goals of the Open Linguistics Working Group
      2.1.3 Open linguistics resources, problems and challenges
      2.1.4 Recent activities and on-going developments
    2.2 Technological Background
    2.3 RDF as a data model
    2.4 Performance and scalability
    2.5 Conceptual interoperability
II Language Resources as Linked Data
  3 Linked Data in Linguistics
    3.1 Lexical Resources
    3.2 Linguistic Corpora
    3.3 Linguistic Knowledgebases
    3.4 Towards a Linguistic Linked Open Data Cloud
    3.5 State of the Linguistic Linked Open Data Cloud in 2012
    3.6 Querying linked resources in the LLOD
      3.6.1 Enriching metadata repositories with linguistic features (Glottolog → OLiA)
      3.6.2 Enriching lexical-semantic resources with linguistic information (DBpedia (→ POWLA) → OLiA)
  4 DBpedia as a Multilingual Language Resource: the Case of the Greek DBpedia Edition
    4.1 Current state of the internationalization effort
    4.2 Language-specific design of DBpedia resource identifiers
    4.3 Inter-DBpedia linking
    4.4 Outlook on DBpedia Internationalization
  5 Leveraging the Crowdsourcing of Lexical Resources for Bootstrapping a Linguistic Linked Data Cloud
    5.1 Related Work
    5.2 Problem Description
      5.2.1 Processing Wiki Syntax
      5.2.2 Wiktionary
      5.2.3 Wiki-scale Data Extraction
    5.3 Design and Implementation
      5.3.1 Extraction Templates
      5.3.2 Algorithm
      5.3.3 Language Mapping
      5.3.4 Schema Mediation by Annotation with lemon
    5.4 Resulting Data
    5.5 Lessons Learned
    5.6 Discussion and Future Work
      5.6.1 Next Steps
      5.6.2 Open Research Questions
  6 NLP & DBpedia, an Upward Knowledge Acquisition Spiral
    6.1 Knowledge acquisition and structuring
    6.2 Representation of knowledge
    6.3 NLP tasks and applications
      6.3.1 Named Entity Recognition
      6.3.2 Relation extraction
      6.3.3 Question Answering over Linked Data
    6.4 Resources
      6.4.1 Gold and silver standards
    6.5 Summary
III The NLP Interchange Format (NIF)
  7 NIF 2.0 Core Specification
    7.1 Conformance checklist
    7.2 Creation
      7.2.1 Definition of Strings
      7.2.2 Representation of Document Content with the nif:Context Class
    7.3 Extension of NIF
      7.3.1 Part of Speech Tagging with OLiA
      7.3.2 Named Entity Recognition with ITS 2.0, DBpedia and NERD
      7.3.3 lemon and Wiktionary2RDF
  8 NIF 2.0 Resources and Architecture
    8.1 NIF Core Ontology
      8.1.1 Logical Modules
    8.2 Workflows
      8.2.1 Access via REST Services
      8.2.2 NIF Combinator Demo
    8.3 Granularity Profiles
    8.4 Further URI Schemes for NIF
      8.4.1 Context-Hash-based URIs
  9 Evaluation and Related Work
    9.1 Questionnaire and Developers Study for NIF 1.0
    9.2 Qualitative Comparison with other Frameworks and Formats
    9.3 URI Stability Evaluation
    9.4 Related URI Schemes
IV The NLP Interchange Format in Use
  10 Use Cases and Applications for NIF
    10.1 Internationalization Tag Set 2.0
      10.1.1 ITS2NIF and NIF2ITS conversion
    10.2 OLiA
    10.3 RDFaCE
    10.4 Tiger Corpus Navigator
      10.4.1 Tools and Resources
      10.4.2 NLP2RDF in 2010
      10.4.3 Linguistic Ontologies
      10.4.4 Implementation
      10.4.5 Evaluation
      10.4.6 Related Work and Outlook
    10.5 OntosFeeder – a Versatile Semantic Context Provider for Web Content Authoring
      10.5.1 Feature Description and User Interface Walkthrough
      10.5.2 Architecture
      10.5.3 Embedding Metadata
      10.5.4 Related Work and Summary
    10.6 RelFinder: Revealing Relationships in RDF Knowledge Bases
      10.6.1 Implementation
      10.6.2 Disambiguation
      10.6.3 Searching for Relationships
      10.6.4 Graph Visualization
      10.6.5 Conclusion
  11 Publication of Corpora using NIF
    11.1 Wikilinks Corpus
      11.1.1 Description of the corpus
      11.1.2 Quantitative Analysis with Google Wikilinks Corpus
    11.2 RDFLiveNews
      11.2.1 Overview
      11.2.2 Mapping to RDF and Publication on the Web of Data
V Conclusions
  12 Lessons Learned, Conclusions and Future Work
    12.1 Lessons Learned for NIF
    12.2 Conclusions
    12.3 Future Work
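The abstract above states that NIF is built upon Unicode code points in Normal Form C and uses URI schemes to address substrings. A minimal Python sketch of why normalization matters for offsets follows. It assumes an offset-based fragment of the form `#char=begin,end` in the style of RFC 5147; the base URI, the function names and the exact fragment syntax are assumptions for illustration, not a statement of what the NIF specification requires.

```python
import unicodedata


def nif_context(text):
    """Normalize text to NFC so offsets count Unicode code points consistently."""
    return unicodedata.normalize("NFC", text)


def offset_uri(base, context, substring, start=0):
    """Build an offset-based fragment URI for the first match of substring.

    Python str indices are code point indices, so begin/end are code point
    offsets into the NFC-normalized context.
    """
    needle = unicodedata.normalize("NFC", substring)
    begin = context.find(needle, start)
    if begin < 0:
        raise ValueError(f"{substring!r} not found in context")
    end = begin + len(needle)
    return f"{base}#char={begin},{end}"
```

With the input `"Cafe\u0301 in Leipzig"` (an `e` followed by a combining accent), the unnormalized string has 16 code points and the normalized one 15, so the offsets of "Leipzig" differ by one depending on whether NFC was applied. Agreeing on one normal form is what lets two tools refer to the same substring.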
319

Semi-Automatic Mapping of Structured Data to Visual Variables

Polowinski, Jan 11 October 2007
Semantic web data is machine-understandable and well suited to advanced filtering, but in its raw representation it is hard for humans to read, so visualization is needed. A core challenge in visualizing structured but heterogeneous data turned out to be a flexible mapping to Visual Variables. This work presents a highly flexible, semi-automatic solution that supports the visualization process as far as possible by reducing the mapping possibilities the user has to choose from to a useful subset. It draws on metrics and knowledge about the structure of the data on the one hand, and on knowledge about available visualization structures, platforms and common graphical facts on the other, the latter provided by a novel basic visualization ontology. A declarative, platform-independent mapping vocabulary and a framework were developed, building on current standards from the semantic web and the Model-Driven Architecture (MDA).

Contents:
ABSTRACT
1. INTRODUCTION
2. VISUALIZATION OF STRUCTURED DATA IN GENERAL
  2.1. Global and Local Interfaces
  2.2. Steps of the Visualization Process
  2.3. Existing Visual Selection Mechanisms
  2.4. Existing Visualizations of Structured Data
  2.5. Categorizing SemVis
3. REQUIREMENTS FOR A FLEXIBLE VISUALIZATION
  3.1. Actors
  3.2. Use Cases
4. FRESNEL, A STANDARD DISPLAY VOCABULARY FOR RDF
  4.1. Fresnel Lenses
  4.2. Fresnel Formats
  4.3. Fresnel Groups
  4.4. Primaries (Starting Points)
  4.5. Selectors and Inference
  4.6. Application and Reusability
  4.7. Implementation
5. A VISUALIZATION ONTOLOGY
  5.1. Describing and Formalizing the Field of Visualization
  5.2. Overview
  5.3. VisualVariable
  5.4. DiscreteVisualValue
  5.5. VisualElement
  5.6. VisualizationStructure
  5.7. VisualizationPlatform
  5.8. PresentationScenario
  5.9. Facts
6. A NOVEL MAPPING VOCABULARY FOR SEMANTIC VISUALIZATION
  6.1. Overview
  6.2. Mapping
  6.3. PropertyMapping
  6.4. ImplicitMapping
  6.5. ExplicitMapping
  6.6. MixedMapping
  6.7. ComplexMapping
  6.8. Inference
  6.9. Explicit Display of Relations
  6.10. Limitations
7. A MODEL-DRIVEN ARCHITECTURE FOR FLEXIBLE VISUALIZATION
  7.1. A Model-Driven Architecture
  7.2. Applications of the MDA Pattern
  7.3. Complete System Overview
  7.4. Additional Knowledge of the System
  7.5. Comparison to the Graphical Modelling Framework (GMF)
8. VISUALIZATION PLATFORMS
  8.1. Extensible 3D (X3D)
  8.2. Scalable Vector Graphics (SVG)
  8.3. XHTML + CSS
  8.4. Text
9. OUTLOOK AND CONCLUSION
  9.1. Advanced Mapping Vocabulary
  9.2. Reusing Standardized Ontologies
  9.3. Enabling Dynamic, Interaction and Animation
  9.4. Implementation and Evaluation
  9.5. Conclusion
GLOSSARY
BIBLIOGRAPHY
A.
  A.1. Schemata
320

Contrôle d'Accès et Présentation Contextuels pour le Web des Données

Costabello, Luca 29 November 2013
The thesis addresses the role played by context in accessing the Web of Data from mobile devices. The work analyzes this problem from two distinct points of view: adapting the presentation of linked data to context, and protecting access to RDF data stores from mobile devices. The first contribution is PRISSMA, an RDF rendering engine that extends Fresnel by selecting the best representation for the physical context in which the user is located. This selection is performed by an error-tolerant subgraph matching algorithm based on the notion of graph edit distance. The algorithm considers the differences between context descriptions and the context detected by sensors, supports heterogeneous context dimensions, and runs on the client so as not to reveal private information. The second contribution is the Shi3ld access control system. Shi3ld supports all triple stores and does not require modifying them. It uses Semantic Web languages exclusively and adds no new languages for defining access rules, and therefore no additional parsers or validation procedures. Shi3ld offers protection down to the triple level. The thesis describes the models, algorithms and prototypes of PRISSMA and Shi3ld. Experiments show the validity of PRISSMA's results as well as its performance in terms of memory and response time. The Shi3ld access control module was tested with different triple stores, with and without a SPARQL engine. The results show the impact on response time and demonstrate the feasibility of the approach.
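To make the idea of error-tolerant context matching more concrete, the following is a minimal Python sketch. It is not PRISSMA's algorithm: it assumes that contexts are small sets of triples and replaces graph edit distance with a much simpler cost, the number of declared triples missing from the sensed context. All names (`edit_cost`, `select_representation`, the example triples and view names) are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Assumption: a context graph is modelled as a set of (subject, predicate, object) triples.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def edit_cost(declared: Graph, sensed: Graph) -> int:
    """Count declared triples that the sensed context does not contain.

    Each missing triple counts as one edit. Extra triples in the sensed
    context are free, because the declared context only has to match a
    subgraph of what the sensors report.
    """
    return len(declared - sensed)


def select_representation(
    declared: Dict[str, Graph], sensed: Graph, max_cost: int
) -> Optional[str]:
    """Return the declared context with the lowest cost within max_cost, or None."""
    best_name, best_cost = None, max_cost + 1
    for name, graph in sorted(declared.items()):
        cost = edit_cost(graph, sensed)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


# Hypothetical sensed context and two declared contexts, each tied to a view.
sensed = frozenset({
    ("user", "locatedIn", "museum"),
    ("device", "type", "phone"),
    ("env", "noise", "quiet"),
})
declared = {
    "museum_phone_view": frozenset({
        ("user", "locatedIn", "museum"),
        ("device", "type", "phone"),
    }),
    "office_desktop_view": frozenset({
        ("user", "locatedIn", "office"),
        ("device", "type", "desktop"),
    }),
}

chosen = select_representation(declared, sensed, max_cost=1)
```

The `max_cost` threshold is what makes the match tolerant: a declared context may differ from the sensed one by a bounded number of edits and still be selected. A real graph edit distance would also price node and edge substitutions rather than only missing triples.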
