1

Facilitating the use of cadastral data through the World Wide Web

Polley, Iestyn Unknown Date (has links) (PDF)
Over recent years the cadastral industry has become increasingly reliant on digital data. Many surveyors now submit digital survey plan data to accompany the legally required hardcopy maps and documentation, and it will not be long before fully digital lodgement is possible. In this environment it will be ideal to capitalise on computer networking technology such as the present-day Internet and World Wide Web (WWW) to better facilitate the transmission of digital data. This work provides a study of the current climate in the cadastral industry and identifies how the Internet and its related technologies can be used to facilitate the transmission of digital cadastral data. The focus is a prototype application that handles these data transactions in a manner that benefits both user and data provider. This involves a study of the different underlying Internet technologies and how they can be used within the cadastral context. The work shows how the Internet and the WWW can bring benefits in the form of increased data distribution, as well as in data integration and updating for data maintainers, who need efficient ways of passing digital data to and from different locations.
2

Ontologies of Cree hydrography: formalization and realization

Wellen, Christopher. January 1900 (has links)
Thesis (M.Sc.). / Written for the Dept. of Geography. Title from title page of PDF (viewed 2008/12/10). Includes bibliographical references.
3

Evaluating Query and Storage Strategies for RDF Archives

Fernandez Garcia, Javier David, Umbrich, Jürgen, Polleres, Axel, Knuth, Magnus January 2018 (has links) (PDF)
There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving Semantic Web data. As novel archiving systems begin to address this challenge, foundations and standards for benchmarking RDF archives are needed to evaluate their storage-space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations for the design of data and queries to evaluate emerging RDF archiving systems. We then instantiate these foundations with a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on these data, which is meant to serve as a baseline for future developments in querying archives of evolving RDF data.
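As a rough illustration of one archiving policy discussed in this line of work (keeping each dataset version as an independent copy) the sketch below stores versions in separate named graphs and answers a version-materialization query with SPARQL. It uses rdflib; the graph names, namespace and triples are invented for the example and are not taken from the paper's benchmark.

```python
from rdflib import Dataset, URIRef, Namespace

EX = Namespace("http://example.org/")          # hypothetical namespace
ds = Dataset()

# Each version of the evolving dataset is kept as its own named graph.
v1 = ds.graph(URIRef("urn:example:version:1"))
v2 = ds.graph(URIRef("urn:example:version:2"))

v1.add((EX.alice, EX.worksFor, EX.acme))
v2.add((EX.alice, EX.worksFor, EX.acme))
v2.add((EX.bob, EX.worksFor, EX.acme))         # triple added in version 2

# Version-materialization query: who works for acme in version 2?
q = """
SELECT ?person WHERE {
  GRAPH <urn:example:version:2> {
    ?person <http://example.org/worksFor> <http://example.org/acme>
  }
}
"""
for row in ds.query(q):
    print(row.person)

# A delta-style query (what changed between the two versions) can be
# expressed with MINUS over the two named graphs.
```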
4

Linked Data Quality Assessment and its Application to Societal Progress Measurement

Zaveri, Amrapali 17 April 2015 (has links)
In recent years, the Linked Data (LD) paradigm has emerged as a simple mechanism for employing the Web as a medium for data and knowledge integration in which both documents and data are linked. Moreover, the semantics and structure of the underlying data are kept intact, making this the Semantic Web. LD essentially entails a set of best practices for publishing and connecting structured data on the Web, which allows publishing and exchanging information in an interoperable and reusable fashion. Many different communities on the Internet, such as the geographic, media, life-sciences and government communities, have already adopted these LD principles. This is confirmed by the dramatically growing Linked Data Web, where currently more than 50 billion facts are represented. With the emergence of the Web of Linked Data, several use cases become possible thanks to the rich and disparate data integrated into one global information space. Linked Data, in these cases, not only assists in building mashups by interlinking heterogeneous and dispersed data from multiple sources but also empowers the uncovering of meaningful and impactful relationships. These discoveries have paved the way for scientists to explore the existing data and uncover meaningful outcomes that they might not have been aware of previously.

In all these use cases utilizing LD, one crippling problem is the underlying data quality. Incomplete, inconsistent or inaccurate data affects the end results gravely, thus making them unreliable. Data quality is commonly conceived as fitness for use, be it for a certain application or use case. Datasets that contain quality problems may still be useful for certain applications, so quality ultimately depends on the use case at hand. Thus, LD consumption has to deal with the problem of getting the data into a state in which it can be exploited for real use cases. Insufficient data quality can be caused by the LD publication process or can be intrinsic to the data source itself. A key challenge is to assess the quality of datasets published on the Web and make this quality information explicit. Assessing data quality is a particular challenge in LD because the underlying data stems from a set of multiple, autonomous and evolving data sources. Moreover, the dynamic nature of LD makes quality assessment crucial for measuring how accurately the real world is represented. On the document Web, data quality can only be indirectly or vaguely defined, but there is a need for more concrete and measurable data quality metrics for LD. Such metrics include correctness of facts with respect to the real world, adequacy of the semantic representation, quality of interlinks, interoperability, timeliness, and consistency with regard to implicit information.

Even though data quality is an important concept in LD, few methodologies have been proposed to assess the quality of these datasets. Thus, in this thesis, we first unify 18 data quality dimensions and provide a total of 69 metrics for the assessment of LD. The first methodology employs LD experts for the assessment. This assessment is performed with the help of the TripleCheckMate tool, which was developed specifically to assist LD experts in assessing the quality of a dataset, in this case DBpedia. The second methodology is a semi-automatic process, in which the first phase involves the detection of common quality problems through the automatic creation of an extended schema for DBpedia. The second phase involves the manual verification of the generated schema axioms. Thereafter, we employ the wisdom of the crowd, i.e. workers on online crowdsourcing platforms such as Amazon Mechanical Turk (MTurk), to assess the quality of DBpedia. We then compare the two approaches (the previous assessment by LD experts and the assessment by MTurk workers in this study) in order to measure the feasibility of each type of user-driven data quality assessment methodology.

Additionally, we evaluate another semi-automated methodology for LD quality assessment, which also involves human judgement. In this semi-automated methodology, selected metrics are formally defined and implemented as part of a tool, namely R2RLint. The user is provided not only with the results of the assessment but also with the specific entities that cause the errors, which helps users understand the quality issues and fix them. Finally, we consider a domain-specific use case that consumes LD and depends on its quality. In particular, we identify four LD sources, assess their quality using the R2RLint tool and then utilize them in building the Health Economic Research (HER) Observatory. We show the advantages of this semi-automated assessment over the other types of quality assessment methodologies discussed earlier. The Observatory aims at evaluating the impact of research and development on the economic and healthcare performance of each country per year. We illustrate the usefulness of LD in this use case and the importance of quality assessment for any data analysis.
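As a toy illustration of the kind of metric such frameworks compute (this is not the TripleCheckMate or R2RLint implementation), the sketch below measures one simple intrinsic indicator, the share of subjects that carry an rdf:type statement, over a small Turtle snippet using rdflib. The data and the notion that 0.5 is "low" are invented for the example.

```python
from rdflib import Graph
from rdflib.namespace import RDF

# Invented sample data; in practice the graph would be loaded from a dataset dump.
TTL = """
@prefix ex: <http://example.org/> .
ex:alice a ex:Person ; ex:name "Alice" .
ex:bob ex:name "Bob" .   # no rdf:type, counts against the metric
"""

g = Graph()
g.parse(data=TTL, format="turtle")

subjects = set(g.subjects())
typed = set(g.subjects(RDF.type, None))

coverage = len(typed) / len(subjects) if subjects else 1.0
print(f"rdf:type coverage: {coverage:.2f}")       # 0.50 for the sample above

# A full assessment framework aggregates many such metrics (accuracy,
# consistency, interlinking, ...) and reports the offending resources,
# here simply the untyped subjects:
print("untyped subjects:", subjects - typed)
```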
5

Query Languages for Semi-structured Data

Maksimovic, Gordana January 2003 (has links)
Semi-structured data is defined as irregular data whose structure may change rapidly or unpredictably. An example of such data can be found on the World Wide Web. Since the data is irregular, the user may not know the complete structure of the database, so querying such data becomes difficult. In order to write meaningful queries on semi-structured data, there is a need for a query language that supports the features presented by this data. Standard query languages, such as SQL for relational databases and OQL for object databases, are too constraining for querying semi-structured data, because they require data to conform to a fixed schema before any data is stored in the database. This paper introduces Lorel, a query language developed particularly for querying semi-structured data. Furthermore, it investigates whether the standardised query languages support any of the criteria presented for semi-structured data. The result is an evaluation of three query languages, SQL, OQL and Lorel, against these criteria.
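The contrast drawn here can be illustrated with a small, hypothetical example: records whose attributes vary from entry to entry cannot be forced into one relational schema up front, yet a path-style traversal loosely in the spirit of Lorel's path expressions can still query them. The data and the `select_path` helper below are invented, and Python stands in for an actual semi-structured query language.

```python
# Irregular, self-describing records: fields differ from object to object,
# which is exactly what a fixed relational schema cannot accommodate up front.
people = [
    {"name": "Ana",  "address": {"city": "Skövde", "zip": "54128"}},
    {"name": "Ben",  "address": "unknown"},            # address is a plain string here
    {"name": "Cleo", "email": "cleo@example.org"},      # no address at all
]

def select_path(objects, path):
    """Follow a dotted path through each object, skipping objects where it fails.

    This loosely mimics a Lorel-style query such as
    'select X.address.city from people X'."""
    for obj in objects:
        value = obj
        for step in path.split("."):
            if isinstance(value, dict) and step in value:
                value = value[step]
            else:
                value = None
                break
        if value is not None:
            yield obj.get("name"), value

print(list(select_path(people, "address.city")))   # [('Ana', 'Skövde')]
```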
6

Aspekte der Kommunikation und Datenintegration in semantischen Daten-Wikis / Aspects of communication and data integration in semantic data wikis

Frischmuth, Philipp 20 October 2017 (has links)
The Semantic Web, an extension of the original World Wide Web by a semantic layer, can greatly simplify the integration of information from different data sources. With RDF and the SPARQL query language, standards have been established that enable a uniform representation of structured information and make it queryable. With Linked Data, this information is made available via a uniform protocol, and a web of data emerges in place of a web of documents. This thesis examines and analyses aspects of data integration based on such semantic technologies. Building on this, a system is specified and implemented that realises the results of these investigations in a concrete application. The implementation is based on OntoWiki, a semantic data wiki.
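A minimal sketch of the Linked Data consumption step such a wiki builds on: dereferencing a resource URI and merging the returned RDF into a local graph with rdflib, then querying the integrated view with SPARQL. It assumes the URI answers content negotiation with a parseable RDF serialization; the DBpedia URI and the local triples are only examples, and this is not OntoWiki code.

```python
from rdflib import Graph

local = Graph()

# Local statements maintained in the wiki (invented example data).
local.parse(data="""
@prefix ex: <http://example.org/> .
ex:leipzigProject ex:about <http://dbpedia.org/resource/Leipzig> .
""", format="turtle")

# Dereference the external Linked Data resource and load its triples.
# rdflib fetches the URL and picks a parser from the returned content type.
remote = Graph()
remote.parse("http://dbpedia.org/resource/Leipzig")

merged = local + remote          # graph union: integrated view over both sources

# The merged graph can now be queried uniformly with SPARQL.
rows = merged.query("""
SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Leipzig> ?p ?o } LIMIT 5
""")
for p, o in rows:
    print(p, o)
```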
7

Fusion von Geodaten unterschiedlicher Quellen in Geodateninfrastrukturen am Beispiel von ATKIS und OpenStreetMap / Conflation of geodata from different sources in spatial data infrastructures, using ATKIS and OpenStreetMap as an example

Wiemann, Stefan 10 December 2009 (has links)
The conflation of spatial data on the basis of homologous objects is an important task in generating knowledge from available geoinformation. Research in digital geodata conflation has been carried out since the early 1980s and covers the updating, change detection, enrichment and integration of available datasets. At the same time, since the late 1990s a paradigm shift towards service-oriented architectures (SOA) has been taking place in geoinformation science, driven in particular by the development of spatial data infrastructures (SDI) in the public sector. Within these interoperable structures, the combination of thematically related resources can create significant added informational value, so the conflation of data will be a core component of future web-based applications. The Open Geospatial Consortium (OGC) has already published numerous standards for providing spatial data in an SDI. In addition, the development of Web 2.0 offers further, often community-driven, ways of providing spatial data outside standardized SDI. The processing of these data can be realized with the OGC Web Processing Service (WPS), an interface specification that moves geoprocessing functionality into an SDI and thus helps distributed service-based structures replace monolithic Geographic Information Systems (GIS). For complex processes such as geodata conflation, the availability, interoperability and chaining of the services involved are crucial. After an introduction to the fundamentals of SDI and conflation, this thesis designs the system architecture and components of a service-based geodata conflation. It then describes a proof-of-concept implementation of the main components using the 52°North WPS framework. The implementation conflates street data from the ATKIS (Authoritative Topographic-Cartographic Information System) and OSM (OpenStreetMap) models through a transfer of features and attributes. Metadata processing, generalization and evaluation in the context of service-based conflation are further aspects of this work.
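A drastically simplified sketch of the attribute-transfer idea, using Shapely geometries instead of real ATKIS or OSM data and a plain distance threshold instead of a proper matching of homologous objects; all feature data, attribute names and the tolerance are invented, and a WPS would wrap such logic as a processing service.

```python
from shapely.geometry import LineString

# Toy "authoritative" features (ATKIS-like): geometry plus an official attribute.
atkis_roads = [
    {"geom": LineString([(0, 0), (10, 0)]), "official_name": "Hauptstrasse"},
    {"geom": LineString([(0, 5), (10, 5)]), "official_name": "Ringstrasse"},
]

# Toy community features (OSM-like): slightly offset geometries, name missing.
osm_roads = [
    {"geom": LineString([(0, 0.2), (10, 0.1)]), "name": None},
    {"geom": LineString([(0, 4.8), (10, 5.2)]), "name": None},
]

MAX_DIST = 0.5   # invented matching tolerance, in the same units as the coordinates

# Match each OSM road to the closest ATKIS road and transfer the attribute.
for osm in osm_roads:
    best = min(atkis_roads, key=lambda a: a["geom"].distance(osm["geom"]))
    if best["geom"].distance(osm["geom"]) <= MAX_DIST:
        osm["name"] = best["official_name"]

print([r["name"] for r in osm_roads])   # ['Hauptstrasse', 'Ringstrasse']
```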
8

Bezpečnost publikování prostorových dat na Internetu / Security of publishing spatial data on Internet

Břichnáč, Pavel January 2010 (has links)
This master's thesis addresses the security of publishing spatial data on the Internet. The goal is to describe current methods of publishing such data, to analyse their security weaknesses with respect to data leakage, and to propose measures that would make it possible to secure freely available data against automated downloading. The thesis explains the motivation behind the illegal harvesting of spatial data, describes current options for publishing data on the Internet (including the specifics of raster and vector data), and discusses the available means of protecting data against illegal acquisition and their weaknesses. The result is a general methodology, formulated as recommendations for publishing various types of spatial data, that significantly hinders automated attempts to harvest the data. Keywords: spatial data, Internet, map server, data policy, web technologies
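One counter-measure such recommendations can include is throttling automated bulk requests. The sketch below is a generic sliding-window rate limiter, not taken from the thesis; the limits and the client key are invented, and in practice it would sit in front of the map server or tile endpoint.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `limit` requests per client within a sliding `window` (seconds)."""

    def __init__(self, limit=60, window=60.0):
        self.limit = limit
        self.window = window
        self.history = defaultdict(deque)   # client id -> timestamps of recent requests

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        q = self.history[client_id]
        while q and now - q[0] > self.window:   # drop requests outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False                        # caller would answer with HTTP 429
        q.append(now)
        return True

limiter = RateLimiter(limit=100, window=60.0)
if not limiter.allow("198.51.100.7"):           # client key, e.g. an IP address
    print("Too many map-tile requests, slow down.")
```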
9

Building the Dresden Web Table Corpus: A Classification Approach

Lehner, Wolfgang, Eberius, Julian, Braunschweig, Katrin, Hentsch, Markus, Thiele, Maik, Ahmadov, Ahmad 12 January 2023 (has links)
In recent years, researchers have recognized relational tables on the Web as an important source of information. To support this research, we developed the Dresden Web Table Corpus (DWTC), a collection of about 125 million data tables extracted from the Common Crawl (CC), which contains 3.6 billion web pages and is 266 TB in size. As the vast majority of HTML tables are used for layout purposes and only a small share contains genuine tables with different surface forms, accurate table detection is essential for building a large-scale Web table corpus. Furthermore, correctly recognizing the table structure (e.g. horizontal listings, matrices) is important in order to understand the role of each table cell and to distinguish between label and data cells. In this paper, we present an extensive table layout classification that enables us to identify the main layout categories of Web tables with very high precision. To this end, we identify and develop a wide range of table features, different feature-selection techniques and several classification algorithms. We evaluate the effectiveness of the selected features and compare the performance of various state-of-the-art classification algorithms. Finally, the winning approach is employed to classify millions of tables, resulting in the Dresden Web Table Corpus (DWTC).
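A toy version of the feature-plus-classifier pipeline described above, with invented hand-crafted features, invented training tables and labels, and scikit-learn's RandomForestClassifier standing in for the evaluated algorithms; it is not the DWTC code and the feature set is far smaller than the paper's.

```python
from sklearn.ensemble import RandomForestClassifier

def table_features(table):
    """Tiny feature vector for a table given as a list of rows of cell strings."""
    rows, cols = len(table), max(len(r) for r in table)
    cells = [c for r in table for c in r]
    numeric_ratio = sum(c.replace(".", "", 1).isdigit() for c in cells) / len(cells)
    avg_len = sum(len(c) for c in cells) / len(cells)
    return [rows, cols, numeric_ratio, avg_len]

# Invented training tables: 1 = genuine relational table, 0 = layout table.
tables = [
    [["Country", "Population"], ["Germany", "83000000"], ["France", "67000000"]],
    [["Home", "About", "Contact"]],                       # navigation bar markup
    [["Year", "Revenue"], ["2020", "1.2"], ["2021", "1.5"], ["2022", "1.9"]],
    [["Click here to subscribe to our newsletter"]],
]
labels = [1, 0, 1, 0]

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit([table_features(t) for t in tables], labels)

candidate = [["City", "Inhabitants"], ["Dresden", "556000"], ["Leipzig", "601000"]]
print(clf.predict([table_features(candidate)]))   # expected: [1] (genuine table)
```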
10

Managing and Consuming Completeness Information for RDF Data Sources

Darari, Fariz 20 June 2017 (has links)
The ever-increasing amount of Semantic Web data gives rise to the question: how complete is the data? Though data on the Semantic Web is generally incomplete, many parts of it are indeed complete, such as the children of Barack Obama and the crew of Apollo 11. This thesis studies how to manage and consume completeness information about Semantic Web data. In particular, we first discuss how completeness information can guarantee the completeness of query answering. Next, we propose optimization techniques for completeness reasoning and conduct experimental evaluations to show the feasibility of our approaches. We also provide a technique to check the soundness of queries with negation via reduction to query completeness checking. We further enrich completeness information with timestamps, enabling users to check up to which point in time query answers are complete. We then introduce two demonstrators, CORNER and COOL-WD, to show how our completeness framework can be realized. Finally, we investigate an automated method to generate completeness statements from text on the Web via relation cardinality extraction.
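A heavily simplified, invented sketch of the core idea: a completeness statement asserts that the data is complete for a certain pattern, and a query that touches only such patterns can be answered completely. The patterns and the set-membership check are illustrative only; the thesis's formal framework (and tools such as CORNER and COOL-WD) goes far beyond this.

```python
# Completeness statements: the source is complete for these (class, property) patterns.
complete_for = {
    ("Astronaut", "missions"),
    ("President", "children"),
}

def answer_is_complete(query_patterns):
    """A query answer is guaranteed complete only if every pattern it touches
    is covered by some completeness statement."""
    return all(p in complete_for for p in query_patterns)

# "List the children of presidents" -> covered, answer is complete.
print(answer_is_complete([("President", "children")]))   # True

# "List the spouses of presidents" -> not covered, completeness unknown.
print(answer_is_complete([("President", "spouse")]))     # False
```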
