1

Leveraging Flexible Data Management with Graph Databases

Vasilyeva, Elena, Thiele, Maik, Bornhövd, Christof, Lehner, Wolfgang 01 September 2022
Integrating up-to-date information into databases from different heterogeneous data sources is still a time-consuming and mostly manual job that can only be accomplished by skilled experts. For this reason, enterprises often lack information regarding the current market situation, preventing the holistic view that is needed to conduct sound data analysis and market predictions. Ironically, the Web offers a huge and growing amount of valuable information from diverse organizations and data providers, such as the Linked Open Data cloud, common knowledge sources like Freebase, and social networks. One desirable usage scenario for this kind of data is its integration into a single database in order to apply data analytics. However, today's business intelligence tools show an evident lack of support for so-called situational or ad-hoc data integration. What we need is a system which 1) provides flexible storage of heterogeneous information of different degrees of structure in an ad-hoc manner, and 2) supports mass data operations suited for data analytics. In this paper, we provide our vision of such a system and describe an extension of the well-studied property graph model that allows external data exposed in the RDF format to be integrated and analyzed 'as you go' in a seamless manner. The proposed integration approach extends the internal graph model with external data from the Linked Open Data cloud, which stores over 31 billion RDF triples (September 2011) from a variety of domains.
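To make the idea of extending an internal property graph with external RDF data more concrete, here is a minimal Python sketch. It is not the model proposed in the paper; it only assumes one common mapping, in which a triple whose object is a resource becomes a labeled edge and a triple whose object is a literal becomes a property on the subject node. The class name `PropertyGraph`, the method `add_triple`, and all identifiers and values in the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """A tiny in-memory property graph: attributed nodes plus labeled edges."""
    nodes: dict = field(default_factory=dict)  # node id -> {property: value}
    edges: list = field(default_factory=list)  # (source id, label, target id)

    def add_node(self, node_id):
        """Return the property dict of a node, creating the node if needed."""
        return self.nodes.setdefault(node_id, {})

    def add_triple(self, subject, predicate, obj, obj_is_resource):
        """Merge one RDF-style triple into the graph.

        Assumed mapping: resource-valued objects become edges,
        literal objects become properties on the subject node.
        """
        props = self.add_node(subject)
        if obj_is_resource:
            self.add_node(obj)
            self.edges.append((subject, predicate, obj))
        else:
            props[predicate] = obj


# Internal data that already lives in the graph (made-up values).
graph = PropertyGraph()
graph.add_node("company:42")["name"] = "Example Corp"

# External triples as (subject, predicate, object, object_is_resource).
external_triples = [
    ("company:42", "locatedIn", "city:X", True),
    ("city:X", "label", "Example City", False),
    ("city:X", "population", 123456, False),
]
for s, p, o, is_resource in external_triples:
    graph.add_triple(s, p, o, is_resource)
```

After the loop, the internal node `company:42` keeps its original `name` property and gains an edge to the newly created external node `city:X`, so a query can traverse from internal to external data without a separate integration step.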
2

OPEN—Enabling Non-expert Users to Extract, Integrate, and Analyze Open Data

Braunschweig, Katrin, Eberius, Julian, Thiele, Maik, Lehner, Wolfgang 27 January 2023
Government initiatives for more transparency and participation have led to an increasing amount of structured data on the web in recent years. Many of these datasets have great potential. For example, a situational analysis and meaningful visualization of the data can assist in pointing out social or economic issues and raising people’s awareness. Unfortunately, the ad-hoc analysis of this so-called Open Data can prove very complex and time-consuming, partly due to a lack of efficient system support. On the one hand, search functionality is required to identify relevant datasets. Common document retrieval techniques used in web search, however, are not optimized for Open Data and do not address the semantic ambiguity inherent in it. On the other hand, semantic integration is necessary to perform analysis tasks across multiple datasets. To do so in an ad-hoc fashion, however, requires more flexibility and easier integration than most data integration systems provide. It is apparent that an optimal management system for Open Data must combine aspects from both classic approaches. In this article, we propose OPEN, a novel concept for the management and situational analysis of Open Data within a single system. In our approach, we extend a classic database management system, adding support for the identification and dynamic integration of public datasets. As most web users lack the experience and training required to formulate structured queries in a DBMS, we add support for non-expert users to our system, for example through keyword queries. Furthermore, we address the challenge of indexing Open Data.
3

Automating Geospatial RDF Dataset Integration and Enrichment / Automatische geografische RDF Datensatzintegration und Anreicherung

Sherif, Mohamed Ahmed Mohamed 12 December 2016
Over the last few years, the Linked Open Data (LOD) cloud has evolved from a mere 12 to more than 10,000 knowledge bases. These knowledge bases come from diverse domains including (but not limited to) publications, life sciences, social networking, government, media, and linguistics. Moreover, the LOD cloud also contains a large number of cross-domain knowledge bases such as DBpedia and Yago2. These knowledge bases are commonly managed in a decentralized fashion and contain partly overlapping information. This architectural choice has led to knowledge pertaining to the same domain being published by independent entities in the LOD cloud. For example, information on drugs can be found in Diseasome as well as DBpedia and Drugbank. Furthermore, certain knowledge bases such as DBLP have been published by several bodies, which in turn has led to duplicated content in the LOD cloud. In addition, large amounts of geo-spatial information have been made available with the growth of the heterogeneous Web of Data. The concurrent publication of knowledge bases containing related information promises to become a phenomenon of increasing importance with the growth of the number of independent data providers. Enabling the joint use of the knowledge bases published by these providers for tasks such as federated queries, cross-ontology question answering and data integration is most commonly tackled by creating links between the resources described within these knowledge bases. Within this thesis, we spur the transition from isolated knowledge bases to enriched Linked Data sets where information can be easily integrated and processed. To achieve this goal, we provide concepts, approaches and use cases that facilitate the integration and enrichment of information with other data types that are already present on the Linked Data Web with a focus on geo-spatial data. The first challenge that motivates our work is the lack of measures that use the geographic data for linking geo-spatial knowledge bases. 
This is partly due to the geo-spatial resources being described by the means of vector geometry. In particular, discrepancies in granularity and error measurements across knowledge bases render the selection of appropriate distance measures for geo-spatial resources difficult. We address this challenge by evaluating existing literature for point set measures that can be used to measure the similarity of vector geometries. Then, we present and evaluate the ten measures that we derived from the literature on samples of three real knowledge bases. The second challenge we address in this thesis is the lack of automatic Link Discovery (LD) approaches capable of dealing with geospatial knowledge bases with missing and erroneous data. To this end, we present Colibri, an unsupervised approach that allows discovering links between knowledge bases while improving the quality of the instance data in these knowledge bases. A Colibri iteration begins by generating links between knowledge bases. Then, the approach makes use of these links to detect resources with probably erroneous or missing information. This erroneous or missing information detected by the approach is finally corrected or added. The third challenge we address is the lack of scalable LD approaches for tackling big geo-spatial knowledge bases. Thus, we present Deterministic Particle-Swarm Optimization (DPSO), a novel load balancing technique for LD on parallel hardware based on particle-swarm optimization. We combine this approach with the Orchid algorithm for geo-spatial linking and evaluate it on real and artificial data sets. The lack of approaches for automatic updating of links of an evolving knowledge base is our fourth challenge. This challenge is addressed in this thesis by the Wombat algorithm. Wombat is a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. 
Wombat is based on generalisation via an upward refinement operator to traverse the space of Link Specifications (LS). We study the theoretical characteristics of Wombat and evaluate it on different benchmark data sets. The last challenge addressed herein is the lack of automatic approaches for geo-spatial knowledge base enrichment. Thus, we propose Deer, a supervised learning approach based on a refinement operator for enriching Resource Description Framework (RDF) data sets. We show how we can use exemplary descriptions of enriched resources to generate accurate enrichment pipelines. We evaluate our approach against manually defined enrichment pipelines and show that our approach can learn accurate pipelines even when provided with a small number of training examples. Each of the proposed approaches is implemented and evaluated against state-of-the-art approaches on real and/or artificial data sets. Moreover, all approaches are peer-reviewed and published in a conference or a journal paper. Throughout this thesis, we detail the ideas, implementation and the evaluation of each of the approaches. Moreover, we discuss each approach and present lessons learned. Finally, we conclude this thesis by presenting a set of possible future extensions and use cases for each of the proposed approaches.
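The abstract above refers to point-set measures for comparing vector geometries but does not name the ten measures it evaluates. As an illustration only, the following Python sketch computes the Hausdorff distance, a widely used point-set measure that is assumed here as an example rather than taken from the thesis. It also assumes planar coordinates and Euclidean distance; real geo-spatial data would typically need a distance on the sphere. The sample geometries are made up.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# The same square footprint described at two levels of granularity,
# plus a copy of the coarse footprint shifted 3 units to the east.
coarse = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
fine = coarse + [(0.5, 0.0), (0.5, 1.0), (0.0, 0.5), (1.0, 0.5)]
shifted = [(x + 3.0, y) for x, y in coarse]

print(hausdorff(coarse, fine))     # 0.5
print(hausdorff(coarse, shifted))  # 3.0
```

The two descriptions of the same footprint stay close (0.5) even though they differ in the number of points, while the shifted footprint is clearly farther away (3.0). This is the kind of behavior a link discovery approach needs when knowledge bases record the same resource at different granularities.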
4

Sex Differences in Mate Preferences Across 45 Countries: A Large-Scale Replication

Walter, Kathryn V., Conroy-Beam, Daniel, Buss, David M., Asao, Kelly, Sorokowska, Agnieszka, Sorokowski, Piotr, Aavik, Toivo, Akello, Grace, Alhabahba, Mohammad Madallh, Alm, Charlotte, Amjad, Naumana, Anjum, Afifa, Atama, Chiemezie S., Atamtürk Duyar, Derya, Ayebare, Richard, Batres, Carlota, Bendixen, Mons, Bensafia, Aicha, Bizumic, Boris, Boussena, Mahmoud, Butovskaya, Marina, Can, Seda, Cantarero, Katarzyna, Carrier, Antonin, Cetinkaya, Hakan, Croy, Ilona, Cueto, Rosa María, Czub, Marcin, Dronova, Daria, Dural, Seda, Duyar, Izzet, Ertugrul, Berna, Espinosa, Agustín, Estevan, Ignacio, Esteves, Carla Sofia, Fang, Luxi, Frackowiak, Tomasz, Contreras Garduño, Jorge, Ugalde González, Karina, Guemaz, Farida, Gyuris, Petra, Halamová, Mária, Herak, Iskra, Horva, Marina, Hromatko, Ivana, Jaafar, Jas Laile, Jiang, Feng 17 May 2022
Considerable research has examined human mate preferences across cultures, finding universal sex differences in preferences for attractiveness and resources as well as sources of systematic cultural variation. Two competing perspectives—an evolutionary psychological perspective and a biosocial role perspective—offer alternative explanations for these findings. However, the original data on which each perspective relies are decades old, and the literature is fraught with conflicting methods, analyses, results, and conclusions. Using a new 45-country sample (N = 14,399), we attempted to replicate classic studies and test both the evolutionary and biosocial role perspectives. Support for universal sex differences in preferences remains robust: Men, more than women, prefer attractive, young mates, and women, more than men, prefer older mates with financial prospects. Cross-culturally, both sexes have mates closer to their own ages as gender equality increases. Beyond age of partner, neither pathogen prevalence nor gender equality robustly predicted sex differences or preferences across countries.
5

The Database Research Group at Technische Universität Dresden Introduces Itself / Die Datenbankforschungsgruppe der Technischen Universität Dresden stellt sich vor

Lehner, Wolfgang 27 January 2023
In autumn 2012, the Database Chair at Technische Universität Dresden celebrates its 10th anniversary under the leadership of Wolfgang Lehner. During this period, its research focus on database support for the analysis of large data sets was sharpened further and broadened considerably at the system level. The research group around Wolfgang Lehner is visible at the international level through publications and collaborations, and it is also active in regional research networks, both to take part in Dresden's extremely young and agile software industry and, as far as a research group is able to, to support it. [From: Introduction]
6

Automating Geospatial RDF Dataset Integration and Enrichment

Sherif, Mohamed Ahmed Mohamed 12 May 2016
7

Open geospatial data fusion and its application in sustainable urban development

Xu, Shaojuan 17 July 2020
This thesis presents the implementation of data fusion techniques for sustainable urban development. In recent years, more and more geospatial data have become easily available at no cost. These large quantities of geospatial data come mainly from four kinds of sources: remote sensing satellites, geographic information systems (GIS) data, citizen science, and the sensor web. Among them, satellite images have been used most, owing to their frequent and repetitive coverage and to data acquisition over long time periods. However, their rather coarse spatial resolution, e.g. 30 m for Landsat 8 multispectral images, impairs the application of satellite images in urban areas. Even though image fusion techniques have been used to improve the spatial resolution, the existing image fusion methods are suitable neither for sharpening single-band thermal images nor for hyperspectral images with hundreds of bands. Therefore, simplified Ehlers fusion was developed. It adds the spatial information of a high-resolution image to a low-resolution image in the frequency domain through fast Fourier transform (FFT) and filter techniques. The developed algorithm successfully improved the spatial resolution of both single-band thermal images and hyperspectral images. It can enhance various images, regardless of the number of bands and the spectral coverage, providing more precise measurement and richer information. To investigate the performance of simplified Ehlers fusion in practical use, it was applied to urban heat island (UHI) analysis. This was done by sharpening daytime and nighttime thermal images from Landsat 8, Landsat 7, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER). The developed algorithm effectively improved the spatial details of the original images so that the temperature differences between agricultural, forest, industrial, transportation, and residential areas could be distinguished. 
Based on that, it was found that in the study city the causes of UHI are mainly anthropogenic heat from industrial areas as well as high temperatures from the road surface and dense urban fabric. Based on this analysis, corresponding mitigation strategies were tailored. Remote sensing images are useful yet not sufficient to retrieve land use related information, despite high spatial resolution. For sustainable urban development research, remote sensing images need to be incorporated with data from other sources. Accordingly, image fusion needs to be extended to broader data fusion. Extraction of urban vacant land was therefore taken as a second application case. Much effort was spent on the definition of vacant land as unclear definitions lead to ineffective data fusion and incorrect site extraction results. Through an intensive study of the current research and the available open data sources, a vacant land typology is proposed. It includes four categories: transportation-associated land, natural sites, unattended areas or remnant parcels, and brownfields. Based on this typology, a two-level data fusion framework was developed. On the feature level, sites are identified. For each type of vacant land, an individual site extraction rule and data fusion procedure is implemented. The overall data fusion involves satellite images, GIS data, citizen science, and social media data. In the end, four types of vacant land features were extracted from the study area. On the decision level, these extracted sites could be conserved or further developed to support sustainable urban development.
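The abstract above describes adding the spatial detail of a high-resolution image to a low-resolution image in the frequency domain. The Python sketch below illustrates that general idea only and should not be read as the simplified Ehlers fusion algorithm itself. It assumes a 1-D signal instead of a 2-D image, an ideal (hard cutoff) filter, and a naive discrete Fourier transform in place of an FFT. The function names, the `cutoff` parameter, and the sample values are all hypothetical.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (an FFT yields the same values faster)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def fuse(low_res, high_res, cutoff):
    """Take frequencies up to `cutoff` from low_res and all higher ones from high_res."""
    n = len(low_res)
    if len(high_res) != n:
        raise ValueError("both signals must be resampled to the same length")
    low_spec, high_spec = dft(low_res), dft(high_res)
    fused = []
    for k in range(n):
        freq = min(k, n - k)  # bins k and n - k carry the same frequency
        fused.append(low_spec[k] if freq <= cutoff else high_spec[k])
    return idft(fused)


# Made-up 1-D profiles across the same scene, already resampled to 8 samples.
thermal = [10.0, 10.5, 12.0, 16.0, 20.0, 19.5, 18.0, 14.0]  # smooth, low detail
sharp = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0]            # crisp edges
fused = fuse(thermal, sharp, cutoff=1)
```

Because the lowest frequencies, including the mean, come from the low-resolution input, the fused profile keeps the overall level of `thermal`, while the higher frequencies contribute the edges of `sharp`. Treating bins `k` and `n - k` alike keeps the fused spectrum conjugate-symmetric, so the result is real-valued.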
