21

Integrace statistické aplikace a herního systému s využitím datového skladu a platformy Java / Integration of statistic application and gaming system using data warehouse and Java platform

Macoun, Jakub January 2013 (has links)
This diploma thesis concerns the creation of support software integrated into a gaming system. Because no documentation existed for the chosen gaming information system, an untraditional method of software creation had to be used; this method is described in the thesis. The main objective is the creation of a supporting application that generates aggregated data, stores it in a data warehouse, and presents it to its users, with all development carried out using the defined method. The thesis is divided into two main parts. The first contains an analysis of the gaming information system; this analysis is used in the second part, which describes how the software was designed and implemented. The analyses are an essential part of the development of the final product and make its creation possible. The resulting application is unique in its domain and offers a view of an untraditional way of developing such support software; thanks to this uniqueness, it can serve as inspiration for further development in the domain.
22

Barriers to Dissemination of Local Health Data Faced by US State Agencies: Survey Study of Behavioral Risk Factor Surveillance System Coordinators

Ahuja, Manik, Aseltine, Robert, Jr. 01 July 2021 (has links)
Background: Advances in information technology have paved the way to facilitate accessibility to population-level health data through web-based data query systems (WDQSs). Despite these advances, US state agencies face many challenges in disseminating their local health data. It is essential for the public to have access to high-quality data that are easy to interpret, reliable, and trusted. These challenges have been at the forefront throughout the COVID-19 pandemic. Objective: The purpose of this study is to identify the most significant challenges faced by state agencies, from the perspective of the Behavioral Risk Factor Surveillance System (BRFSS) coordinator of each state, and to assess whether coordinators from states with a WDQS perceive these challenges differently. Methods: We surveyed BRFSS coordinators (N=43) across all 50 US states and the District of Columbia. We asked the participants about contextual factors and had them rate, on a Likert scale, aspects of their health data system and the challenges they faced with it. We used two-sample t tests to compare the mean ratings of participants from states with and without a WDQS. Results: Overall, 41/43 states (95%) make health data available over the internet, while 28/43 (65%) employ a WDQS. States with a WDQS reported greater challenges (P=.01) related to the cost of hardware and software (mean score 3.44/4, 95% CI 3.09-3.78) than states without a WDQS (mean score 2.63/4, 95% CI 2.25-3.00). The system aspect of standardization of vocabulary scored more favorably (P=.01) in states with a WDQS (mean score 3.32/5, 95% CI 2.94-3.69) than in states without one (mean score 2.85/5, 95% CI 2.47-3.22). Conclusions: Securing adequate resources and committing to standardization are vital to the dissemination of local-level health data. Factors such as receiving data in a timely manner, privacy, and political opposition are less significant barriers than anticipated.
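A minimal sketch of the comparison method the abstract describes: a two-sample t test on Likert ratings from states with and without a WDQS. The ratings below are invented for illustration and are not the survey's actual data:

```python
# Hypothetical illustration of the study's two-sample t test: compare mean
# Likert ratings of a challenge (e.g., cost of hardware and software)
# between states with and without a web-based data query system.
from scipy import stats

# Made-up ratings on a 1-4 Likert scale (not the survey's real responses).
with_wdqs = [4, 3, 4, 3, 4, 3, 4, 3, 4, 4]
without_wdqs = [3, 2, 3, 2, 3, 3, 2, 3, 2, 3]

t_stat, p_value = stats.ttest_ind(with_wdqs, without_wdqs)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```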
23

Towards a Hybrid Imputation Approach Using Web Tables

Lehner, Wolfgang, Ahmadov, Ahmad, Thiele, Maik, Eberius, Julian, Wrembel, Robert 12 January 2023 (has links)
Data completeness is one of the most important data quality dimensions and an essential premise in data analytics. Emerging Big Data trends such as the data lake concept, which provides a low-cost data preparation repository instead of moving curated data into a data warehouse, further reinforce the problem of data completeness. While the data imputation community traditionally addresses the filling of missing values with statistical techniques, we complement these approaches by using external data sources from the data lake, or even the Web, to look up missing values. In this paper we propose a novel hybrid data imputation strategy that takes the characteristics of an incomplete dataset into account and, based on them, chooses the best imputation approach: a statistical approach such as regression analysis, a Web-based lookup, or a combination of both. We formalize and implement both imputation approaches, including a Web table retrieval and matching system, and evaluate them extensively using a corpus of 125M Web tables. We show that applying statistical techniques in conjunction with external data sources leads to an imputation system that is robust, accurate, and has high coverage at the same time.
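A minimal sketch of the hybrid dispatch idea described in the abstract. The decision rule, the 0.8 correlation threshold, and the web_table_lookup helper are illustrative assumptions, not the paper's actual criteria or system:

```python
# Illustrative hybrid imputation dispatcher: use regression when the
# incomplete column correlates strongly with a complete numeric column,
# otherwise fall back to an external (e.g. web-table) lookup.
import pandas as pd
from sklearn.linear_model import LinearRegression

def web_table_lookup(key):
    """Placeholder for a lookup against an external web-table corpus."""
    return None  # assumed external service; not implemented here

def impute(df: pd.DataFrame, target: str, key: str) -> pd.DataFrame:
    # Assumes the predictor columns themselves are complete.
    numeric = df.select_dtypes("number").drop(columns=[target], errors="ignore")
    corr = df[target].corr(numeric.iloc[:, 0]) if not numeric.empty else 0.0
    missing = df[target].isna()
    if abs(corr) >= 0.8:  # assumed threshold: dataset looks regression-friendly
        model = LinearRegression().fit(numeric[~missing], df.loc[~missing, target])
        df.loc[missing, target] = model.predict(numeric[missing])
    else:  # dataset characteristics do not support regression: external source
        df.loc[missing, target] = df.loc[missing, key].map(web_table_lookup)
    return df
```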
24

Linked Open Data Alignment & Querying

Jain, Prateek 27 August 2012 (has links)
No description available.
25

Transforming user data into user value by novel mining techniques for extraction of web content, structure and usage patterns : the development and evaluation of new Web mining methods that enhance information retrieval and improve the understanding of users' Web behavior in websites and social blogs

Ammari, Ahmad N. January 2010 (has links)
The rapid growth of the World Wide Web in the last decade has made it the largest publicly accessible data source in the world and one of the most significant and influential information revolutions of modern times. The Web's influence has reached almost every aspect of human life and activity, causing paradigm shifts and transformational changes in business, governance, and education. Moreover, the rapid evolution of Web 2.0 and the Social Web in the past few years, such as social blogs and friendship networking sites, has dramatically transformed the Web from a raw environment for information consumption into a dynamic and rich platform for information production and sharing worldwide. However, this growth and transformation has resulted in an uncontrollable explosion and abundance of textual content, creating a serious challenge for any user trying to find and retrieve the relevant information they truly seek on the Web. Finding a relevant Web page in a website easily and efficiently has become very difficult. This creates challenges for researchers developing new mining techniques to improve the user experience on the Web, and for organizations seeking to understand the true informational interests and needs of their customers so that they can provide the products, services and information that truly match the requirements of every online customer. With these challenges in mind, Web mining aims to extract hidden patterns and discover useful knowledge from Web page contents, Web hyperlinks, and Web usage logs. Based on the primary kind of Web data used in the mining process, Web mining tasks fall into three main types: Web content mining, which extracts knowledge from Web page contents using text mining techniques; Web structure mining, which extracts patterns from the hyperlinks that represent the structure of a website; and Web usage mining, which mines users' navigational patterns from Web server logs that record every page access, representing the interactions between users and the pages of a website. The main goal of this thesis is to contribute toward addressing the challenges that have resulted from the information explosion and overload on the Web by proposing and developing novel Web mining-based approaches. Toward this goal, the thesis presents, analyzes, and evaluates three major contributions: first, an integrated Web structure and usage mining approach that recommends a collection of hyperlinks to be placed on the homepage of a website for its visitors; second, an integrated Web content and usage mining approach to improve the understanding of users' Web behavior and discover user group interests in a website; and third, a supervised classification model based on recent Social Web concepts, such as Tag Clouds, to improve the retrieval of relevant articles and posts from Web social blogs.
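As an illustration of the Web usage mining step described above, a minimal sketch that parses Web server access logs and groups page requests into per-visitor navigation sequences. The Common Log Format and the host-as-visitor rule are assumptions, not the thesis's actual pipeline:

```python
# Minimal Web usage mining sketch: parse access-log lines (Common Log
# Format assumed) and collect each visitor's ordered page sequence.
import re
from collections import defaultdict

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*"'
)

def navigation_sequences(log_lines):
    """Map each client host to the ordered list of pages it requested."""
    sequences = defaultdict(list)
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("method") == "GET":
            sequences[match.group("host")].append(match.group("path"))
    return sequences

sample = ['127.0.0.1 - - [10/Oct/2010:13:55:36 +0200] "GET /index.html HTTP/1.0" 200 2326']
print(dict(navigation_sequences(sample)))  # {'127.0.0.1': ['/index.html']}
```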
26

Semantic snippets via query-biased ranking of linked data entities / Snippets sémantiques via l'ordonnancement biaisé-requête des entités LOD

Alsarem, Mazen 30 May 2016 (has links)
Dans cette thèse, nous introduisons un nouvel artefact interactif pour le SERP: le "Snippet sémantique". Les snippets sémantiques s'appuient sur la coexistence des deux Webs pour faciliter le transfert des connaissances aux utilisateurs grâce à une contextualisation sémantique du besoin d'information de l'utilisateur. Ils font apparaître les relations entre le besoin d'information et les entités les plus pertinentes présentes dans la page Web. / In this thesis, we introduce a new interactive artifact for the SERP: the "Semantic Snippet". Semantic Snippets rely on the coexistence of the two webs to facilitate the transfer of knowledge to the user thanks to a semantic contextualization of the user's information need. They make apparent the relationships between the information need and the most relevant entities present in the web page.
27

Integrace a konzumace důvěryhodných Linked Data / Towards Trustworthy Linked Data Integration and Consumption

Knap, Tomáš January 2013 (has links)
Title: Towards Trustworthy Linked Data Integration and Consumption Author: RNDr. Tomáš Knap Department: Department of Software Engineering Supervisor: RNDr. Irena Holubová, PhD., Department of Software Engineering Abstract: We are now finally at a point when datasets based upon open standards are being published on an increasing basis by a variety of Web communities, governmental initiatives, and various companies. Linked Data offers information consumers a level of information integration and aggregation agility that has not been possible until now. Consumers can now "mash up" and readily integrate information for use in a myriad of alternative end uses. Indiscriminate addition of information can, however, come with inherent problems, such as the provision of poor-quality, inaccurate, irrelevant or fraudulent information. All of these come with associated costs that negatively affect the data consumer's benefit and the usage and uptake of Linked Data applications. In this thesis, we address these issues by proposing ODCleanStore, a Linked Data management and querying tool able to provide data consumers with Linked Data that is cleansed, properly linked, integrated, and trustworthy according to the consumer's subjective requirements. Trustworthiness of data means that the data has associated...
28

Intégrer des sources de données hétérogènes dans le Web de données / Integrating heterogeneous data sources in the Web of data

Michel, Franck 03 March 2017 (has links)
Le succès du Web de Données repose largement sur notre capacité à atteindre les données stockées dans des silos invisibles du web. Dans les 15 dernières années, des travaux ont entrepris d'exposer divers types de données structurées au format RDF. Dans le même temps, le marché des bases de données (BdD) est devenu très hétérogène avec le succès massif des BdD NoSQL. Celles-ci sont potentiellement d'importants fournisseurs de données liées. Aussi, l'objectif de cette thèse est de permettre l'intégration en RDF de sources de données hétérogènes, et notamment d'alimenter le Web de Données avec les données issues des BdD NoSQL. Nous proposons un langage générique, xR2RML, pour décrire le mapping de sources hétérogènes vers une représentation RDF arbitraire. Ce langage étend des travaux précédents sur la traduction de sources relationnelles, CSV/TSV et XML en RDF. Sur cette base, nous proposons soit de matérialiser les données RDF, soit d'évaluer dynamiquement des requêtes SPARQL sur la base native. Dans ce dernier cas, nous proposons une approche en deux étapes : (i) traduction d'une requête SPARQL en une requête pivot, abstraite, en se basant sur le mapping xR2RML ; (ii) traduction de la requête abstraite en une requête concrète, prenant en compte les spécificités du langage de requête de la BdD cible. Un souci particulier est apporté à l'optimisation des requêtes, aux niveaux abstrait et concret. Nous démontrons l'applicabilité de notre approche via un prototype pour la populaire base MongoDB. Nous avons validé la méthode dans un cas d'utilisation réel issu du domaine des humanités numériques. / To a great extent, the success of the Web of Data depends on the ability to reach legacy data locked in silos inaccessible from the web. In the last 15 years, various works have tackled the problem of exposing structured data in the Resource Description Framework (RDF). Meanwhile, the overwhelming success of NoSQL databases has made the database landscape more diverse than ever. NoSQL databases are strong potential contributors of valuable linked open data. Hence, the object of this thesis is to enable RDF-based data integration over heterogeneous data sources and, in particular, to harness NoSQL databases to populate the Web of Data. We propose a generic mapping language, xR2RML, to describe the mapping of heterogeneous data sources into an arbitrary RDF representation. xR2RML relies on and extends previous works on the translation of RDBs, CSV/TSV and XML into RDF. Given such an xR2RML mapping, we propose either to materialize RDF data or to dynamically evaluate SPARQL queries against the native database. In the latter case, we follow a two-step approach: the first step translates a SPARQL query into a pivot abstract query, based on the xR2RML mapping of the target database to RDF; the second step translates the abstract query into a concrete query, taking the specificities of the database query language into account. Particular attention is paid to query optimization opportunities, both at the abstract and the concrete levels. To demonstrate the effectiveness of our approach, we have developed a prototype implementation for MongoDB, the popular NoSQL document store, and have validated the method in a real-life use case from the Digital Humanities.
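A minimal sketch of the two-step translation idea described above: a SPARQL-style triple pattern becomes an abstract pivot condition, which is then rendered as a concrete MongoDB filter. The toy mapping dict, class names and query shapes are illustrative assumptions, not xR2RML's actual syntax:

```python
# Illustrative two-step rewriting: triple pattern -> abstract pivot
# condition -> concrete MongoDB filter. The MAPPING dict stands in for an
# xR2RML mapping; the real language is far richer than this.
from dataclasses import dataclass

# Assumed toy mapping: RDF predicate -> document field in MongoDB.
MAPPING = {"foaf:name": "name", "dc:date": "created"}

@dataclass
class AbstractCondition:  # step 1 output: database-agnostic
    field: str
    op: str
    value: object

def sparql_to_abstract(predicate: str, value: object) -> AbstractCondition:
    """Step 1: translate a triple pattern into a pivot condition."""
    return AbstractCondition(field=MAPPING[predicate], op="eq", value=value)

def abstract_to_mongo(cond: AbstractCondition) -> dict:
    """Step 2: translate the pivot condition into a MongoDB filter."""
    ops = {"eq": lambda v: v, "gt": lambda v: {"$gt": v}}
    return {cond.field: ops[cond.op](cond.value)}

query = abstract_to_mongo(sparql_to_abstract("foaf:name", "Ada"))
print(query)  # {'name': 'Ada'} -- usable with pymongo's collection.find(query)
```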
29

Interconnexion et visualisation de ressources géoréférencées du Web de données à l’aide d’un référentiel topographique de support / Interlinking and visualizing georeferenced resources of the Web of data with geographic reference data

Feliachi, Abdelfettah 27 October 2017 (has links)
Plusieurs ressources publiées sur le Web de données sont dotées de références spatiales qui décrivent leur localisation géographique. Ces références spatiales sont un moyen favori pour interconnecter et visualiser les ressources sur le Web de données. Cependant, les hétérogénéités des niveaux de détail et de modélisations géométriques entre les sources de données constituent un défi majeur pour l'utilisation de la comparaison des références spatiales comme critère pour l'interconnexion des ressources. Ce défi est amplifié par la nature ouverte et collaborative des sources de données du Web qui engendre des hétérogénéités géométriques internes aux sources de données. En outre, les applications de visualisation cartographique des ressources géoréférencées du Web de données ne fournissent pas une visualisation lisible à toutes les échelles. Dans cette thèse, nous proposons un vocabulaire pour formaliser les connaissances sur les caractéristiques de chaque géométrie dans un jeu de données. Nous proposons également une approche semi-automatique basée sur un référentiel topographique pour acquérir ces connaissances. Nous proposons de mettre en oeuvre ces connaissances dans une approche d'adaptation dynamique du paramétrage de la comparaison des géométries dans un processus d'interconnexion. Nous proposons une approche complémentaire s'appuyant sur un référentiel topographique pour la détection des liens de cardinalité n:m. Nous proposons finalement des applications qui s'appuient sur des données topographiques de référence et leurs liens avec les ressources géoréférencées du Web pour offrir une visualisation cartographique multiéchelle lisible et conviviale. / Many resources published on the Web of data carry spatial references that describe their location. These spatial references are a valuable asset for interlinking and visualizing data over the Web. However, these spatial references may be presented with different levels of detail and different geometric modelling from one data source to another. These differences are a major challenge for using geometry comparison as a criterion for interlinking georeferenced resources. This challenge is amplified further by the open and often volunteered nature of the data, which causes geometric heterogeneities between the resources of a single data source. Furthermore, Web mapping applications of georeferenced data are limited when it comes to visualizing data at different scales. In this PhD thesis, we propose a vocabulary for formalizing the knowledge about the characteristics of every single geometry in a dataset, and a semi-automatic approach for acquiring this knowledge by using geographic reference data. We then propose to use this knowledge in an approach that dynamically adapts the settings of the comparison of each pair of geometries during an interlinking process. We propose an additional interlinking approach based on geographic reference data for detecting n:m links between data sources. Finally, we propose Web mapping applications for georeferenced resources that remain readable at different map scales.
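A minimal sketch of the dynamically adapted geometry comparison described above: two georeferenced resources are linked when their geometries agree within a tolerance derived from the characteristics of each source. The adaptation rule and the accuracy values are assumptions for illustration, not the thesis's actual parameters:

```python
# Illustrative geometry-based interlinking with a dynamically adapted
# threshold: the tolerance depends on each geometry's (assumed) positional
# accuracy instead of being one fixed global value.
from shapely.geometry import Point

def match(geom_a, accuracy_a, geom_b, accuracy_b):
    """Link two georeferenced resources if their geometries agree
    within the combined positional accuracy of the two sources."""
    tolerance = accuracy_a + accuracy_b  # assumed adaptation rule
    return geom_a.distance(geom_b) <= tolerance

# A precise reference point vs. a coarsely digitised volunteered point
# (coordinates in degrees; accuracies are made up for the example).
print(match(Point(2.3522, 48.8566), 0.0001,
            Point(2.3524, 48.8567), 0.0050))  # True: within tolerance
```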
30

Linked Open Projects: Nachnutzung von Ergebnissen im Semantic Web

Pfeffer, Magnus, Eckert, Kai 28 January 2011 (has links)
The Semantic Web and Linked Data are on everyone's lips. After almost a decade of developing the technologies and exploring the possibilities of the Semantic Web, the data itself is now moving into focus, for without it the Semantic Web would be no more than a theoretical construct, almost like the World Wide Web without websites. With their authority files (PND, SWD) and catalogue records, libraries hold a wealth of data that is well suited to populating the Semantic Web, some of which has already been prepared for the Semantic Web and released for use. The Universitätsbibliothek Mannheim has worked with such data in two different projects, although at the time the data was not yet available as Linked Data. One project dealt with the automatic subject indexing of publications on the basis of abstracts, the other with the automatic classification of publications on the basis of title data. In this contribution we briefly present the results of these projects, but focus on a side issue that only emerged in the course of the work: how can the results obtained be presented permanently and usefully for reuse by third parties? To be clear from the outset: neither method can or wants to replace a librarian. The generated data can be used in many ways, but concrete uses, such as loading it into a union catalogue, are controversial because of the quality of the data and the lack of control over it. Publishing this data as Linked Data in the Semantic Web is an obvious solution: anyone who wants to reuse the results can do so without compromising an existing data holding. This approach raises new questions, however, not least how the source data can be identified via URIs when it is not (yet) available as Linked Data. Beyond that, providing result data also requires measures that go beyond the current practice of Linked Data: supplying additional information that describes the source and derivation of the data (provenance information), as well as information that usually exceeds the underlying metadata schema, such as confidence values in the case of an automatic data-generation procedure. We present approaches based on RDF reification and named graphs and describe current developments in this area, as discussed, for example, in the W3C Provenance Incubator Group and in working groups of the Dublin Core Metadata Initiative.
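A minimal sketch of the named-graph approach the abstract mentions: an automatically generated statement is published in its own named graph, and provenance plus a confidence value are attached to that graph. The EX namespace and the ex:confidence property are assumptions; real deployments would use an established provenance vocabulary:

```python
# Sketch: put an automatically generated subject-indexing statement in its
# own named graph, then describe that graph (creator, confidence) in the
# default graph. ex:confidence is a hypothetical property.
from rdflib import Dataset, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, XSD

EX = Namespace("http://example.org/ns#")  # hypothetical vocabulary
result_graph = URIRef("http://example.org/results/run-42")

ds = Dataset()
g = ds.graph(result_graph)
# The generated statement itself: a publication receives a subject heading.
g.add((URIRef("http://example.org/pub/123"), DCTERMS.subject, EX.SemanticWeb))

# Provenance and confidence, asserted about the named graph as a whole.
ds.add((result_graph, DCTERMS.creator, Literal("automatic classifier")))
ds.add((result_graph, EX.confidence, Literal(0.87, datatype=XSD.double)))

print(ds.serialize(format="trig"))
```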
