251

L'applicabilité du système de comptabilité nationale 1993 en Syrie / The applicability of the System of National Accounts 1993 in Syria

Maarof, Salman 14 December 2011
Although the 1993 SNA has been in place for more than fifteen years, some countries have still not implemented it, while others claim to apply it without doing so correctly. The difficulties of implementing the 1993 SNA have several causes, among which we can identify the availability of data sources and databases. Syria has not adopted the 1993 SNA, and the Syrian national accounts are still compiled according to the 1968 SNA. In our research we analyzed the quality of national accounts data with a view to a complete application of the 1993 SNA in Syria's national accounts department. Although that system is not currently in use, its complete application would give a true picture of the Syrian economy. To achieve this goal, and drawing on the experience of other countries, it was essential to analyze the quality of the national accounts data produced within the national accounts department; this work should establish Syria's capacity to meet the recommendations of the 1993 SNA. The recently published 2008 SNA only deepens the need to adapt the Syrian system to the SNA. It is essential to keep in mind that the objective is not to announce hastily that the system is being applied; rather, it is urgent to be able to produce genuine data and thereby apply the SNA soundly. This research is not an end in itself but a starting point for a thorough rethinking and complete overhaul of Syrian national accounting, so as to produce sound data that reflect economic reality, support the design of economic strategies, and enable Syria's economic development.
252

Apprentissage supervisé à partir des multiples annotateurs incertains / Supervised Learning from Multiple Uncertain Annotators

Wolley, Chirine 01 December 2014
In supervised learning, obtaining the ground-truth label for each instance of a training dataset can be difficult, time-consuming, and expensive. With the advent of the Internet, a growing number of web services use crowdsourcing to collect labels from anonymous annotators, which greatly simplifies the construction of labeled datasets. The main drawback of these services, however, is that annotators can have very heterogeneous levels of expertise and there is no way to verify the accuracy of their labels, so the collected data are not necessarily reliable. Managing annotator uncertainty is therefore a key element of learning from multiple non-expert annotators. This thesis proposes three probabilistic algorithms that handle annotator uncertainty and data quality during the learning phase. IGNORE classifies new instances while evaluating each annotator's labeling performance as a function of their uncertainty. X-IGNORE additionally incorporates data quality, assuming that annotator performance depends not only on their uncertainty but also on the quality of the data they annotate. Finally, ExpertS addresses the problem of annotator selection during learning: it discards the weakest annotators and trains only on the labels of the best ones (the experts). Extensive experiments on synthetic and real-world (including medical) data demonstrate the performance and stability of our models compared to state-of-the-art algorithms.
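The thesis's IGNORE, X-IGNORE and ExpertS models are probabilistic and are not reproduced here. As a rough intuition for learning from multiple uncertain annotators, the following minimal sketch (all names and numbers are illustrative assumptions) iteratively re-weights annotators by their agreement with an evolving consensus; an ExpertS-like step could then simply discard annotators whose estimated reliability falls below a threshold.

```python
# Toy EM-style loop: estimate per-annotator reliability from agreement with
# an evolving weighted consensus (binary labels). Not the thesis's models.
import numpy as np

def consensus_with_reliability(labels, n_iter=20):
    """labels: (n_items, n_annotators) array of 0/1 votes.
    Returns (consensus labels, per-annotator reliability weights)."""
    n_items, n_annot = labels.shape
    weights = np.ones(n_annot)                    # start: all annotators equal
    for _ in range(n_iter):
        scores = labels @ weights / weights.sum() # weighted vote per item
        consensus = (scores >= 0.5).astype(int)
        # re-estimate reliability: how often each annotator matches consensus
        agreement = (labels == consensus[:, None]).mean(axis=0)
        weights = np.clip(agreement, 1e-3, None)  # avoid zero weights
    return consensus, weights

rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=200)
# three careful annotators (90% accurate), two unreliable ones (55%)
acc = np.array([0.9, 0.9, 0.9, 0.55, 0.55])
votes = np.where(rng.random((200, 5)) < acc, truth[:, None], 1 - truth[:, None])
labels_hat, w = consensus_with_reliability(votes)
print("consensus accuracy:", (labels_hat == truth).mean())
print("estimated reliabilities:", np.round(w, 2))
```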
253

Kvalita dat a efektivní využití rejstříků státní správy / Data Quality and Effective Use of Registers of State Administration

Rut, Lukáš January 2009
This diploma thesis deals with registers of state administration in terms of data quality. The main objective is to analyze ways of evaluating data quality and to apply an appropriate method to data in the business register. A further objective is to analyze the options for data cleansing and data quality improvement and to propose a solution for the inaccuracies found in the business register. The last goal is to analyze approaches to assigning identifiers to persons and to choose a suitable key for identifying persons in registers of state administration. The thesis is divided into several parts. The first introduces the sphere of registers of state administration and closely analyzes several selected registers, in particular what data they contain and how they are updated; a major contribution of this part is its description of the legislative changes coming into effect in mid-2010, with special attention to their impact on data quality. The next part deals with identifiers of legal and natural persons and proposes how to identify entities in data from the registers. The third part analyzes ways of determining data quality: the method of data profiling is described in detail and applied in an extensive data quality analysis of the business register, yielding corrected metadata and information about incorrect data. The last chapter deals with ways of solving data quality problems, proposing and comparing three variants of a solution. As a whole, the paper is a compact guide to the effective use of data contained in registers of state administration, and the proposed solutions and approaches can be reused in many other projects dealing with data quality.
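To give a flavour of the data-profiling step applied to a register, here is a minimal sketch; the column names, sample rows, and the 8-digit identifier pattern are hypothetical stand-ins, not taken from the thesis.

```python
# Minimal column-profiling sketch: null rate, distinct count, and share of
# values matching an expected format. Illustrative only.
import pandas as pd

def profile(df, patterns=None):
    """Per-column profile; `patterns` maps column name -> expected regex."""
    patterns = patterns or {}
    report = []
    for col in df.columns:
        s = df[col]
        row = {"column": col,
               "null_rate": s.isna().mean(),
               "distinct": s.nunique()}
        if col in patterns:
            non_null = s.dropna().astype(str)
            row["pattern_ok"] = non_null.str.fullmatch(patterns[col]).mean()
        report.append(row)
    return pd.DataFrame(report)

# Hypothetical register extract; 'ico' mimics an 8-digit company identifier.
data = pd.DataFrame({
    "ico":  ["12345678", "87654321", "1234", None],
    "name": ["Alfa s.r.o.", "Beta a.s.", None, "Gamma"],
})
print(profile(data, patterns={"ico": r"\d{8}"}))
```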
254

Master Data Management, Integrace zákaznických dat a hodnota pro business / Master Data Management, Customer Data Integration and value for business

Rais, Filip January 2009
This thesis focuses on Master Data Management (MDM) and Customer Data Integration (CDI) and their main domains. It surveys the various theoretical directions found in this area of expertise, summarizes the main aspects and domains, and presents different perspectives on the referenced principles. It is an exhaustive background study of Master Data Management with an emphasis on practical use, drawing on the author's experience and opinions. A secondary focus is the business value of Master Data Management initiatives. The thesis presents a thought framework for initiating an MDM project, motivated by a current trend in which companies struggle to determine the actual benefits of MDM initiatives: there is broad agreement that such initiatives are necessary, but their measurable impact on a company's revenue or profit is hard to establish. Since MDM is an enabling function rather than a direct revenue function, its benefit is less straightforward and therefore harder to determine. This work describes the different layers involved and the mapping of business requirements through those layers, creating a transparent linkage between enabling functions and revenue-generating ones. Emphasis is placed on calculating financial benefit, on measurability, and on the responsibilities of business and IT departments. To support its conclusions, the thesis also presents real-world interviews with potential stakeholders of an MDM initiative within a company, selected as the key drivers of such an initiative. The interviews map their understanding of MDM and related terms, and their reasons for and expectations of MDM. The representatives were chosen to represent business and IT departments equally, which yields an interesting clash of views and expectations.
255

Contributions à une nouvelle approche de Recherche d'Information basée sur la métaphore de l'impédance et illustrée sur le domaine de la santé / Contributions to a new information retrieving approach based on the impedance metaphor and illustrated on the health domain

Guemeida, Abdelbasset 16 October 2009
Recent developments in information and communication technologies, together with the growth of the Internet, have led to an explosion in the volume of data sources. New information retrieval needs are emerging: to process information according to its usage context, to increase the relevance of answers and the usability of results, and to exploit possible correlations between data sources while making their heterogeneity transparent. The research presented in this thesis contributes to the design of a New Approach to Information Retrieval (NARI) for decision-making. NARI is designed to operate on large volumes of catalogued, heterogeneous data that may be geo-referenced. It is based on preliminary quality requirements (standardization, regulations) expressed by users and represented and managed using metadata. These requirements serve to compensate for missing or insufficient-quality data, so as to produce information of sufficient quality for decision-making needs. Using the users' perspective, data sources are identified and/or prepared before the content integration step. NARI's originality lies in the metaphor of impedance mismatch (the classical phenomenon that arises when connecting two heterogeneous physical systems), a metaphor due to R. Jeansoulin; this metaphor, together with close attention to the regulatory framework, guides the design. NARI is structured by the geographic dimension (various territorial levels are taken into account, and several themes correlated): spatial analysis techniques support information retrieval tasks that decision-makers often perform implicitly. It builds on data integration techniques (mediation, data warehouses), knowledge representation languages, and Semantic Web technologies and tools to support the scalability, generalization, and theoretical robustness of the approach. NARI is illustrated on examples from the health domain.
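NARI is only described at the design level here, so the following toy sketch merely illustrates one ingredient of the abstract: holding user quality requirements as metadata and filtering candidate sources against them before integration. All field names, standards, and thresholds are assumptions.

```python
# Hypothetical metadata-driven source selection, in the spirit of NARI's
# "quality requirements expressed by users, managed as metadata".
from dataclasses import dataclass

@dataclass
class SourceMeta:
    name: str
    standard: str        # metadata standard the source declares compliance with
    completeness: float  # declared share of non-missing records, 0..1
    geo_referenced: bool

def select_sources(catalog, required_standard, min_completeness, need_geo):
    """Keep only sources whose declared metadata meets the user's requirements."""
    return [s for s in catalog
            if s.standard == required_standard
            and s.completeness >= min_completeness
            and (s.geo_referenced or not need_geo)]

catalog = [
    SourceMeta("hospital_admissions", "ISO-19115", 0.95, True),
    SourceMeta("pharmacy_sales", "ISO-19115", 0.60, False),
    SourceMeta("air_quality", "custom", 0.99, True),
]
for s in select_sources(catalog, "ISO-19115", 0.9, need_geo=True):
    print("integrate:", s.name)
```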
256

Managing and Consuming Completeness Information for RDF Data Sources

Darari, Fariz 04 July 2017
The ever-increasing amount of Semantic Web data gives rise to the question: how complete is the data? Though data on the Semantic Web is generally incomplete, many parts of it are in fact complete, such as the children of Barack Obama or the crew of Apollo 11. This thesis studies how to manage and consume completeness information about Semantic Web data. In particular, we first discuss how completeness information can guarantee the completeness of query answering. Next, we propose optimization techniques for completeness reasoning and conduct experimental evaluations showing the feasibility of our approaches. We also provide a technique to check the soundness of queries with negation via reduction to query completeness checking. We further enrich completeness information with timestamps, enabling query answers to be checked for the point in time up to which they are complete. We then introduce two demonstrators, CORNER and COOL-WD, to show how our completeness framework can be realized. Finally, we investigate an automated method to generate completeness statements from text on the Web via relation cardinality extraction.
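As a much-simplified illustration of completeness statements (the actual framework, demonstrated by CORNER and COOL-WD, handles general SPARQL queries over RDF), the sketch below declares one (subject, predicate) pattern complete and uses that declaration to certify query answers; the data reuses the abstract's own examples.

```python
# If the available graph is declared complete for a (subject, predicate)
# pattern, answers to a query over exactly that pattern are guaranteed
# complete; otherwise the answer may miss triples. Simplified sketch only.
available = {
    ("BarackObama", "hasChild", "Malia"),
    ("BarackObama", "hasChild", "Sasha"),
    ("Apollo11", "crew", "NeilArmstrong"),    # crew list deliberately partial
}
# Completeness statements: these patterns hold ALL triples of the ideal graph.
complete_for = {("BarackObama", "hasChild")}

def answer(subject, predicate):
    objs = {o for s, p, o in available if (s, p) == (subject, predicate)}
    guaranteed = (subject, predicate) in complete_for
    return objs, guaranteed

for s, p in [("BarackObama", "hasChild"), ("Apollo11", "crew")]:
    objs, ok = answer(s, p)
    print(objs, "-> complete answer" if ok else "-> possibly incomplete")
```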
257

IBM Cognos Report Studio as an Effective Tool for Human Capital Reporting / IBM Cognos Report Studio jako efektivní nástroj reportingu v oblasti lidského kapitálu

Zinchenko, Yulia January 2013
The main topic of this diploma thesis is corporate reporting on human capital using Business Intelligence tools, specifically IBM Cognos Report Studio. One objective is to present a step-by-step methodology for creating a complex dynamic report, covering data structure modeling, layout design, and quality checks. Another objective is to conduct a cost-benefit analysis for a real-life project focused on recreating an Excel-based report in a Cognos-based environment in order to automate information flows. An essential part of the thesis is the theoretical background: the data quality and visualization aspects of Business Intelligence, the purposes of human capital reporting, and a description of appropriate KPIs. The objectives are addressed through analysis and research of resources on these topics, using an IBM Cognos Report Studio environment provided by one of the major companies in the financial advisory field. The thesis is relevant reading for those interested in real-life applications of data quality improvement and information flow automation using Business Intelligence reporting tools.
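The thesis's actual cost-benefit figures are not reproduced here, so the following sketch only shows the shape of such a calculation for report automation; every number is a hypothetical assumption.

```python
# Toy payback calculation for automating a manual report. All figures invented.
def payback_months(build_cost, manual_hours_per_month,
                   automated_hours_per_month, hourly_rate):
    """Months until the one-off build cost is recovered by saved analyst hours."""
    monthly_saving = (manual_hours_per_month - automated_hours_per_month) * hourly_rate
    if monthly_saving <= 0:
        return float("inf")  # automation never pays off
    return build_cost / monthly_saving

# Hypothetical: 120 h to rebuild the Excel report in Cognos; 16 h/month of
# manual preparation replaced by 2 h/month of maintenance, at 40 EUR/h.
cost = 120 * 40
print(f"payback: {payback_months(cost, 16, 2, 40):.1f} months")
```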
258

Prise en compte des fluctuations spatio-temporelles pluies-débits pour une meilleure gestion de la ressource en eau et une meilleure évaluation des risques / Taking into account the space-time rainfall-discharge fluctuations to improve water resource management and risk assessment

Hoang, Cong Tuan 30 November 2011
Reducing the vulnerability and increasing the resilience of today's societies to heavy precipitation and floods requires better characterization of their very strong spatio-temporal variability, observable over a wide range of scales. Throughout this thesis we therefore highlight the methodological interest of a multifractal approach as the most appropriate way to analyze and simulate this variability. The thesis first addresses the problem of data quality, which depends closely on the effective temporal resolution of the measurements, and its influence on multifractal analysis and the determination of the scaling laws of precipitation processes; we emphasize the consequences for operational hydrology. We present the SERQUAL procedure, which quantifies this quality and selects the periods meeting the required quality criteria. A surprising result is that long rainfall time series often have an effective resolution of one hour, and rarely the advertised 5 minutes. The thesis then turns to the selected data to characterize the temporal structure and extreme behaviour of rainfall. We analyze the sources of uncertainty in the "classical" multifractal parameter-estimation methods and derive improvements that account, for example, for the finite size of samples and the limited dynamic range of sensors. These improvements are used to obtain the multifractal characteristics of high-resolution 5-minute rainfall for several French departments (38, 78, 83 and 94) and to address the question of precipitation evolution over recent decades in the context of climate change. This study is supported by the analysis of radar mosaics for three major events in the Paris region. Finally, the thesis highlights another application of the methods developed: karst hydrology. We discuss the multifractal characteristics of precipitation and discharge processes observed at different resolutions in two karst watersheds in the south of France and, using daily, 30-minute and 3-minute measurements, analyze the rainfall-discharge relationship within the multifractal framework. This is a major step towards defining a multi-scale rainfall-discharge model of karst watershed functioning.
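As a rough illustration of the moment-scaling analysis underlying the multifractal framework (not the thesis's refined estimators, which correct for sample size and sensor dynamics), the sketch below coarse-grains a toy multiplicative cascade and reads the scaling exponents K(q) off log-log slopes.

```python
# Trace-moment sketch: for a multifractal, log2 <eps_lambda^q> is linear in
# log2(lambda) with slope K(q), convex in q. Toy data, not real rainfall.
import numpy as np

def moment_scaling(series, qs, n_levels=8):
    """log2 moments of the dyadically coarse-grained, unit-mean series."""
    eps = series / series.mean()
    log_lams, moments = [], {q: [] for q in qs}
    for level in range(n_levels):
        block = 2 ** level
        n = (len(eps) // block) * block
        coarse = eps[:n].reshape(-1, block).mean(axis=1)
        log_lams.append(np.log2(len(series) / block))  # log2 of scale ratio
        for q in qs:
            moments[q].append(np.log2((coarse ** q).mean()))
    return np.array(log_lams), moments

# Toy conserved lognormal cascade standing in for a rainfall intensity series.
rng = np.random.default_rng(1)
x = np.array([1.0])
for _ in range(14):
    w = rng.lognormal(mean=-0.045, sigma=0.3, size=2 * len(x))  # E[w] ~ 1
    x = np.repeat(x, 2) * w

log_lams, moms = moment_scaling(x, qs=[0.5, 1.0, 1.5, 2.0])
for q, m in moms.items():
    print(f"q={q}: K(q) ~ {np.polyfit(log_lams, m, 1)[0]:.3f}")  # convex in q
```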
259

Investigating the influence of data quality on ecological niche models for alien plant invaders

Wolmarans, Rene 08 October 2010
Ecological niche modelling is a method for describing and predicting the geographic distribution of an organism. It quantifies the species-environment relationship by describing the association between the organism's occurrence records and the environmental characteristics at those points; more simply, these models attempt to capture the ecological niche that a particular organism occupies. A popular application of ecological niche models is predicting the potential distribution of invasive alien species in their introduced range: from a biodiversity conservation perspective, a pro-active approach to managing invasions is to predict the potential distribution of a species so that areas susceptible to invasion can be identified. The performance of ecological niche models and the accuracy of their potential range predictions depend on the quality of the data used to calibrate and evaluate the models. Models can be calibrated with three types of input data: native range occurrence records, introduced range occurrence records, or a combination of records from both ranges.

Native range occurrence records may suffer from geographical bias as a result of biased or incomplete sampling. When occurrence records are geographically biased, the underlying environmental gradients in which a species can persist are unlikely to be fully sampled, which could lead to underestimating the species' potential distribution in the introduced range. I investigated the impact of geographical bias in native range occurrence records on the performance of ecological niche models for 19 invasive plant species by simulating two geographical bias scenarios (six treatments) in the native range records. The simulated geographical bias was sufficient to produce significant environmental bias across treatments, yet I did not find a significant effect on model performance. This finding may, however, have been influenced by the quality of the testing dataset, so one should remain wary of the possible effects of geographical bias when calibrating models with native range records or combinations thereof.

Models calibrated with introduced range records face a different problem: uncertainty about the species' equilibrium status and introduction history can affect data quality and thus model performance. A recently introduced species is unlikely to be in equilibrium with the environment, as too little time has elapsed for it to disperse to all suitable areas; the available occurrence records are then unlikely to capture its full environmental niche and will underestimate the potential distribution. I compared model performance for seven invasive alien plant species under different simulated introduction histories, calibrating with native range records, introduced range records, or a combination of records from both ranges. Single-introduction, multiple-introduction, and well-established scenarios were simulated from the introduced range records available for each species. Model performance did not differ significantly among the three input data types under the single- or multiple-introduction scenarios, indicating that these datasets probably described enough of the species' environmental niche to support accurate predictions. Under the well-established scenario, however, performance differed significantly between models calibrated with introduced range records and those calibrated with a combination of records from both ranges. Further research is recommended to fully understand the effects of introduction history on the niche of a species. Dissertation (MSc), University of Pretoria, 2009.
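The following toy sketch mimics the logic of the bias experiment (the dissertation used real occurrence records and proper niche models): occurrence data are truncated geographically before fitting, and predictions in the unsampled region are compared. All data below are synthetic.

```python
# Synthetic demo: geographic truncation of training records -> the fitted
# "niche model" (here a plain logistic regression on one environmental
# variable) extrapolates in the unsampled region. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 2000
lon = rng.uniform(0, 10, n)                    # "geography"
temp = lon * 2 + rng.normal(0, 1, n)           # environment correlated with it
present = (rng.random(n) < 1 / (1 + np.exp(-(temp - 10)))).astype(int)

def fit_and_predict(mask):
    model = LogisticRegression().fit(temp[mask, None], present[mask])
    return model.predict_proba(temp[:, None])[:, 1]

full = fit_and_predict(np.ones(n, bool))
biased = fit_and_predict(lon < 4)              # sampled only in the "west"
# The biased model never saw the warm east, so it extrapolates there.
print("mean predicted suitability in the unsampled east:")
print("  full records:  ", full[lon >= 4].mean().round(2))
print("  biased records:", biased[lon >= 4].mean().round(2))
```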
260

Počítačová podpora pro monitoring a hodnocení kvality dat v klinickém výzkumu / Computer-aided data quality monitoring and assessment in clinical research

Šiška, Branislav January 2018
This diploma thesis deals with the monitoring and evaluation of data quality in clinical research. The usual way to identify incorrect data is to apply one-dimensional statistical methods to each variable of the registry separately. The proposed method instead works directly on the database and finds outliers using machine learning combined with multidimensional statistical methods that transform all column variables of the clinical registry into a single one representing one patient record in the registry. The algorithm of the proposed method is written in Matlab.
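The thesis implements its method in Matlab; as an illustration of the multivariate idea it describes, here is a Python sketch that flags whole patient records by Mahalanobis distance instead of per-variable range checks. The variables, values, and threshold are hypothetical.

```python
# Flag records that are jointly unusual even when each variable looks
# plausible on its own. Sketch of the general approach, not the thesis's code.
import numpy as np

def mahalanobis_outliers(X, threshold=3.0):
    """Boolean mask of rows whose Mahalanobis distance from the multivariate
    mean exceeds `threshold` (roughly 'k sigma' in one-dimensional terms)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    inv = np.linalg.pinv(cov)                  # pinv tolerates collinearity
    d = np.sqrt(np.einsum("ij,jk,ik->i", X - mu, inv, X - mu))
    return d > threshold

rng = np.random.default_rng(7)
# Hypothetical registry: systolic BP and heart rate, correlated in controls.
healthy = rng.multivariate_normal([120, 70], [[100, 40], [40, 64]], size=500)
bad = np.array([[80, 140], [200, 45]])         # plausible alone, odd jointly
X = np.vstack([healthy, bad])
print(np.where(mahalanobis_outliers(X))[0])    # injected rows 500, 501 flagged
```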
