81
An analysis of the impact of data errors on backorder rates in the F404 engine system
Burson, Patrick A. R., 03 1900
Approved for public release; distribution is unlimited.

In the management of U.S. Naval inventory, data quality is of critical importance. Errors in major inventory databases contribute to increased operational costs, reduced revenue, and loss of confidence in the reliability of the supply system. Maintaining error-free databases is not a realistic objective, so data-quality efforts must be prioritized to ensure that limited resources are allocated where they achieve the maximum benefit. This thesis proposes a methodology to assist the Naval Inventory Control Point in prioritizing its data-quality efforts. By linking data errors to Naval inventory performance metrics, statistical testing is used to identify the errors with the greatest adverse impact on inventory operations. By focusing remediation on errors identified in this manner, the Navy can make the best use of the limited resources devoted to improving data quality. Two inventory performance metrics are considered: Supply Material Availability (SMA), an established metric in Naval inventory management, and the Backorder Persistence Metric (BPM), which is developed in the thesis. Backorder persistence measures the duration of time that the ratio of backorders to quarterly demand exceeds a threshold value. Both metrics can be used together to target remediation at reducing shortage costs and improving inventory system performance.

Lieutenant Commander, Supply Corps, United States Navy
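A minimal sketch of the persistence idea described in the abstract: the longest run of consecutive quarters in which the backorder-to-demand ratio stays above a threshold. The function name, threshold value, and sample figures are illustrative assumptions, not the thesis's exact definition.

```python
def backorder_persistence(backorders, demand, threshold=0.25):
    """Longest run of consecutive quarters with backorders/demand above threshold."""
    longest = current = 0
    for b, d in zip(backorders, demand):
        # Zero demand with outstanding backorders counts as a breach.
        ratio = b / d if d else (float("inf") if b else 0.0)
        if ratio > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

# Four quarters of data for one stock item: quarters 2 and 3 breach the threshold.
print(backorder_persistence([5, 8, 9, 1], [20, 25, 30, 40]))  # -> 2
```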
82
Performance du calorimètre à argon liquide et recherche du boson de Higgs dans son canal de désintégration H -> ZZ* -> 4l avec l'expérience ATLAS auprès du LHC / Performance of the liquid argon calorimeter, search and study of the Higgs boson in the channel H -> ZZ* -> 4l with the ATLAS detector
Tiouchichine, Elodie, 14 November 2014
The work presented in this thesis, carried out within the ATLAS collaboration, took place in the context of the discovery of a new particle at the LHC in the search for the Standard Model Higgs boson. My contribution to the Higgs boson search focuses on the H -> ZZ* -> 4l channel at several levels, from data taking to physics analysis. After a theoretical introduction, the LHC and the ATLAS detector are presented, together with their performance during the 2011 and 2012 runs. Particular attention is given to the liquid argon calorimeters and to the data quality assessment of this system. The validation of data recorded during non-nominal high-voltage conditions of the liquid argon calorimeters is presented; this study recovered 2% of the collected data, making them available for physics analysis. This has a direct impact on the H -> ZZ* -> 4l channel, where the number of expected signal events is very low. To optimize the acceptance of the four-electron decay channel, novel electron reconstruction algorithms were introduced in 2012, and the measurement of their efficiency is presented. The efficiency gain, reaching 7% for low transverse-energy electrons (15 < E_T < 20 GeV), is one of the main improvements in the H -> ZZ* -> 4l analysis presented for the 2012 data. The methods for estimating the reducible background in channels containing electrons in the final state, which were of primary importance in the period following the discovery, are described in detail. Finally, measurements of the properties of the discovered boson, based on the 2011 and 2012 data, are presented.
83
Rámec hodnocení problémů s datovou kvalitou / Framework for data quality assessment
Šíp, Libor, January 2009
The goal of this thesis is to produce a tool that allows fast, conclusive calculation of the impacts of problems caused by poor data quality. Before building such a tool, it is necessary to become familiar with the issues surrounding data quality: what data quality is, how to manage it, how to evaluate it, and finally how to choose the right solution. The first chapter is about data quality and presents it from different viewpoints: it explains the meaning of data quality, and its state in companies is illustrated by research done by specialized organizations. How to manage data quality across a whole company is described in chapter two, which deals with data governance. The third chapter explains what a business case is and why one should be written; it also presents the key elements of a business case and provides an approach for calculating the costs and benefits of the project. The fourth chapter describes potential threats related to such decisions and how to minimize them. The fifth chapter presents examples of data quality problems in an organization, attempts to evaluate each problem and its causes, and concludes with suggestions of potential solutions with respect to the available financial resources, together with a comparison of the quality of those solutions.
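As a sketch of the kind of cost/benefit arithmetic a data-quality business case might present (all figures and the discount rate are invented for illustration; this is not the thesis's actual model):

```python
def npv(cash_flows, rate=0.08):
    """Net present value of yearly net cash flows, year 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

costs    = [-120_000, -20_000, -20_000]   # remediation tooling plus yearly upkeep
benefits = [0, 90_000, 110_000]           # savings from fewer data-quality incidents
net      = [c + b for c, b in zip(costs, benefits)]

print(round(npv(net), 2))  # positive NPV suggests the project pays off
```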
84
Avaliação da qualidade do dado espacial digital de acordo com parâmetros estabelecidos por usuários / Digital spatial data quality evaluation based on user parameters
Salisso Filho, João Luiz, 02 May 2013
Spatial information is increasingly widespread in the everyday life of ordinary citizens, businesses, and government institutions. Applications like Google Earth, Bing Maps, and GPS-based location services present spatial data as a commodity. More and more public and private organizations incorporate spatial data into their decision processes, making the quality of this kind of data ever more critical. Given the multidisciplinary nature and, especially, the volume of information available to users, a data quality evaluation method supported by computational processes is needed, one that enables users to assess the true fitness for use that such data have for an intended purpose. This Master's dissertation proposes a structured, computer-supported methodology for evaluating spatial data. The methodology, based on standards published by the International Organization for Standardization (ISO), allows users of spatial data to evaluate its quality against quality parameters established by the users themselves. It also allows users to compare the quality exhibited by the spatial data with the quality information provided by the data producer. In this way, the method helps users determine the real fitness of the spatial data for their intended use.
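A minimal sketch of the comparison the abstract describes: measured quality elements checked against user-defined conformance levels. The element names, units, and thresholds are illustrative assumptions in the spirit of the ISO quality elements, not the dissertation's actual parameter set.

```python
# User's maximum tolerated error per quality element (assumed values).
user_requirements = {
    "positional_accuracy_m": 2.5,        # RMSE in metres
    "completeness_omission_pct": 5.0,    # share of missing features
    "thematic_accuracy_pct_error": 10.0, # share of misclassified features
}

# Values reported by the producer or measured on the dataset (assumed values).
measured = {
    "positional_accuracy_m": 1.8,
    "completeness_omission_pct": 7.2,
    "thematic_accuracy_pct_error": 4.0,
}

for element, limit in user_requirements.items():
    verdict = "conforms" if measured[element] <= limit else "does not conform"
    print(f"{element}: measured {measured[element]} vs limit {limit} -> {verdict}")
```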
85
Metodologia para controle de qualidade de cartas topográficas digitais / Quality control methodology for digital topographic maps
Inui, Cesar, 19 December 2006
Many cartography companies today use CAD systems to produce digital topographic maps. This work proposes the identification and classification of graphic-attribute errors in digital mapping, especially in spatial data built with CAD (Computer Aided Design) tools. If the data will later be used in a Geographic Information System, the spatial data should be collected in a way that makes it easy to build topology (spatial relationships) after the data transfer. As a secondary objective, the work proposes improved quality control, demonstrating a logical sequence of tasks for reviewing and correcting problems in spatial data.
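A sketch of what flagging graphic-attribute errors against a layer specification could look like; the specification, attribute names, and feature records are invented for illustration and are not taken from the thesis.

```python
# Expected drawing attributes per CAD layer (assumed specification).
layer_spec = {
    "contour": {"color": 3, "linetype": "CONTINUOUS"},
    "road":    {"color": 1, "linetype": "DASHED"},
}

# Features as exported from the CAD file (assumed records).
features = [
    {"id": 1, "layer": "contour", "color": 3, "linetype": "CONTINUOUS"},
    {"id": 2, "layer": "road",    "color": 7, "linetype": "DASHED"},  # wrong color
]

for f in features:
    expected = layer_spec[f["layer"]]
    errors = [k for k in ("color", "linetype") if f[k] != expected[k]]
    if errors:
        print(f"feature {f['id']} on layer '{f['layer']}': bad {', '.join(errors)}")
```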
86
Extensions de BPMN 2.0 et méthode de gestion de la qualité pour l'interopérabilité des données / BPMN 2.0 extensions and quality management method for enterprise data interoperability
Heguy, Xabier, 13 December 2018
Business Process Model and Notation (BPMN) is becoming the most widely used standard for business process modelling. One of the important upgrades of BPMN 2.0 with respect to BPMN 1.2 is that Data Objects now carry semantic elements. Nevertheless, BPMN does not enable the representation of performance measurements when interoperability problems affect the exchanged data objects, which remains a limitation when using BPMN to express interoperability issues in enterprise processes. We propose to extend the Meta-Object Facility meta-model and the XML Schema Definition of BPMN, as well as its graphical notation, in order to fill this gap. The extension, named performanceMeasurement, is defined using the BPMN extension mechanism. This new element allows performance measurements to be represented both for cases where interoperability problems exist and for cases where those problems have been solved. We illustrate the use of this extension with an example from a real industrial case.
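A sketch of how such an element might sit inside a standard BPMN 2.0 extensionElements block. Only the BPMN namespace and extensionElements are standard; the pm namespace, attribute names, and values are assumptions for illustration, not the thesis's actual schema.

```python
import xml.etree.ElementTree as ET

BPMN = "http://www.omg.org/spec/BPMN/20100524/MODEL"
PM = "http://example.org/performanceMeasurement"  # hypothetical extension namespace

ET.register_namespace("bpmn", BPMN)
ET.register_namespace("pm", PM)

# A task carrying a hypothetical performanceMeasurement extension element.
task = ET.Element(f"{{{BPMN}}}task", {"id": "Task_SendOrder"})
ext = ET.SubElement(task, f"{{{BPMN}}}extensionElements")
ET.SubElement(ext, f"{{{PM}}}performanceMeasurement",
              {"indicator": "dataExchangeTime", "value": "42", "unit": "min"})

print(ET.tostring(task, encoding="unicode"))
```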
87
Exploratory Visualization of Data with Variable Quality
Huang, Shiping, 11 January 2005
Data quality, which refers to the correctness, uncertainty, completeness, and other aspects of data, has become an increasingly prevalent concern addressed across multiple disciplines. Quality issues can be introduced in any data manipulation process, such as data collection, transformation, and visualization. Data visualization is a process of data mining and analysis using graphical presentation and interpretation. The correctness and completeness of discoveries made through visualization depend to a large extent on the quality of the original data. Without integrating quality information into the data presentation, visual analysis is incomplete at best and can lead to inaccurate or incorrect conclusions at worst. This thesis addresses the issue of data quality visualization. Incorporating data quality measures into data displays is challenging in that the display is apt to become cluttered when faced with many dimensions and data records. We investigate the incorporation of data quality information into traditional multivariate display techniques, and we also develop novel visualization and interaction tools that operate in data quality space. We validate our results using several data sets that have variable quality associated with dimensions, records, and data values.
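One common way to fold a per-record quality measure into a multivariate display is to map quality to point opacity so that low-quality records visually recede. The sketch below illustrates that general idea with synthetic data; it is not the specific tooling developed in the thesis.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x, y = rng.normal(size=200), rng.normal(size=200)
quality = rng.uniform(0.1, 1.0, size=200)  # 1.0 = fully trusted record (assumed scale)

# Encode quality in the alpha channel of each point's RGBA color.
colors = np.zeros((200, 4))
colors[:, 2] = 0.8        # blue channel
colors[:, 3] = quality    # alpha channel carries the quality measure

plt.scatter(x, y, c=colors, s=30)
plt.title("Per-record quality mapped to point opacity")
plt.show()
```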
88
Quality based approach for updating geographic authoritative datasets from crowdsourced GPS traces / Une approche basée sur la qualité pour mettre à jour les bases de données géographiques de référence à partir de traces GPS issues de la foule
Ivanovic, Stefan, 19 January 2018
Nowadays, the need for up-to-date authoritative spatial data has significantly increased, and fulfilling it requires continuously updating existing authoritative datasets, a task that is highly demanding both technically and financially. In road networks, three types of roads are particularly challenging to keep up to date: footpaths, tractor roads, and bicycle roads. They are challenging because of their intermittent nature (they appear and disappear frequently) and the varied landscapes they cross (forest, high mountains, seashore, etc.). At the same time, GPS data voluntarily collected by the crowd is widely available in large quantities. The number of people recording GPS data, such as GPS traces, has been steadily increasing, especially during sport and leisure activities. The traces are made openly available and popularized on social networks, blogs, and the websites of sport and tourist associations. However, their current use is limited to very basic metric analyses such as the total time of a trace, average speed, or average elevation. The main reasons are the high variation in spatial quality from point to point within a trace, as well as the lack of protocols and metadata (e.g. the precision of the GPS device used).

The global context of our work is the use of GPS hiking and mountain-bike traces collected by volunteers (VGI traces) to detect potential updates of footpaths, tractor roads, and bicycle roads in authoritative datasets. Particular attention is paid to roads that exist in reality but are not represented in authoritative datasets (missing roads). The approach we propose consists of three phases. The first phase evaluates and improves the quality of the VGI traces: outlying points are filtered with a machine-learning-based approach, along with points that result from secondary human behaviour (activities outside the main itinerary); the remaining points are then classified as low- or high-accuracy using rule-based machine-learning classification. The second phase detects potential updates. For that purpose, a growing-buffer data-matching solution is proposed, in which the buffer size is adapted to the accuracy class of each GPS point in order to handle the large variations in VGI trace accuracy. As a result, the parts of traces left unmatched to the authoritative road network are obtained and considered candidates for missing roads. Finally, in the third phase we propose a multi-criteria decision method that accepts or rejects these candidate updates by assigning each potential missing road a degree of confidence. The approach was tested on multi-source VGI GPS traces from the Vosges area; missing roads in the IGN authoritative database BDTopo® were successfully detected and proposed as potential updates.
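A minimal sketch of the buffer-matching idea under stated assumptions: each GPS point is matched to the road network using a buffer sized by its accuracy class, and points that stay unmatched become candidate evidence for missing roads. Buffer sizes, accuracy labels, and geometries are invented for illustration, not the thesis's calibrated values.

```python
from shapely.geometry import Point, LineString

road = LineString([(0, 0), (100, 0)])     # a stretch of authoritative road
buffer_m = {"high": 5.0, "low": 15.0}     # matching distance per accuracy class (assumed)

trace = [
    (Point(10, 3), "high"),   # near the road and accurate -> matched
    (Point(50, 12), "low"),   # noisy but within its wide buffer -> matched
    (Point(80, 40), "low"),   # far from any road -> missing-road candidate
]

unmatched = [p for p, acc in trace if p.distance(road) > buffer_m[acc]]
print(f"{len(unmatched)} candidate point(s) for missing roads")  # -> 1
```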
89
Social data mining for crime intelligence: contributions to social data quality assessment and prediction methods
Isah, Haruna, January 2017
With the advancement of the Internet and related technologies, many traditional crimes have made the leap to digital environments. The successes of data mining in a wide variety of disciplines have given birth to crime analysis. Traditional crime analysis is mainly focused on understanding crime patterns; however, it is unsuitable for identifying and monitoring emerging crimes. The true nature of crime remains buried in unstructured content that represents the hidden story behind the data. User feedback leaves valuable traces that can be used to measure the quality of various aspects of products or services, and can also be used to detect, infer, or predict crimes. As in any application of data mining, the data must meet a high quality standard in order to avoid erroneous conclusions. This thesis presents a methodology and practical experiments towards discovering whether (i) user feedback can be harnessed and processed for crime intelligence, (ii) criminal associations, structures, and roles can be inferred among entities involved in a crime, and (iii) methods and standards can be developed for measuring, predicting, and comparing the quality of social data instances and samples. It contributes to the theory, design, and development of a novel framework for crime intelligence and an algorithm for estimating social data quality that innovatively adapts methods for monitoring water contaminants. Several experiments were conducted, and the results revealed the significance of this study for mining social data for crime intelligence and for developing social data quality filters and decision support systems.
90
Metodika auditu datové kvality / Data Quality Audit Methodology
Kotek, Aleš, January 2008
The goal of this thesis is to summarize and describe the know-how and experience of Adastra employees related to data quality audits in organizations. The thesis should serve as a guideline for sales and implementation staff within Adastra Corp. The first part (chapters 2 and 3) is concerned with data quality in general: it provides various definitions of data quality, points out its importance and relevance in organizations, and describes the most important tools and data quality management solutions. The second part (chapters 4 and 5) builds on this theoretical basis and forms the main methodological part of the thesis. Chapter 4 focuses on the business/sales side, defines the most important terms and principles used, and is a necessary precondition for correctly understanding the following chapter. Chapter 5 presents the detailed procedures of a data quality audit; individual activities are written in a standardized form to ensure clear, accurate, and brief step descriptions. The result of this thesis is the most detailed description of the data quality audit at Adastra Corp., including all identified services and products.