91

Improve Data Quality By Using Dependencies And Regular Expressions

Feng, Yuan January 2018 (has links)
The objective of this study is to find ways to improve the quality of data stored in databases. Such data suffer from many problems, such as missing values and spelling errors. To deal with dirty data, this study adopts conditional functional dependencies and regular expressions to detect and correct errors. Building on earlier studies of data cleaning methods, this study considers more complex database conditions and combines efficient algorithms to process the data. The results show that these methods can improve database quality, although, considering time and space complexity, much remains to be done to make the data cleaning process more efficient.
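As a rough illustration of the approach described above (not the thesis's implementation), the sketch below detects a format error with a regular expression and repairs a record using a small conditional functional dependency; the field names, the ZIP pattern, and the CFD tableau are invented for this example.

```python
import re

ZIP_RE = re.compile(r"^\d{5}$")  # hypothetical format rule for a US ZIP code

# CFD pattern tableau (assumed): for country == 'US', the zip code determines the city.
CFD_TABLEAU = {
    ("US", "10001"): "New York",
    ("US", "60601"): "Chicago",
}

def detect_and_repair(record):
    """Flag regex violations and repair CFD violations when a correction is known."""
    errors = []
    if not ZIP_RE.match(record.get("zip", "")):
        errors.append("zip fails format check")
    expected_city = CFD_TABLEAU.get((record.get("country"), record.get("zip")))
    if expected_city and record.get("city") != expected_city:
        errors.append(f"CFD violation: city should be '{expected_city}'")
        record["city"] = expected_city  # correct the value using the dependency
    return record, errors

if __name__ == "__main__":
    dirty = {"country": "US", "zip": "10001", "city": "Nw York"}
    print(detect_and_repair(dirty))
```

Regular expressions catch syntactic errors in a single field, while the dependency supplies a trusted value that can also be used for repair, which is the combination the abstract describes.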
92

Bias in epidemiologic studies: considerations on the role of clinical monitoring in randomized clinical trials

Miyaoka, Tatiana Midori 20 October 2015 (has links)
Background: Evidence-based clinical practice uses results from well-designed and well-conducted studies which, compiled in systematic reviews, assist health professionals and guide them, in a synthetic and up-to-date way, in managing treatments. In a well-conducted study, the collected data will be of good quality if they are obtained from well-defined protocols that include guidance for patient follow-up and standardized procedures for the professionals involved. Study monitoring makes it possible to follow and control the execution of the actions defined in the protocol, so that the final results are free of selection, performance, detection, attrition, and reporting bias. Among the instruments that evaluate the quality of reporting of clinical trials, none highlights the evaluation of monitoring actions, which, in our view, are an important element for ensuring data quality. Objective: To reflect on bias in randomized clinical trials and on the role of study monitoring in its control and prevention. Methods: Methodological study that evaluated the quality of randomized clinical trials included in a systematic review, chosen ad hoc, on the use of statins for the primary prevention of cardiovascular disease. The original studies included in the systematic review were analysed for bias using the tool for assessing risk of bias in randomized clinical trials described in the Cochrane Handbook for Systematic Reviews of Interventions, version 5.1.0. Monitoring actions that could help minimize or possibly eliminate the biases were identified and described in detail, and the original articles were searched for descriptions of monitoring-related actions. Results: Considering the criteria for the possible occurrence of each of the seven types of bias, the BONE, CARDS, METEOR, and MRC/BHF studies presented the highest percentage (85.7 per cent) of low risk of bias, possibly indicating good methodological quality. In contrast, this percentage was below 50 per cent in four studies (ASPEN, CERDIA, HYRIM, and KAPS), indicating lower methodological quality. All studies were classified as unclear risk for other sources of bias because they were sponsored by the pharmaceutical industry, which in our evaluation represents a conflict of interest. The AFCAPS/TexCAPS study indicated that a research organization company was contracted by the sponsor for administrative, clinical, and data management; however, no further details about the monitoring were described. In this study, random sequence generation, allocation concealment, and other biases were rated as unclear risk, and the remaining potential biases were classified as low risk. Conclusions: The present work shows that even a clinical trial that is well designed, well reported, and assessed as being at low risk of bias is still subject to the occurrence of bias during its conduct. We consider it necessary to include a specific item on conflict-of-interest bias in instruments for assessing the methodological quality of studies. We emphasize the role of monitoring in avoiding or minimizing systematic errors, ensuring that the study is carried out as initially proposed.
93

EVALUATE PROBE SPEED DATA QUALITY TO IMPROVE TRANSPORTATION MODELING

Rahman, Fahmida 01 January 2019 (has links)
Probe speed data are widely used to calculate performance measures that quantify state-wide traffic conditions. Estimating accurate performance measures requires an adequate number of speed observations; however, probe vehicles reporting speed data may not be available at all times on every road segment. Agencies therefore need a good understanding of the adequacy of these reported data before using them in transportation applications. This study systematically assesses the quality of probe data by proposing a method that determines the minimum sample rate for checking data adequacy. The minimum sample rate is defined as the minimum amount of speed data required for a segment to ensure that speed estimates fall within a defined error range. The proposed method adopts a bootstrapping approach to determine the minimum sample rate at a pre-defined acceptance level. Applying the method to the speed data yields a minimum sample rate of 10% for Kentucky's roads; this cut-off value helps identify the segments where data availability exceeds the minimum sample rate. The study also shows two applications of the minimum sample rates resulting from the bootstrapping. First, the results are used to identify the geometric and operational factors that contribute to the minimum sample rate of a facility. Using a random forest regression model, functional class, section length, and speed limit are found to be the significant variables for uninterrupted facilities, whereas for interrupted facilities the significant variables are signal density, section length, speed limit, and intersection density. Lastly, the speed data associated with the segments are applied to improve Free Flow Speed estimation by the traditional model.
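The bootstrapping idea can be sketched roughly as follows; the parameter names, the ±2 mph tolerance, the 95% acceptance level, and the synthetic speed observations are assumptions made for illustration, not values or code from the study.

```python
import numpy as np

def minimum_sample_rate(speeds, tolerance=2.0, acceptance=0.95,
                        rates=np.arange(0.05, 1.01, 0.05), n_boot=1000, seed=0):
    """Smallest sampling rate whose bootstrapped mean speed stays within
    +/- tolerance of the full-sample mean in at least `acceptance` of draws."""
    rng = np.random.default_rng(seed)
    speeds = np.asarray(speeds, dtype=float)
    full_mean = speeds.mean()
    for rate in rates:
        k = max(1, int(round(rate * len(speeds))))
        hits = sum(
            abs(rng.choice(speeds, size=k, replace=True).mean() - full_mean) <= tolerance
            for _ in range(n_boot)
        )
        if hits / n_boot >= acceptance:
            return float(rate)
    return 1.0

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    segment_speeds = rng.normal(55, 6, size=500)   # synthetic probe speeds (mph)
    print(f"minimum sample rate: {minimum_sample_rate(segment_speeds):.0%}")
```

Scanning candidate rates from low to high and returning the first one that meets the acceptance level mirrors the definition of the minimum sample rate given in the abstract.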
94

Multi-spectral remote sensing of native vegetation condition

Sheffield, Kathryn Jane, kathryn.sheffield@dpi.vic.gov.au January 2009 (has links)
Native vegetation condition provides an indication of the state of vegetation health or function relative to a stated objective or benchmark. Measures of vegetation condition provide an indication of the vegetation's capacity to provide habitat for a range of species and ecosystem functions through the assessment of selected vegetation attributes. Subsets of vegetation attributes are often combined into vegetation condition indices or metrics, which are used to provide information for natural resource management. Despite their value as surrogates of biota and ecosystem function, measures of vegetation condition are rarely used to inform biodiversity assessments at scales beyond individual stands. The extension of vegetation condition information across landscapes using remote sensing technologies, and approaches for achieving this, is a key focus of the work presented in this thesis. The aim of this research is to assess the utility of multi-spectral remotely sensed data for the recovery of stand-level attributes of native vegetation condition at landscape scales. The use of remotely sensed data for the assessment of vegetation condition attributes in fragmented landscapes is a focus of this study. The influence of a number of practical issues, such as spatial scale and ground data sampling methodology, is also explored. This study identifies limitations on the use of this technology for vegetation condition assessment and also demonstrates the practical impact of data quality issues that are frequently encountered in these types of applied integrated approaches. The work presented in this thesis demonstrates that while some measures of vegetation condition, such as vegetation cover and stem density, are readily recoverable from multi-spectral remotely sensed data, others, such as hollow-bearing trees and log length, are not easily derived from this type of data. The types of information derived from remotely sensed data, such as texture measures and vegetation indices, that are useful for vegetation condition assessments of this nature are also highlighted. The utility of multi-spectral remotely sensed data for the assessment of stand-level vegetation condition attributes is highly dependent on a number of factors including the type of attribute being measured, the characteristics of the vegetation, the sensor characteristics (i.e. the spatial, spectral, temporal, and radiometric resolution), and other spatial data quality considerations, such as site homogeneity and spatial scale. A series of case studies presented in this thesis explores the effects of these factors. These case studies demonstrate the importance of different aspects of spatial data and how data manipulation can greatly affect the derived relationships between vegetation attributes and remotely sensed data. The work documented in this thesis provides an assessment of what can be achieved from two sources of multi-spectral imagery in terms of recovery of individual vegetation attributes from remotely sensed data. Potential surrogate measures of vegetation condition that can be derived across broad scales are identified. This information could provide a basis for the development of landscape-scale vegetation condition assessment approaches based on multi-spectral remotely sensed data, supplementing information provided by established site-based vegetation condition assessment approaches.
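As an illustrative aside (not the thesis's processing chain), the sketch below derives two of the kinds of predictors the abstract mentions, a vegetation index (NDVI) and a simple texture measure, from synthetic red and near-infrared bands; the band arrays, window size, and availability of SciPy are assumptions.

```python
import numpy as np
from scipy.ndimage import generic_filter

def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared bands."""
    red = red.astype(float)
    nir = nir.astype(float)
    return (nir - red) / np.clip(nir + red, 1e-6, None)  # avoid divide-by-zero

def local_texture(band, window=3):
    """A simple texture surrogate: local standard deviation over a square window."""
    return generic_filter(band.astype(float), np.std, size=window)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    red = rng.integers(0, 255, (50, 50))   # synthetic band values
    nir = rng.integers(0, 255, (50, 50))
    print(ndvi(red, nir).mean(), local_texture(red).mean())
```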
95

The Design and Implementation of a Corporate Householding Knowledge Processor to Improve Data Quality

Madnick, Stuart, Wang, Richard, Xian, Xiang 06 February 2004 (has links)
Advances in Corporate Householding are needed to address certain categories of data quality problems caused by data misinterpretation. In this paper, we first summarize some of these data quality problems and our more recent results from studying corporate householding applications and knowledge exploration. Then we outline a technical approach to a Corporate Householding Knowledge Processor (CHKP) to solve a particularly important type of corporate householding problem - entity aggregation. We illustrate the operation of the CHKP using a motivating example in account consolidation. Our CHKP design and implementation uses and expands on the COntext INterchange (COIN) technology to manage and process corporate householding knowledge.
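A hypothetical sketch of the entity aggregation problem mentioned above: consolidating account balances by rolling subsidiaries up to an ultimate parent. The ownership mapping, entity names, and balances are invented for illustration and do not come from the paper or the COIN technology.

```python
# Assumed ownership structure: each entity maps to its immediate parent;
# the ultimate parent maps to itself.
PARENT = {
    "Acme Credit Corp": "Acme Holdings",
    "Acme Services": "Acme Holdings",
    "Acme Holdings": "Acme Holdings",
}

# Invented account balances held by the individual legal entities.
ACCOUNTS = [
    ("Acme Credit Corp", 120_000),
    ("Acme Services", 80_000),
    ("Acme Holdings", 250_000),
]

def consolidate(accounts, parent_map):
    """Aggregate balances by ultimate parent entity (one view of entity aggregation)."""
    totals = {}
    for entity, balance in accounts:
        root = entity
        while parent_map.get(root, root) != root:  # walk up the ownership chain
            root = parent_map[root]
        totals[root] = totals.get(root, 0) + balance
    return totals

print(consolidate(ACCOUNTS, PARENT))  # {'Acme Holdings': 450000}
```

The hard part that the CHKP addresses is deciding which entities belong in the rollup for a given purpose (for example, legal versus operational views); the sketch assumes that mapping is already known.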
96

Quality Assurance in Quantitative Microbial Risk Assessment: Application of methods to a model for Salmonella in pork

Boone, Idesbald 31 January 2011 (has links)
Quantitative microbial risk assessment (QMRA) is increasingly used to support decision-making on food safety issues. Decision-makers need to know whether QMRA results can be trusted, especially when urgent and important decisions have to be made. This can be achieved by setting up a quality assurance (QA) framework for QMRA. A Belgian risk assessment project (the METZOON project), which aimed to assess the risk of human salmonellosis due to the consumption of fresh minced pork meat, was used as a case study to develop and implement QA methods for evaluating the quality of input data, expert opinion, model assumptions, and the QMRA model itself (the METZOON model). The first part of this thesis consists of a literature review of available QA methods of interest in QMRA (chapter 2). In the experimental part that follows, different QA methods were applied to the METZOON model. A structured expert elicitation study (chapter 4) was set up to fill in missing parameters for the METZOON model. Expert judgements were used to derive subjective probability density functions (PDFs) to quantify the uncertainty on the model input parameters. The elicitation was based on Cooke's classical model (Cooke, 1991), which aims to achieve a rational consensus about the elicitation protocol and allowed different weighting schemes for the aggregation of the experts' PDFs to be compared. Unique to this method is that the performance of experts as probability assessors was measured by their ability to correctly and precisely provide estimates for a set of seed variables (variables from the experts' area of expertise for which the true values were known to the analyst). The weighting scheme based on the experts' performance on a set of calibration variables was chosen to obtain the combined uncertainty distributions of the missing parameters for the METZOON model. A novel method for the assessment of data quality, known as the NUSAP (Numeral Unit Spread Assessment Pedigree) system (chapter 5), was tested to screen the quality of the METZOON input parameters. First, an inventory of the essential characteristics of the parameters, including the source of information, the sampling methodology, and distributional characteristics, was established. Subsequently, the quality of these parameters was evaluated and scored by experts using objective criteria (proxy, empirical basis, methodological rigour, and validation). The NUSAP method allowed the members of the risk assessment team to debate the quality of the parameters in a structured format. The quality evaluation was supported by graphical representations, which facilitated decisions on the inclusion or exclusion of inputs in the model. It is well known that assumptions and subjective choices can have a large impact on the output of a risk assessment. To assess the value-ladenness (degree of subjectivity) of assumptions in the METZOON model, a structured approach based on the protocol by Kloprogge et al. (2005) was chosen (chapter 6). The key assumptions of the METZOON model were first identified and then evaluated by experts in a workshop using four criteria: the influence of situational limitations, plausibility, choice space, and agreement among peers.
The quality of the assumptions was represented graphically (using kite diagrams, pedigree charts, and diagnostic diagrams), which made it possible to identify assumptions characterised by a high degree of subjectivity and a high expected influence on the model results; these can be considered weak links in the model. The quality assessment of the assumptions was taken into account to modify parts of the METZOON model and helps to increase transparency in the QMRA process. In a final application of a QA method, a quality audit checklist (Paisley, 2007) was used to critically review and score the quality of the METZOON model and to identify its strengths and weaknesses (chapter 7). A high total score (87%) was obtained when the METZOON model was reviewed with the Paisley checklist. A higher score would have been obtained if the model had been subjected to external peer review and if a sensitivity analysis, validation of the model with recent data, and updating or replacement of expert judgement data with empirical data had been carried out. It would also be advisable to repeat the NUSAP/Pedigree analysis on the input data and assumptions of the final model. The checklist can be used in its current form to evaluate QMRA models and to support model improvements from the early phases of development up to the finalised model, for internal as well as external peer review of QMRAs. The applied QA methods were found useful for improving transparency in the QMRA process and for opening the debate about the relevance (fitness for purpose) of a QMRA. A pragmatic approach that combines several QA methods is recommended, as the application of one QA method often facilitates the application of another. Many QA methods (NUSAP, structured expert judgement, checklists) are, however, not yet described, or only insufficiently described, in QMRA-related guidelines (at EFSA and WHO level). The time and resources required are another limiting factor. Understanding the degree of quality required from a QMRA calls for clear communication with the risk managers. It is therefore necessary to strengthen training in QA methods and in the communication of their results. Appreciation of the usefulness of these QA methods among risk analysis actors is likely to grow once they have been tested in a larger number of QMRAs.
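To make the performance-based weighting concrete, the sketch below combines three experts' quantile assessments into a weighted mixture distribution, in the spirit of Cooke's classical model but greatly simplified; the weights, quantiles, and piecewise-linear sampling are illustrative assumptions rather than the METZOON elicitation.

```python
import numpy as np

# Hypothetical performance-based weights for three experts
# (e.g., calibration x information scores, normalised to sum to one).
weights = np.array([0.6, 0.3, 0.1])

# Each expert's 5th, 50th and 95th percentile assessment for one unknown parameter.
expert_quantiles = [
    (0.02, 0.08, 0.20),
    (0.05, 0.15, 0.35),
    (0.01, 0.05, 0.10),
]

def sample_expert(q5, q50, q95, size, rng):
    """Crude draw from a piecewise-linear CDF through the three quantiles
    (values outside the 5th-95th range are clamped to the end quantiles)."""
    u = rng.uniform(0.0, 1.0, size)
    return np.interp(u, [0.05, 0.50, 0.95], [q5, q50, q95])

def combined_sample(size=10_000, seed=0):
    """Draw from the weighted mixture of the experts' distributions."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(weights), size=size, p=weights)
    out = np.empty(size)
    for i, q in enumerate(expert_quantiles):
        mask = picks == i
        out[mask] = sample_expert(*q, mask.sum(), rng)
    return out

draws = combined_sample()
print(f"combined median: {np.median(draws):.3f}")
```

In the classical model the weights come from the experts' calibration and information scores on the seed variables; here they are simply assumed in order to show how the aggregation step works.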
97

Data Quality as a Key Issue of Quality Assurance in Higher Education

Pohlenz, Philipp January 2008 (has links)
Universities increasingly face a legitimation problem regarding their handling of (publicly provided) resources. The criticism mainly concerns teaching, which is said to be ineffectively organised and, through poor study conditions for which the universities themselves are held responsible, to contribute to long study times and high drop-out rates. It is claimed that students' lifetime is handled irresponsibly and that the societal educational mandate is not adequately fulfilled, either by the university as a whole or by individual teachers. In order to meet the simultaneously growing demand for academic education, universities are transforming themselves into service enterprises whose performance is measured by the efficiency of what they offer. This model is inspired by the steering principles of New Public Management, under which the state withdraws from its traditionally close relationship with the universities and grants them local autonomy, for example by introducing global budgets for their financial self-management. Universities become market actors that prevail over their competitors in the contest for customers by demonstrating quality and excellence. Different evaluation procedures are used to carry out the corresponding performance comparisons. These commonly involve data from higher education statistics, for example in the form of graduation rates, and increasingly survey data, mostly from students, collected to capture their quality assessments of teaching and study conditions. The latter in particular are often criticised as unsuitable for adequately representing the quality of teaching; rather, their informational value is said to be limited by subjective distortions. An assessment based on student survey data would accordingly lead to misjudgements and, consequently, to unjust performance sanctions. For evaluation procedures to be accepted as an instrument of internal quality assurance and quality development, it is therefore necessary to examine to what extent impairments of the validity of the data used for steering universities reduce their informative value. Based on the corresponding results, the procedures can be further developed. This question is at the centre of the present work.
98

A Model for Managing Data Integrity

Mallur, Vikram 22 September 2011 (has links)
Consistent, accurate and timely data are essential to the functioning of a modern organization. Managing the integrity of an organization’s data assets in a systematic manner is a challenging task in the face of continuous update, transformation and processing to support business operations. Classic approaches to constraint-based integrity focus on logical consistency within a database and reject any transaction that violates consistency, but leave unresolved how to fix or manage violations. More ad hoc approaches focus on the accuracy of the data and attempt to clean data assets after the fact, using queries to flag records with potential violations and using manual efforts to repair. Neither approach satisfactorily addresses the problem from an organizational point of view. In this thesis, we provide a conceptual model of constraint-based integrity management (CBIM) that flexibly combines both approaches in a systematic manner to provide improved integrity management. We perform a gap analysis that examines the criteria that are desirable for efficient management of data integrity. Our approach involves creating a Data Integrity Zone and an On Deck Zone in the database for separating the clean data from data that violates integrity constraints. We provide tool support for specifying constraints in a tabular form and generating triggers that flag violations of dependencies. We validate this by performing case studies on two systems used to manage healthcare data: PAL-IS and iMED-Learn. Our case studies show that using views to implement the zones does not cause any significant increase in the running time of a process.
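As a loose illustration of the zoning idea (the thesis itself implements the zones with database views and generated triggers), the sketch below routes records that satisfy all declared constraints into a Data Integrity Zone and holds violating records in an On Deck Zone; the constraints and patient records are invented for this example.

```python
# Declarative integrity constraints (invented): name -> predicate over a record.
CONSTRAINTS = {
    "age is non-negative": lambda r: r.get("age", 0) >= 0,
    "visit date present": lambda r: bool(r.get("visit_date")),
}

def route(records, constraints):
    """Split records into a clean zone and an 'on deck' zone of flagged violations."""
    integrity_zone, on_deck_zone = [], []
    for rec in records:
        violated = [name for name, check in constraints.items() if not check(rec)]
        if violated:
            on_deck_zone.append((rec, violated))  # held for later repair
        else:
            integrity_zone.append(rec)
    return integrity_zone, on_deck_zone

records = [
    {"patient_id": 1, "age": 42, "visit_date": "2011-03-02"},
    {"patient_id": 2, "age": -5, "visit_date": ""},
]
clean, on_deck = route(records, CONSTRAINTS)
print(len(clean), "record(s) in the Data Integrity Zone;", on_deck)
```

Keeping violating records in a separate zone, rather than rejecting the transaction or silently cleaning after the fact, is the middle ground between the two classic approaches described in the abstract.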
100

Exploring Swedish Hospitals’ Transition towards becoming more Data-Driven : A Qualitative Case Study of Two Swedish Hospitals

Carlson, Olof, Thunmarker, Viktor, Zetterberg, Mikael January 2012 (has links)
The Swedish health care sector must improve productivity in order to deal with an increased demand from an aging population with limited resources. In the tradition-driven health care sector, transitioning towards becoming more data-driven has been identified as a potential solution. This explorative qualitative case study explores how individual employees perceive this development at two Swedish hospitals. The results complement theory by presenting propositions that explain the drivers and barriers of the transition, as well as its outcomes as perceived by the employees. The study primarily concludes that (1) a lack of trust in data and a tradition of basing decisions on gut feeling, in conjunction with low IT competence, make hospital culture a major obstacle to the transition, and that (2) it is important to understand the employees' perceived outcomes of becoming data-driven, as these affect their support for the transition. The results provide a platform for future research to build on and are valuable for practitioners as they seek to utilize the drivers and mitigate the barriers.
