About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
241

Análise da qualidade da informação produzida por classificação baseada em orientação a objeto e SVM visando a estimativa do volume do reservatório Jaguari-Jacareí / Analysis of information quality in using OBIA and SVM classification to water volume estimation from Jaguari-Jacareí reservoir

Leão Junior, Emerson [UNESP] 25 April 2017 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / Considering the scenario during the 2014 water crisis and the critical situation of the Cantareira System reservoirs in São Paulo state, this study of the Jaguari-Jacareí reservoir extracted information from multispectral images and analyzed the quality of that information in terms of the accuracy of the computed water volume. First, the water surface was obtained by land-cover classification of RapidEye multispectral images acquired before and during the water crisis (2013 and 2014, respectively), using two distinct approaches: object-based classification (Object-Based Image Analysis, OBIA) and pixel-based classification (Support Vector Machine, SVM). Classification quality was evaluated through thematic accuracy, with the per-class user's accuracy expressing the error in detecting the water surface for each classification approach in 2013 and 2014. The second component of the volume estimation was the representation of the submerged relief: a digital terrain model (DTM) was built from two data sources, topographic data from a bathymetric survey made available by Sabesp, complemented by the AW3D30 surface model (ALOS World 3D, 30 m mesh) for the area above elevation 830.13 m, where Sabesp data were not available. The comparison of the two approaches for classifying land cover around the Jaguari-Jacareí reservoir showed that SVM produced slightly higher accuracy indicators than OBIA for both 2013 and 2014. In the volume estimation incorporating the water level reported by Sabesp, the SVM approach showed the smaller relative discrepancy in 2013, whereas OBIA was more accurate in 2014. Regarding the quality of the volume estimates, obtained by propagating the variance associated with the data involved in the process, both approaches produced similar uncertainty values, with OBIA marginally better in some of the evaluated scenarios. Overall, the classification methods used in this dissertation produced information that is accurate and adequate for monitoring water resources, with SVM performing subtly better in land-cover classification, in the volume estimation, and in some of the scenarios considered in the uncertainty propagation.
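The volume computation described in this abstract (a classified water mask combined with a DTM and a water level) can be illustrated with a minimal sketch. The grid values, cell size and function names below are invented for the example, not taken from the dissertation:

```python
import numpy as np

def reservoir_volume(water_mask: np.ndarray, dtm: np.ndarray,
                     water_level: float, cell_area_m2: float) -> float:
    """Volume in m^3: water-column depth summed over classified water cells."""
    depth = water_level - dtm                  # depth per grid cell
    depth[(~water_mask) | (depth < 0)] = 0.0   # ignore dry or misclassified cells
    return float(depth.sum() * cell_area_m2)

# Toy 3x3 terrain grid (m) and a mask standing in for an OBIA/SVM water class;
# 5 m cells match the RapidEye resolution, and the illustrative level of
# 830.13 m is a figure borrowed from the abstract.
dtm = np.array([[829.0, 828.5, 830.5],
                [828.0, 827.5, 829.9],
                [830.2, 829.0, 828.8]])
mask = dtm < 830.13
print(f"{reservoir_volume(mask, dtm, 830.13, cell_area_m2=25.0):.1f} m^3")
```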
242

Big Data Validation

Rizk, Raya January 2018 (has links)
With the explosion in the usage of big data, the stakes are high for companies to develop workflows that translate the data into business value. These data transformations are continuously updated and refined to meet evolving business needs, so it is imperative to ensure that a new version of a workflow still produces the correct output. This study focuses on the validation of big data in a real-world scenario and implements a validation tool that compares two databases holding the results produced by different versions of a workflow, using row-based and column-based statistics to detect and prevent unwanted alterations. The tool was shown to provide accurate results in test scenarios, giving leverage to companies that need to validate the outputs of their workflows. In addition, automating the process eliminates the risk of human error and is faster than the labour-intensive manual alternative. Together, this allows for a more agile way of updating data-transformation workflows by shortening the turnaround time of the validation process.
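As a hedged illustration of the row-based and column-based comparison the abstract mentions, the sketch below contrasts two versions of a result table; the thesis tool itself is not described in detail here, so the function names, statistics and business key are assumptions:

```python
import pandas as pd

def column_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column summary used to spot drift between workflow versions."""
    return pd.DataFrame({
        "count": df.count(),
        "nulls": df.isna().sum(),
        "distinct": df.nunique(),
        "mean": df.mean(numeric_only=True),
    })

def validate(old: pd.DataFrame, new: pd.DataFrame, key: str) -> dict:
    drift = column_stats(old).subtract(column_stats(new), fill_value=0)
    # Row-based check: hash each row and compare versions on the business key.
    old_h = old.set_index(key).apply(lambda r: hash(tuple(r)), axis=1)
    new_h = new.set_index(key).apply(lambda r: hash(tuple(r)), axis=1)
    mismatch = old_h.ne(new_h.reindex(old_h.index))
    return {"column_drift": drift,
            "changed_rows": mismatch[mismatch].index.tolist()}

old = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
new = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 25.0, 30.0]})
print(validate(old, new, key="id")["changed_rows"])  # -> [2]
```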
243

Plateforme visuelle pour l'intégration de données faiblement structurées et incertaines / A visual platform to integrate poorly structured and unknown data

Da Silva Carvalho, Paulo 19 December 2017 (has links)
We hear a lot about Big Data, Open Data, Social Data, Scientific Data, and so on; the importance currently given to data in general is very high, and analyzing these data matters if the objective is to extract value from them so that they can be used. The work presented in this thesis concerns the understanding, assessment, correction/modification, management and, finally, the integration of data, in order to allow their exploitation and reuse. Our research focuses exclusively on Open Data and, more precisely, on Open Data organized in tabular form (CSV, one of the most widely used formats in the Open Data domain). The term Open Data first appeared in 1995, when the GCDIS group (Global Change Data and Information System, United States) used the expression to encourage entities with the same interests and concerns to share their data [Data et System, 1995]. The Open Data movement has only recently undergone a sharp increase, however, becoming a popular phenomenon all over the world; being recent, it is a field that is still growing strongly, and the encouragement given by governments and public institutions to publish their data openly undoubtedly plays an important role at this level.
244

Factors influencing the quality of data for tuberculosis control programme in Oshakati District, Namibia

Kagasi, Linda Vugutsa 11 1900 (has links)
This study investigated factors influencing the quality of data for the Tuberculosis (TB) control programme in Oshakati District in Namibia. A quantitative, cross-sectional descriptive survey was conducted among 50 nurses sampled from five departments in Oshakati State Hospital. Data were collected by means of a self-administered questionnaire. The results indicated that the majority (90%) of the respondents agreed that TB training improved correct recording and reporting. Sixty percent of the respondents agreed that TB training influenced the rate of incomplete records in the unit, while 26% disagreed with this statement. This indicates that TB training influences the quality of data reported in the TB programme, as it affects correct recording and completeness of data at the operational level. Participants' knowledge of the TB control guidelines, in particular the use of TB records to capture the core TB indicators, influenced the quality of data in the programme. The attitudes and practices of respondents affected implementation of the TB guidelines, thereby influencing the quality of data in the programme. The findings related to the influence of data quality in the TB programme on decision-making demonstrated a positive relationship (p = 0.0023) between participants' attitudes and the use of the collected data for decision-making. Knowledge, attitudes and practice are the main factors influencing the quality of data in the TB control programme in Oshakati District. / Health Studies / M.A. (Public Health)
245

Modelo de procedência para auxiliar na análise da qualidade do dado geográfico / A provenance model to support the analysis of geographic data quality

Santos, Renata Ribeiro dos 09 August 2016 (has links)
No funding received. / The quality of geographic data should be a relevant concern for providers and consumers of this type of data, because the manipulation and analysis of low-quality geographic data may result in errors that propagate into the data derived from them. It is therefore important to properly document the information that certifies the quality of geographic data. In order to provide a minimum set of metadata for this purpose, this dissertation presents an approach based on the provenance of geographic data, i.e. the information about the history of the data, from its origin to the processes that resulted in its current state. To this end, a provenance model called ProcGeo was proposed, defining a minimum set of metadata that should be considered in the analysis of the quality of a given geographic dataset. Although some works and geographic metadata standards, such as the Federal Geographic Data Committee (FGDC) standard and ISO 19115, consider provenance information in the analysis of geographic data quality, in the author's view some metadata important for this purpose are not adequately contemplated. A prototype interface called ProcGeoInter was also implemented, aiming to guarantee completeness and correctness in filling out the metadata defined in the ProcGeo model, as well as the visualization of their content. The ProcGeo model and the ProcGeoInter interface were validated through tests and questionnaires applied to providers and consumers of geographic data. For comparison, the metadata entry and visualization interface available in the Quantum GIS software (Metatools plugin), which implements the FGDC geographic metadata standard, was used. The results indicated that the metadata defined in the ProcGeo model helped geographic data providers describe the provenance of their data, when compared to the metadata defined in the FGDC standard. From the consumers' perspective, the information filled out in the metadata defined by ProcGeo favoured the analysis of the quality of the consumed data. It became clear that neither providers nor consumers are in the habit of providing or consuming the information prescribed by the FGDC and ISO 19115 geographic metadata standards.
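To make the idea of a minimum provenance metadata set concrete, here is an illustrative sketch of a lineage record of the kind such a model defines; the field names are assumptions for the example, not the actual ProcGeo metadata:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProcessStep:
    description: str   # e.g. "reprojection to SIRGAS 2000 / UTM zone 23S"
    software: str      # tool and version that executed the step
    date: str          # ISO 8601 date of execution

@dataclass
class ProvenanceRecord:
    source: str                # original source: survey, sensor, map sheet...
    scale_or_resolution: str   # e.g. "1:50,000" or "30 m"
    acquisition_date: str
    steps: List[ProcessStep] = field(default_factory=list)

    def is_complete(self) -> bool:
        """The kind of completeness check an entry interface could enforce."""
        return all([self.source, self.scale_or_resolution,
                    self.acquisition_date]) and bool(self.steps)

record = ProvenanceRecord("IBGE topographic sheet", "1:50,000", "2010-06-15")
record.steps.append(ProcessStep("vectorization of contour lines",
                                "QGIS 2.14", "2016-03-02"))
print(record.is_complete())  # True
```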
246

Amélioration de la qualité des données produits échangées entre l'ingénierie et la production à travers l'intégration de systèmes d'information dédiés / Quality Improvement of product data exchanged between engineering and production through the integration of dedicated information systems.

Ben Khedher, Anis 27 February 2012 (has links)
This thesis contributes to improving the quality of the data exchanged between production and the engineering units dedicated to the design of the product and of the associated production system. This improvement is pursued by studying the interactions between product lifecycle management and production management. Since these two concepts are supported, wholly or partly, by industrial information systems, the study of their interactions then led to the integration of the corresponding systems (PLM, ERP and MES). In a context of strong competition and globalization, companies are forced to innovate and to reduce costs, especially production costs. Facing these challenges, the volume of production data and its frequency of modification keep increasing, owing to the steady reduction of product lifetimes and time-to-market, increasing product customization, and the generalization of continuous-improvement initiatives in production. The direct consequence is the need to formalize and manage all the production data to be supplied to production operators and machines. After an analysis of each existing architecture from the point of view of data quality, demonstrating their inability to address this problem, an architecture based on the integration of the three information systems directly involved in production (PLM, ERP and MES) was proposed. This architecture leads to two complementary sub-problems: first, the construction of an architecture based on Web services that improves the accessibility, security and completeness of the exchanged data; second, the construction of an ontology-based integration architecture that offers semantic integration mechanisms, in order to ensure the correct interpretation of the exchanged data. Finally, a prototype of the software tool supporting the proposed solution, ensuring the integration of the data exchanged between engineering and production, was implemented.
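The semantic-integration idea can be hinted at with a small sketch: records from two systems are translated into a shared vocabulary before exchange, so both sides interpret the fields the same way. The terms and mappings below are invented for illustration; the thesis itself works with full ontologies and Web services:

```python
# Hypothetical field vocabularies for an ERP and an MES, mapped to a shared model.
ERP_TO_SHARED = {"item_no": "product_id", "qty": "quantity", "due": "due_date"}
MES_TO_SHARED = {"part_ref": "product_id", "count": "quantity", "deadline": "due_date"}

def to_shared(record: dict, mapping: dict) -> dict:
    """Translate a system-specific record into the shared vocabulary."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

erp_order = {"item_no": "A-102", "qty": 50, "due": "2012-03-01"}
print(to_shared(erp_order, ERP_TO_SHARED))
# -> {'product_id': 'A-102', 'quantity': 50, 'due_date': '2012-03-01'}
```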
247

Avaliação da qualidade do Sistema  de Informação de Registro de Óbitos Hospitalares (SIS-ROH), Hospital Central da Beira, Moçambique / Assessment of the Quality of the Information System of Hospital Death Registration (SIS-ROH), Beira Central Hospital, Mozambique

Edina da Rosa Durão Mola 24 February 2016 (has links)
Mortality information is useful for assessing the health status of a population, and reliable mortality data produced by a national health information system are an important tool for health planning. In many countries, especially developing ones, the mortality information system is still precarious. Despite Mozambique's efforts to improve mortality statistics, challenges remain in terms of information technology, the technical capacity of human resources, and statistical production. SIS-ROH is a national electronic system for registering hospital deaths, implemented in 2008, with coverage of only 4% of all annual deaths in the country. Despite being a national system, it currently operates in only some health units (US), including Beira Central Hospital (HCB). Given the importance of this system for monitoring the mortality pattern of the HCB and, more generally, of the city of Beira, this study evaluates the quality of the HCB's SIS-ROH. It is a descriptive study of the completeness, coverage, agreement and consistency of SIS-ROH data, analysing 3,009 deaths of children under five years of age that occurred between 2010 and 2013 and were registered in SIS-ROH, together with a sample of 822 death certificates (COs) of fetal deaths and deaths of children under five from the HCB. SIS-ROH showed coverage below 50% when calculated against the mortality estimates of the National Survey of Causes of Death (INCAM). Two different CO forms (the former and the current model) were found in use for registering deaths in 2013. Completeness was excellent for most SIS-ROH variables. Of the 25 CO variables analysed, the situation was as follows: 9 showed very poor completeness, relating to the identification of the deceased (type of death and age), to block V, whose maternal data must be completed for fetal deaths and deaths under one year of age (schooling, usual occupation, number of children born alive and dead, length of gestation), and to the conditions and causes of death (autopsy and intermediate cause code); 3 variables showed poor completeness, relating to the identification of the deceased (NID) and to the conditions and causes of death (intermediate cause description and underlying cause code); 9 showed fair completeness, relating to the identification of the deceased (date of birth and age), to block V (mother's age, type of pregnancy, mode of delivery, birth weight of the fetus/baby, death of the fetus/baby relative to delivery) and to the conditions and causes of death (direct cause code and underlying cause description); 2 showed good completeness, relating to the identification of the deceased (sex and race/color); and 2 showed excellent completeness, relating to the place of occurrence of death (date of admission and date of death or disappearance of the corpse). Some SIS-ROH and CO variables showed inconsistencies, and there was a lack of agreement on the direct cause of death between SIS-ROH and the COs. Conclusion: Mozambique has made efforts to improve its mortality statistics, but quality gaps remain; routine data analysis can identify these gaps and support their correction.
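The completeness analysis described here is, at its core, the share of filled-in values per variable, graded into bands such as excellent, good, fair, poor and very poor. A minimal sketch follows; the records and cut-off values are illustrative, and the dissertation defines its own variables and thresholds:

```python
from typing import List, Optional

def completeness(values: List[Optional[str]]) -> float:
    """Percentage of non-missing values for one certificate variable."""
    filled = sum(1 for v in values if v not in (None, ""))
    return 100.0 * filled / len(values)

def grade(pct: float) -> str:
    # Illustrative cut-offs; the study's own completeness bands may differ.
    if pct >= 95: return "excellent"
    if pct >= 90: return "good"
    if pct >= 80: return "fair"
    if pct >= 50: return "poor"
    return "very poor"

# Two toy certificates, with a missing birth date on the first one.
certificates = [{"sex": "M", "birth_date": None},
                {"sex": "F", "birth_date": "2012-01-03"}]
for var in ("sex", "birth_date"):
    pct = completeness([c.get(var) for c in certificates])
    print(var, f"{pct:.0f}%", grade(pct))
```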
248

Arquitetura e métodos de integração de dados e interoperabilidade aplicados na saúde mental / Investigation of the effectiveness of data integration and interoperability methods applied to mental health

Newton Shydeo Brandão Miyoshi 16 March 2018 (has links)
The availability and integration of health information about the same patient across different care levels or different health institutions is usually incomplete or non-existent, mainly because the information systems supporting health professionals are not interoperable; this also hinders the management of services at the municipal and regional levels. This fragmentation of information is particularly challenging in mental health care, which typically requires long-term care integrating different types of health services. Problems such as poor quality and unavailability of information, as well as duplicate records, are important aspects in the management and long-term care of patients with mental disorders. Despite this, there are still no objective studies demonstrating the actual impact of interoperability and data integration on management and data quality in the mental health area. Objectives: In this context, the project proposes an interoperability architecture for regionalized health care and evaluates the effectiveness of data-integration and interoperability techniques for managing mental health consultations and hospitalizations in the Ribeirão Preto region, as well as their impact on data availability and quality, measured through well-defined metrics. Methods: The proposed interoperability framework is based on a layered client-server architecture. The interoperability information model was based on international and national health standards. A terminology server based on health information standards was proposed, and record-linkage algorithms were used to guarantee unambiguous patient identification. To test and validate the proposal, data from different levels of health care provided by the psychosocial care network in the Ribeirão Preto region were used, extracted from five different sources: (i) the Family Health Unit I of Santa Cruz da Esperança; (ii) the Center for Integrated Health Care of Santa Rita do Passa Quatro; (iii) Santa Tereza Hospital; (iv) the hospitalization requests recorded in SISAM (Mental Health Information System); and (v) demographic data from the National Health Card service bus of the Brazilian Ministry of Health. The data-quality metrics used were completeness, consistency, duplicity and accuracy. Results: As a result of this work, the health interoperability platform eHealth-Interop was designed, developed and tested. Interoperability is provided through web services, with a data-integration model based on a centralizing database. A terminology server, the eHealth-Interop Terminology Server, was also developed; it can be used as an independent component and in other medical contexts. In total, 31,340 patient records were obtained from SISAM, from e-SUS AB of Santa Cruz da Esperança, from the CAIS of Santa Rita do Passa Quatro, from Santa Tereza Hospital, and from the CNS service bus of the Ministry of Health. Of this total, 30.47% (9,548) of the records were identified as present in more than one information source, with different levels of accuracy and completeness. The data-quality analysis, covering all integrated records, showed an improvement in average completeness of 18.40% (from 56.47% to 74.87%) and in average syntactic accuracy of 1.08% (from 96.69% to 96.77%). The consistency analysis showed improvements in all information sources, ranging from a minimum of 14.4% to a maximum of 51.5%. With the record-linkage module it was possible to quantify 1,066 duplicate records, of which 226 were verified manually. Conclusions: The availability and quality of information are both important for continuity of care and for the management of health services. The solution proposed in this work establishes a computational model to fill this gap; the interoperability environment was able to integrate information in the mental health use case, with support for international and national clinical terminologies, and is flexible enough to be extended to other health care domains.
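A minimal sketch of the record-linkage step mentioned in the methods may help: candidate record pairs are scored on normalized identifying fields and flagged as likely duplicates above a threshold. The fields, weights and threshold below are assumptions for illustration, not the platform's actual algorithm:

```python
import unicodedata

def norm(s: str) -> str:
    """Lowercase and strip accents so 'João' matches 'joao'."""
    s = unicodedata.normalize("NFKD", s.lower())
    return "".join(c for c in s if not unicodedata.combining(c))

def match_score(a: dict, b: dict) -> float:
    """Weighted agreement on identifying fields (illustrative weights)."""
    score = 0.0
    if norm(a["name"]) == norm(b["name"]):
        score += 0.6
    if a["birth_date"] == b["birth_date"]:
        score += 0.3
    if a.get("mother_name") and norm(a["mother_name"]) == norm(b.get("mother_name", "")):
        score += 0.1
    return score

r1 = {"name": "João da Silva", "birth_date": "1980-05-01", "mother_name": "Maria"}
r2 = {"name": "Joao da Silva", "birth_date": "1980-05-01", "mother_name": "maria"}
print(match_score(r1, r2) >= 0.8)  # True -> treat as the same patient
```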
249

Evaluating Quality of Online Behavior Data

Berg, Marcus January 2013 (has links)
This thesis has two purposes: emphasizing the importance of data quality in Big Data, and identifying and evaluating potential error sources in JavaScript tracking (a client-side, on-site clickstream method for collecting online behavior data, commonly used in web analytics). The importance of data quality in Big Data is emphasized through the evaluation of JavaScript tracking. The Total Survey Error framework is applied to JavaScript tracking, and 17 non-sampling error sources are identified and evaluated. The bias imposed by these error sources varies from large to small, but the major takeaway is the large number of error sources actually identified. More work is needed; Big Data has much to gain from quality work. Similarly, there is much that can be done with statistics in web analytics.
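One of the error sources such a framework covers, coverage error, can be illustrated numerically: if visitors who block JavaScript are never observed and behave differently, observed metrics are biased. The numbers below are invented for the toy example:

```python
# Toy numbers, not data from the thesis: coverage error in JavaScript
# tracking when script-blocking visitors are never observed.
blocked_share = 0.12                       # assumed share of untracked visitors
conv_tracked, conv_blocked = 0.030, 0.015  # assumed conversion rates

observed = conv_tracked
actual = (1 - blocked_share) * conv_tracked + blocked_share * conv_blocked
print(f"observed {observed:.3%} vs actual {actual:.3%}, "
      f"bias {observed - actual:+.3%}")    # tracking overstates conversion
```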
250

Porovnatelnost dat v dobývání znalostí z databází / Data comparability in knowledge discovery in databases

Horáková, Linda January 2017 (has links)
The master thesis focuses on the analysis of data comparability and commensurability in datasets that are used for obtaining knowledge with data-mining methods. Data comparability is one of the aspects of data quality that is crucial for correct and applicable results of data-mining tasks. The aim of the theoretical part of the thesis is to briefly describe the field of knowledge discovery and to define the specifics of mining aggregated data; the terms comparability and commensurability are also discussed, and the main part focuses on the process of knowledge discovery. These findings are applied in the practical part of the thesis, whose main goal is to define a general methodology that can be used to discover potential problems of data comparability in analyzed data. This methodology is based on the analysis of a real dataset containing daily sales of products. In conclusion, the methodology is applied to data from the field of public budgets.
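One concrete comparability check such a methodology might include is flagging scale breaks in a daily series (for instance, a silent switch from units sold to revenue), as in the sketch below; the window size and threshold are arbitrary illustration values:

```python
def scale_breaks(series, window=7, ratio=5.0):
    """Return start indices where the mean level of adjacent windows
    differs by a factor of at least `ratio` (a comparability red flag)."""
    breaks = []
    for i in range(window, len(series) - window + 1, window):
        prev = sum(series[i - window:i]) / window
        curr = sum(series[i:i + window]) / window
        if min(prev, curr) > 0 and max(curr / prev, prev / curr) >= ratio:
            breaks.append(i)
    return breaks

# Two weeks of "daily sales": the second week was recorded in a different
# unit, so the halves are not comparable.
sales = [12, 9, 11, 10, 13, 8, 12, 1150, 980, 1210, 1040, 1330, 900, 1100]
print(scale_breaks(sales))  # -> [7]
```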
