231

Arquitetura e métodos de integração de dados e interoperabilidade aplicados na saúde mental / Investigation of the effectiveness of data integration and interoperability methods applied to mental health

Miyoshi, Newton Shydeo Brandão 16 March 2018 (has links)
The availability and integration of health information about the same patient across different care levels or different health institutions is usually incomplete or non-existent. This happens mainly because the information systems that support health professionals are not interoperable, which also hinders the management of services at the municipal and regional levels. This fragmentation of information is also challenging and worrying in the area of mental health, where care is typically long-term and integrates different types of health services. Problems such as poor quality and unavailability of information, as well as duplicate records, are important aspects in the management and long-term care of patients with mental disorders. Despite this, there are still no objective studies demonstrating the effective impact of interoperability and data integration on management and data quality in the mental health area. Objectives: In this context, the general objective of the project is to propose an interoperability architecture for regionalized health care and to evaluate the effectiveness of data integration and interoperability techniques for the management of mental health consultations and hospitalizations in the Ribeirão Preto region, as well as the impact on data availability and improvement through well-defined metrics. Methods: The proposed interoperability framework is based on a layered client-server architecture. The interoperability information model was based on international and national health standards. A terminology server based on health information standards was proposed. Record Linkage algorithms were also used to guarantee unambiguous patient identification. To test and validate the proposal, data from different health care levels were used, originating from care provided by the psychosocial care network in the Ribeirão Preto region. The data were extracted from five different sources: (i) the Family Health Unit I of Santa Cruz da Esperança; (ii) the Center for Integrated Health Care (CAIS) of Santa Rita do Passa Quatro; (iii) Santa Tereza Hospital; (iv) hospitalization request records from SISAM (Mental Health Information System); and (v) demographic data from the National Health Card Service Bus (Barramento do Cartão Nacional de Saúde) of the Brazilian Ministry of Health. The data quality metrics used were completeness, consistency, duplicity, and accuracy. Results: As a result of this work, the health interoperability platform eHealth-Interop was designed, developed, and tested. Interoperability was implemented through web services, with a data integration model based on a centralizing database. A terminology server, the eHealth-Interop Terminology Server, was also developed; it can be used as an independent component and in other medical contexts. In total, 31,340 patient records were obtained from SISAM, from the e-SUS AB of Santa Cruz da Esperança, from the CAIS of Santa Rita do Passa Quatro, from Santa Tereza Hospital, and from the CNS Service Bus of the Brazilian Ministry of Health. Of this total, 30.47% (9,548) of the records were identified as present in more than one information source, with different levels of accuracy and completeness. The data quality analysis, covering all integrated records, showed an improvement in average completeness of 18.40% (from 56.47% to 74.87%) and in average syntactic accuracy of 1.08% (from 96.69% to 96.77%). The consistency analysis showed improvements in all information sources, ranging from a minimum of 14.4% to a maximum of 51.5%. With the Record Linkage module it was possible to quantify 1,066 duplicate records, of which 226 were verified manually. Conclusions: The availability and quality of information are important aspects for continuity of care and for the management of health services. The solution proposed in this work aims to establish a computational model to fill this gap. The interoperability environment was able to integrate information in the mental health use case, with support for international and national clinical terminologies, and is flexible enough to be extended to other health care domains.
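The abstract mentions completeness metrics and Record Linkage for unique patient identification but gives no implementation detail. The sketch below is a minimal illustration of both ideas in Python; the field names, the matching rule (normalized name plus date of birth), and the helper functions are assumptions made for illustration, not the eHealth-Interop implementation.

```python
import unicodedata

MANDATORY_FIELDS = ["name", "birth_date", "mother_name", "cns_number", "municipality"]

def completeness(record: dict) -> float:
    """Fraction of mandatory fields that are present and non-empty."""
    filled = sum(1 for f in MANDATORY_FIELDS if record.get(f) not in (None, ""))
    return filled / len(MANDATORY_FIELDS)

def normalize(text: str) -> str:
    """Strip accents, case and extra spaces so 'João' and 'JOAO ' compare equal."""
    text = unicodedata.normalize("NFKD", text or "")
    text = "".join(c for c in text if not unicodedata.combining(c))
    return " ".join(text.lower().split())

def same_patient(a: dict, b: dict) -> bool:
    """Deterministic linkage rule: identical normalized name and birth date."""
    return (normalize(a.get("name", "")) == normalize(b.get("name", ""))
            and a.get("birth_date") == b.get("birth_date"))

# Example: two records of the same patient coming from different sources.
r1 = {"name": "JOÃO DA SILVA", "birth_date": "1980-05-02", "cns_number": "898001160660000"}
r2 = {"name": "Joao da Silva ", "birth_date": "1980-05-02", "mother_name": "Maria da Silva"}

print(completeness(r1), completeness(r2))  # per-source completeness before integration
print(same_patient(r1, r2))                # True -> candidate duplicate to merge or verify
```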
232

Kunskapsskillnaderna mellan IT och Redovisning och dess påverkan på redovisningsdatakvalitet : en kvalitativ studie på ett av de största bemanningsföretagen i Sverige och i världen / The knowledge gap between IT and Accounting and its impact on accounting data quality : a qualitative study at one of the largest staffing companies in Sweden and in the world

Homanen, Malin, Karlsson, Therese January 2019 (has links)
The inevitable dependence on digitization and IT systems in today's operations and organizations demands that the current workforce increase its IT skills in order to integrate and communicate with new computer systems for a more efficient business. This is equally important for financial accountants, who are responsible for the business's financial reporting, since they must be able to ensure that the accounting data produced and delivered using IT is correct and of high quality. A lack of IT skills increases the risk that errors in accounting data go undetected and thus affect accounting data quality. This, in turn, risks affecting the accounting quality of the final financial reporting. Communication between the departments can also suffer, since the knowledge gap between them makes it difficult for them to understand each other. The aim of the study is to contribute knowledge about how differences in basic digital knowledge can affect the work of ensuring accounting data quality, and to give insight into how this work can be carried out in practice. With the help of previous research, an analysis model was developed that illustrates the identified factors and their order of influence on accounting data quality: knowledge gaps → internal control → accounting data quality. The study applies an instrumental case study design with a qualitative research approach. Two focus group interviews were conducted on two different occasions with respondents from the accounting department and the IT department of the same company. The data were transcribed and coded using color coding to clarify the factors that form the basis of the analysis model. A survey of the remaining employees in each department was conducted to complement and confirm the results from the interviews. The results of the study showed that the knowledge differences have little or no direct impact on accounting data quality; rather, they affect internal control, together with external factors that came to light during the analysis. A revised analysis model was developed based on the results and replaced the initial hypothetical model.
233

Use of the CIM framework for data management in maintenance of electricity distribution networks

Nordström, Lars January 2006 (has links)
Aging infrastructure and personnel, combined with stricter financial constraints, have put maintenance, or, more popularly, Asset Management, at the top of the agenda for most power utilities. At the same time, the industry reports that this area is not properly supported by information systems. Today's power utilities have very comprehensive and complex portfolios of information systems that serve many different purposes. A common problem in such heterogeneous system architectures is data management, e.g. data in the systems do not represent the true status of the equipment in the power grid, or several sources of data contradict each other. The research presented in this thesis concerns how this industrial problem can be better understood and approached by novel use of the ontology standardized in the Common Information Model (CIM) defined in IEC standards 61970 & 61968. The theoretical framework for the research is that of data management using ontology-based frameworks. This notion is not new, but it is receiving renewed attention due to emerging technologies, e.g. Service Oriented Architectures, that support implementation of such ontological frameworks. The work presented is empirical in nature and takes its origin in the ontology available in the Common Information Model. The scope of the research is the applicability of the CIM ontology, not as it was intended, i.e. in systems integration, but for analysis of business processes, legacy systems and data. The work has involved significant interaction with power distribution utilities in Sweden in order to validate the framework developed around the CIM ontology. Results from the research have been published continuously; this thesis consists of an introduction and summary together with papers describing the main contributions of the work. The main contribution is the validation of the proposition to use the CIM ontology as a basis for analyzing existing legacy systems. By using the data models defined in the standards and combining them with established modeling techniques, we propose a framework for information system management. The framework is appropriate for analyzing data quality problems related to power system maintenance at power distribution utilities. As part of validating the results, the proposed framework has been applied in a case study involving medium-voltage overhead line inspection. In addition to the main contribution, a classification of the state-of-the-practice system support for power system maintenance at utilities has been created. Second, the work includes an analysis and classification of how high-performance wide-area communication technologies can be used to improve power system maintenance, including improving data quality.
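The abstract describes using the CIM ontology as a reference model against which legacy maintenance data can be analyzed, without giving implementation detail. Below is a minimal, hypothetical sketch of that idea in Python: legacy inspection records are mapped onto a few CIM-style classes (Asset, ACLineSegment and Location are class names taken from IEC 61970/61968, while the attribute subset and the legacy field mapping are illustrative assumptions), and fields that cannot be mapped or that are missing are flagged as data quality issues.

```python
from dataclasses import dataclass
from typing import Optional

# Minimal CIM-style classes; attributes are a small illustrative subset, not the full standard.
@dataclass
class Location:
    mainAddress: Optional[str] = None

@dataclass
class Asset:
    mRID: Optional[str] = None            # master resource identifier
    installationDate: Optional[str] = None
    location: Optional[Location] = None

@dataclass
class ACLineSegment:
    mRID: Optional[str] = None
    length: Optional[float] = None        # metres
    asset: Optional[Asset] = None

# Assumed mapping from a legacy inspection-system export to CIM attributes.
LEGACY_TO_CIM = {
    "line_id": "ACLineSegment.mRID",
    "line_len_km": "ACLineSegment.length",
    "asset_no": "Asset.mRID",
    "installed": "Asset.installationDate",
    "addr": "Location.mainAddress",
}

def analyze_legacy_record(record: dict) -> list[str]:
    """Report legacy fields with no CIM counterpart and CIM attributes left unpopulated."""
    issues = [f"unmapped legacy field: {k}" for k in record if k not in LEGACY_TO_CIM]
    issues += [f"missing value for {cim}" for k, cim in LEGACY_TO_CIM.items()
               if record.get(k) in (None, "")]
    return issues

legacy = {"line_id": "L-1042", "line_len_km": 3.1, "asset_no": "", "inspector": "JB"}
print(analyze_legacy_record(legacy))
# ['unmapped legacy field: inspector', 'missing value for Asset.mRID',
#  'missing value for Asset.installationDate', 'missing value for Location.mainAddress']
```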
234

An Effective Implementation of Operational Inventory Management

Sellamuthu, Sivakumar 16 January 2010 (has links)
This Record of Study describes the Doctor of Engineering (DE) internship experience at the Supply Chain Systems Laboratory (SCSL) at Texas A&M University. The objective of the internship was to design and develop automation tools to streamline lab operations related to inventory management projects and, in the process, to adapt and/or extend theoretical inventory models to account for real-world business complexity and data integrity problems. A holistic approach to automation was taken to satisfy both short-term and long-term needs subject to organizational constraints. A comprehensive software productivity tool was designed and developed that considerably reduced the time and effort spent on non-value-adding activities. This resulted in standardizing and streamlining data analysis activities. Real-world factors that significantly influence the data analysis process were identified and incorporated into the model specifications. This helped develop an operational inventory management model that accounts for the business complexity and data integrity issues commonly encountered during implementation. Many organizational issues, including new business strategies, human resources, administration, and project management, were also addressed during the course of the internship.
235

Datenqualität in Sensordatenströmen / Data Quality in Sensor Data Streams

Klein, Anja 23 March 2010 (has links) (PDF)
The steady development of intelligent sensor systems enables the automation and improvement of complex process and business decisions in a wide variety of application scenarios. Sensors can be used, for example, to determine optimal maintenance dates or to control production lines. A fundamental problem here is sensor data quality, which is limited by environmental influences and sensor failures. The goal of this thesis is the development of a data quality model that provides applications and data consumers with quality information for a comprehensive assessment of uncertain sensor data. In addition to data structures for efficient data quality management in data streams and databases, a comprehensive data quality algebra for computing the quality of data processing results is presented. Furthermore, methods for data quality improvement are developed that are specifically adapted to the requirements of sensor data processing. The work is completed by approaches for user-friendly data quality querying and visualization.
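The abstract refers to a data quality algebra that computes the quality of processing results from the quality of the inputs. As a rough illustration of that idea, the Python sketch below carries a completeness indicator through a windowed-average operator; the specific rule (output completeness as the mean window completeness scaled by the share of non-missing readings) is an assumption chosen for illustration, not the algebra defined in the thesis.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Reading:
    value: Optional[float]   # None models a dropped or faulty sensor sample
    completeness: float      # per-item quality indicator in [0, 1]

def windowed_average(window: Sequence[Reading]) -> Reading:
    """Average a window of sensor readings and derive the quality of the result."""
    present = [r for r in window if r.value is not None]
    if not present:
        return Reading(value=None, completeness=0.0)
    value = sum(r.value for r in present) / len(present)
    # Propagation rule (assumed): mean input completeness, scaled by the fraction of samples present.
    completeness = (sum(r.completeness for r in window) / len(window)) * (len(present) / len(window))
    return Reading(value, completeness)

window = [Reading(21.4, 1.0), Reading(None, 0.0), Reading(21.9, 1.0), Reading(22.1, 0.8)]
print(windowed_average(window))   # averaged value ≈ 21.8, completeness ≈ 0.525
```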
236

Dealing with unstructured data : A study about information quality and measurement / Hantera ostrukturerad data : En studie om informationskvalitet och mätning

Vikholm, Oskar January 2015 (has links)
Many organizations have realized that the growing amount of unstructured text may contain information that can be used for different purposes, such as making decisions. By using so-called text mining tools, organizations can extract information from text documents. Within military and intelligence activities, for example, it is important to go through reports and look for entities such as names of people, events, and the relationships between them when criminal or other interesting activities are being investigated and mapped. This study explores how information quality can be measured and what challenges that involves. It is done on the basis of Wang and Strong's (1996) theory of how information quality can be measured. The theory is tested and discussed against empirical material consisting of interviews from two case organizations. The study identifies two important aspects to take into consideration when measuring information quality: context dependency and source criticism. Context dependency means that the context in which information quality is measured must be defined based on the consumer's needs. Source criticism implies that it is important to take the original source into consideration and how reliable it is. Further, data quality and information quality are often used interchangeably, which means that organizations need to decide what they really want to measure. One of the major challenges in developing software for entity extraction is that the system needs to understand the structure of natural language, which is very complicated.
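The study's point that information quality is context dependent can be made concrete with a small sketch: the same extracted item is scored against different consumer profiles that weight quality dimensions differently. The dimensions loosely follow Wang and Strong's categories, but the weights, fields, and scoring rules below are invented for illustration only.

```python
# Hypothetical illustration: scoring one extracted entity against consumer-specific
# weights over a few information quality dimensions (after Wang & Strong, 1996).
extracted_entity = {
    "name": "J. Smith",
    "source": "field report 2015-03-02",
    "source_reliability": 0.6,   # believability proxy in [0, 1]
    "fields_filled": 4,
    "fields_expected": 6,
}

def dimension_scores(item: dict) -> dict:
    return {
        "completeness": item["fields_filled"] / item["fields_expected"],
        "believability": item["source_reliability"],
        "traceability": 1.0 if item.get("source") else 0.0,
    }

# Different consumers weight the same dimensions differently (context dependency).
CONSUMER_WEIGHTS = {
    "intelligence_analyst": {"completeness": 0.2, "believability": 0.5, "traceability": 0.3},
    "records_clerk":        {"completeness": 0.6, "believability": 0.2, "traceability": 0.2},
}

scores = dimension_scores(extracted_entity)
for consumer, weights in CONSUMER_WEIGHTS.items():
    overall = sum(weights[d] * scores[d] for d in weights)
    print(f"{consumer}: {overall:.2f}")
# intelligence_analyst: 0.73
# records_clerk: 0.72
```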
237

Data Quality Assessment for Closed-Loop System Identification and Forecasting with Application to Soft Sensors

Shardt, Yuri Unknown Date
No description available.
238

Developing a New Mixed-Mode Methodology For a Provincial Park Camper Survey in British Columbia

Dyck, Brian Wesley 08 July 2013 (has links)
Park and resource management agencies are looking for less costly ways to undertake park visitor surveys. The use of the Internet is often suggested as a way to reduce the costs of these surveys. By itself, however, the use of the Internet for park visitor surveys faces a number of methodological challenges that include the potential for coverage error, sampling difficulties and nonresponse error. A potential way of addressing these challenges is the use of a mixed-mode approach that combines the use of the Internet with another survey mode. The procedures for such a mixed-mode approach, however, have not been fully developed and evaluated. This study develops and evaluates a new mixed-mode approach – a face-to-face/web response – for a provincial park camper survey in British Columbia. The five key steps of this approach are: (a) selecting a random sample of occupied campsites; (b) undertaking a short interview with potential respondents; (c) obtaining an email address at the end of the interview; (d) distributing a postcard to potential respondents that contains the website and an individual access code; and (e) undertaking email follow-ups with nonrespondents. In evaluating this new approach, two experiments were conducted during the summer of 2010. The first experiment was conducted at the Goldstream Provincial Park campground and was designed to compare a face-to-face/paper response to a face-to-face/web response for several sources of survey error and for costs. The second experiment was conducted at 12 provincial park campgrounds throughout British Columbia and was designed to examine the potential for coverage error and the effect of a number of email follow-ups on return rates, nonresponse error and the substantive results. Taken together, these experiments indicate: a low potential for coverage error (i.e., a 4% non-Internet-use rate); a high email collection rate for follow-ups (i.e., 99% at Goldstream; a combined rate of 88% for the 12 campgrounds); similar return rates between the paper mode (60%) and the web mode (59%); that two email follow-ups reduced nonresponse error for a key variable (i.e., geographic location of residence), but not for all variables; low item nonresponse for both mixed modes (about 1%); very few differences in the substantive results between each follow-up; and a 9% cost saving for the web mode. This study suggests that a face-to-face/web approach can provide a viable approach for undertaking park visitor surveys if there is high Internet coverage among park visitors.
239

Análise da qualidade da informação produzida por classificação baseada em orientação a objeto e SVM visando a estimativa do volume do reservatório Jaguari-Jacareí / Analysis of information quality in using OBIA and SVM classification to water volume estimation from Jaguari-Jacareí reservoir

Leão Junior, Emerson [UNESP] 25 April 2017 (has links)
This study extracts information from multispectral images and analyses the quality of that information as used in estimating the water volume of the Jaguari-Jacareí reservoir. The study of changes in the reservoir's volume was motivated by the critical situation of the Cantareira System reservoirs in São Paulo State caused by the 2014 water crisis. The reservoir's water surface was extracted, through land cover classification, from RapidEye multispectral images acquired before and during the water crisis (2013 and 2014, respectively). First, the image classification was carried out with two distinct approaches: object-based (Object-based Image Analysis, OBIA) and pixel-based (Support Vector Machine, SVM). Classification quality was evaluated through thematic accuracy, in which, for each technique, the user's accuracy expressed the error in detecting the water class in 2013 and 2014. Second, the volume of the reservoir's water body was estimated using a numerical terrain model generated from two additional data sources: topographic data from a bathymetric survey, made available by Sabesp, and the AW3D30 elevation model (to complement the information in the area where the Sabesp data were not available). Comparing the two classification techniques, SVM slightly outperformed OBIA for both 2013 and 2014. In the volume calculation considering the water level estimated from the generated DTM, the result obtained with the SVM approach was better in 2013, whereas the OBIA approach was more accurate in 2014. Considering the quality of the information produced in the volume estimation, both approaches presented similar values of uncertainty, with the OBIA method slightly less uncertain than SVM. In conclusion, the classification methods used in this dissertation produced information accurate enough to monitor water resources, with SVM performing subtly better in the classification of land cover types, in the volume estimation, and in some of the scenarios considered in the uncertainty propagation. Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
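The abstract combines pixel-based SVM classification of multispectral imagery with volume estimation over a terrain model. The sketch below shows the general shape of such a pipeline in Python with scikit-learn and NumPy; the band values, training labels, pixel size, and water level are placeholder numbers, and the thesis's actual processing chain (RapidEye preprocessing, the OBIA comparison, and the uncertainty propagation) is not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

# Toy training data: per-pixel spectral features (e.g. red and NIR bands) with labels
# 1 = water, 0 = non-water. Real work would use samples drawn from the RapidEye scenes.
X_train = np.array([[0.05, 0.02], [0.06, 0.03], [0.30, 0.45], [0.28, 0.50]])
y_train = np.array([1, 1, 0, 0])

clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

# Classify a small image (rows x cols x bands) and build a water mask.
image = np.random.rand(100, 100, 2) * 0.5
water_mask = clf.predict(image.reshape(-1, 2)).reshape(100, 100).astype(bool)

# Volume estimate: integrate water depth (level minus terrain elevation) over water pixels.
pixel_area_m2 = 5.0 * 5.0                             # assumed 5 m RapidEye-like pixel
dtm = np.random.uniform(815.0, 835.0, (100, 100))     # placeholder terrain elevations (m)
water_level_m = 830.0                                 # placeholder reservoir level
depth = np.clip(water_level_m - dtm, 0.0, None)
volume_m3 = float(np.sum(depth[water_mask]) * pixel_area_m2)
print(f"estimated volume: {volume_m3:,.0f} m^3")
```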
240

Big Data Validation

Rizk, Raya January 2018 (has links)
With the explosion in usage of big data, the stakes are high for companies to develop workflows that translate the data into business value. Those data transformations are continuously updated and refined in order to meet evolving business needs, and it is imperative to ensure that a new version of a workflow still produces the correct output. This study focuses on the validation of big data in a real-world scenario and implements a validation tool that compares two databases holding the results produced by different versions of a workflow, in order to detect and prevent potential unwanted alterations; row-based and column-based statistics are used to validate the two versions. The tool was shown to provide accurate results in test scenarios, giving leverage to companies that need to validate the outputs of their workflows. In addition, by automating this process, the risk of human error is eliminated, and it has the added benefit of improved speed compared to the more labour-intensive manual alternative. All this allows for a more agile way of performing updates on the data transformation workflows by improving the turnaround time of the validation process.
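The abstract describes comparing two databases that hold the outputs of two workflow versions using row-based and column-based statistics. A minimal sketch of such a comparison is shown below with pandas; the table layout and the particular statistics (row count, null count, mean) are assumptions chosen for illustration rather than the tool's actual checks.

```python
import pandas as pd

# Outputs of workflow version A and version B for the same input batch (toy data).
old = pd.DataFrame({"id": [1, 2, 3, 4], "amount": [10.0, 12.5, None, 7.0], "country": ["SE", "SE", "NO", "DK"]})
new = pd.DataFrame({"id": [1, 2, 3, 4], "amount": [10.0, 12.5, 3.3, 7.0], "country": ["SE", "SE", "NO", "DK"]})

def validate(old: pd.DataFrame, new: pd.DataFrame) -> list[str]:
    """Row-based and column-based checks; returns human-readable discrepancies."""
    issues = []
    if len(old) != len(new):                               # row-based: cardinality
        issues.append(f"row count changed: {len(old)} -> {len(new)}")
    for col in old.columns.intersection(new.columns):      # column-based statistics
        o_nulls, n_nulls = old[col].isna().sum(), new[col].isna().sum()
        if o_nulls != n_nulls:
            issues.append(f"{col}: null count changed {o_nulls} -> {n_nulls}")
        if pd.api.types.is_numeric_dtype(old[col]) and pd.api.types.is_numeric_dtype(new[col]):
            if abs(old[col].mean() - new[col].mean()) > 1e-9:
                issues.append(f"{col}: mean changed {old[col].mean():.3f} -> {new[col].mean():.3f}")
    return issues

for issue in validate(old, new):
    print(issue)
# amount: null count changed 1 -> 0
# amount: mean changed 9.833 -> 8.200
```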
