Global ETD Search

331	Modelos para análise de dados discretos longitudinais com superdispersão / Models for analysis of longitudinal discrete data in the presence of overdispersion Fernanda Bührer Rizzato 08 February 2012 Dados longitudinais na forma de contagens e na forma binária são muito comuns, os quais, frequentemente, podem ser analisados por distribuições de Poisson e de Bernoulli, respectivamente, pertencentes à família exponencial. Duas das principais limitações para modelar esse tipo de dados são: (1) a ocorrência de superdispersão, ou seja, quando a variabilidade dos dados não é adequadamente descrita pelos modelos, que muitas vezes apresentam uma relação pré-estabelecida entre a média e a variância, e (2) a correlação existente entre medidas realizadas repetidas vezes na mesma unidade experimental. Uma forma de acomodar a superdispersão é pela utilização das distribuições binomial negativa e beta binomial, ou seja, pela inclusão de um efeito aleatório com distribuição gama quando se considera dados provenientes de contagens e um efeito aleatório com distribuição beta quando se considera dados binários, ambos introduzidos de forma multiplicativa. Para acomodar a correlação entre as medidas realizadas no mesmo indivíduo podem-se incluir efeitos aleatórios com distribuição normal no preditor linear. Essas situações podem ocorrer separada ou simultaneamente. Molenberghs et al. (2010) propuseram modelos que generalizam os modelos lineares generalizados mistos Poisson-normal e Bernoulli-normal, incorporando aos mesmos a superdispersão. Esses modelos foram formulados e ajustados aos dados, usando-se o método da máxima verossimilhança. Entretanto, para um modelo de efeitos aleatórios, é natural pensar em uma abordagem Bayesiana. Neste trabalho, são apresentados modelos Bayesianos hierárquicos para dados longitudinais, na forma de contagens e binários que apresentam superdispersão. 
A análise Bayesiana hierárquica é baseada no método de Monte Carlo com Cadeias de Markov (MCMC) e para implementação computacional utilizou-se o software WinBUGS. A metodologia para dados na forma de contagens é usada para a análise de dados de um ensaio clínico em pacientes epilépticos e a metodologia para dados binários é usada para a análise de dados de um ensaio clínico para tratamento de dermatite. / Longitudinal count and binary data are very common and can often be analyzed with the Poisson and Bernoulli distributions, respectively, both members of the exponential family. Two of the main limitations in modeling such data are: (1) the occurrence of overdispersion, i.e., the phenomenon whereby variability in the data is not adequately captured by the model, and (2) the accommodation of data hierarchies owing to, for example, repeatedly measuring the outcome on the same subject. One way of accommodating overdispersion is to use the negative-binomial and beta-binomial distributions, in other words, to include a gamma-distributed random effect for count data and a beta-distributed random effect for binary data, both introduced multiplicatively. To accommodate the correlation between measurements made on the same individual, one can include normal random effects in the linear predictor. These situations can occur separately or simultaneously. Molenberghs et al. (2010) proposed models that generalize the Poisson-normal and Bernoulli-normal generalized linear mixed models by incorporating overdispersion. These models were formulated and fitted to the data using maximum likelihood estimation. However, these models lend themselves naturally to a Bayesian approach as well. In this work, we present Bayesian hierarchical models for longitudinal count and binary data in the presence of overdispersion. 
The hierarchical Bayesian analysis is based on Markov chain Monte Carlo (MCMC) methods, and the software WinBUGS is used for the computational implementation. The methodology for count data is used to analyse a dataset from a clinical trial in epileptic patients and the methodology for binary data is used to analyse a dataset from a clinical trial on a toenail infection, onychomycosis. Análise de dados longitudinais Distribuição de Bernoulli Distribuição de Poisson Inferência Bayesiana Modelos lineares generalizados Modelos mistos Bayesian inference Bernoulli distribution Generalized linear models Longitudinal data Mixed models Poisson distribution
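The overdispersion mechanism described in record 331 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the thesis's WinBUGS model: the function names, the intercept `beta0`, the random-intercept standard deviation `sigma_b` and the gamma shape `alpha` are all hypothetical values chosen for illustration. Each count is Poisson with a mean equal to a gamma-distributed multiplicative effect (mean 1) times the exponential of a linear predictor that contains a normal random intercept per subject.

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_subjects=2000, n_visits=4, beta0=1.0, sigma_b=0.5, alpha=2.0, seed=1):
    """Simulate longitudinal counts with a gamma overdispersion effect and a normal random intercept."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_subjects):
        b_i = rng.gauss(0.0, sigma_b)  # normal random intercept shared by one subject's visits
        for _ in range(n_visits):
            theta = rng.gammavariate(alpha, 1.0 / alpha)  # multiplicative gamma effect, mean 1
            counts.append(poisson_draw(rng, theta * math.exp(beta0 + b_i)))
    return counts


counts = simulate_counts()
# For a plain Poisson sample this ratio would be close to 1.
dispersion = variance(counts) / mean(counts)
print(f"mean={mean(counts):.2f}  variance/mean={dispersion:.2f}")
```

A variance-to-mean ratio well above 1 is the signature of overdispersion that a plain Poisson model cannot reproduce.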
332	Tendência de mortalidade por câncer de colo de útero e útero porção não especificada no estado de Minas Gerais – 1980 a 2005 Alves, Christiane Maria Meurer 13 February 2009 Introdução: O câncer de colo de útero, desde a década de 50, dispõe de um exame capaz de detectá-lo em fase incipiente e curável. A disponibilidade do teste de Papanicolaou parece ser a principal motivação para a queda de mortalidade por câncer de colo de útero em vários países ao redor do mundo. Buscou-se com este estudo avaliar o comportamento da mortalidade por câncer de colo de útero e útero porção não especificada, no período de 1980-2005, no Estado de Minas Gerais. Optou-se pela utilização de modelo de regressão linear e pela abordagem idade-período-coorte. Material e Métodos: Foram coletados os dados de óbito e população disponíveis no DATASUS. Para avaliação da tendência de mortalidade por idade e período, utilizou-se o modelo de regressão linear; as taxas também foram log-transformadas para que se obtivesse o percentual de mudança da mortalidade por ano. A análise período-coorte foi feita através do método não paramétrico de Tarone e Chu. Resultados: Encontrou-se queda na mortalidade por câncer de colo de útero e útero porção não especificada para a análise idade e período. 
A redução foi principalmente relacionada com os casos de câncer de útero porção não especificada. Na análise idade-período-coorte houve redução menor que a esperada para as coortes de 1901-1908 e 1921-1928. Houve redução maior que a esperada para as coortes de 1913-1920, 1929-1932, 1937-1946, 1949-1956, 1963-1970 e 1969-1976. Encontrou-se ainda redução maior que a esperada para o período de 2000-2001. Conclusão: Foi evidenciada a redução da mortalidade por câncer de colo de útero e útero porção não especificada no Estado de Minas Gerais no período estudado. Os achados mostram influência das coortes de nascimento sobre a queda da mortalidade. / Introduction: Since the 1950s, cervical cancer has had a test capable of detecting it at an early, curable stage. The availability of the Papanicolaou smear test seems to be the principal reason for the fall in cervical cancer mortality in many countries throughout the world. The aim of this study was to assess trends in mortality due to cervical cancer and cancer of the uterus not otherwise specified (NOS) from 1980 to 2005 in the state of Minas Gerais. We opted for a linear regression model and the age-period-cohort approach. Material and Methods: Death and population data available from DATASUS were collected. To assess the mortality trend by age and period, linear regression was used; the rates were also log-transformed in order to obtain the percentage change in mortality per year. The period-cohort analysis was carried out using Tarone and Chu's nonparametric method. Results: A reduction in mortality due to cervical cancer and cancer of the uterus NOS was found in the age and period analysis. The reduction was mainly related to cases of cancer of the uterus NOS. In the age-period-cohort analysis, the reduction was smaller than expected for the 1901-1908 and 1921-1928 cohorts. 
The reduction was greater than expected for the 1913-1920, 1929-1932, 1937-1946, 1949-1956, 1963-1970 and 1969-1976 cohorts. A greater-than-expected reduction was also found for the period 2000-2001. Conclusion: A reduction in mortality due to cervical cancer and cancer of the uterus NOS was evident in the state of Minas Gerais over the period studied. The findings show the influence of birth cohorts on the decrease in mortality. CNPQ::CIENCIAS DA SAUDE::SAUDE COLETIVA Neoplasias de colo de útero Mortalidade Modelos lineares Modelos idade-período-coorte Uterine cervical neoplasms Mortality Linear models Age-period-cohort models
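The log-transformation step mentioned in record 332 (regressing log rates on calendar year to obtain a percentage change per year) can be sketched as follows. The function name and the rates are hypothetical and made up for illustration; they are not the Minas Gerais data. The fit is ln(rate) = a + b * year by ordinary least squares, and the annual percent change is 100 * (exp(b) - 1).

```python
import math


def annual_percent_change(years, rates):
    """Fit ln(rate) = a + b * year by ordinary least squares and return 100 * (exp(b) - 1)."""
    logs = [math.log(rate) for rate in rates]
    n = len(years)
    year_mean = sum(years) / n
    log_mean = sum(logs) / n
    sxy = sum((y - year_mean) * (l - log_mean) for y, l in zip(years, logs))
    sxx = sum((y - year_mean) ** 2 for y in years)
    slope = sxy / sxx
    return 100.0 * (math.exp(slope) - 1.0)


# Made-up rates per 100,000 that fall by exactly 2% each year (illustration only).
years = list(range(1980, 2006))
rates = [10.0 * 0.98 ** (year - 1980) for year in years]
apc = annual_percent_change(years, rates)
print(f"annual percent change: {apc:.2f}%")
```

Because the slope is estimated on the log scale, it translates into a constant relative change per year rather than a constant absolute change.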
333	Fatores associados à proficiência em leitura e matemática : uma aplicação do modelo linear hierárquico com dados longitudinais do Projeto GERES / Factors associated with proficiency in reading and mathematics : an application of hierarchical linear models with longitudinal data of the GERES Project Dalben, Adilson, 1965- 24 August 2018 Orientadores: Luiz Carlos de Freitas, Dalton Francisco de Andrade / Tese (doutorado) - Universidade Estadual de Campinas, Faculdade de Educação / Previous issue date: 2014 / Resumo: Esta pesquisa é um estudo sobre a eficácia e equidade escolar que tem ganhado atenção especial nos países que usam as avaliações em larga escala a serviço da gestão do sistema educativo. No Brasil, que desde a década de 1990 colocou a avaliação educacional como recurso central em suas políticas educacionais, mas coletando dados seccionais, que são muito frágeis para essa finalidade. Essa fragilidade decorre da alta associação que os fatores extraescolares, sobretudo o nível socioeconômico do aluno, têm sobre as medidas de proficiência. Diante disso, foram usados dados longitudinais e a análise foi feita por meio de modelos lineares hierárquicos. Esta pesquisa teve como objetivo principal desenvolver um modelo estatístico capaz de identificar tais fatores para a realidade brasileira, considerando que a aprendizagem é um processo complexo, isto é, ela é influenciada simultaneamente por múltiplos fatores. Foram desenvolvidos modelos de valor agregado que não só identificam tais variáveis, como também caracterizam sua influência em alunos com distintas proficiências no início de cada período de escolarização. 
A base de dados utilizada nesses modelos foi fornecida pelo Projeto GERES, que, no período de 2005 a 2008, coletou dados dos mesmos alunos de 1ª a 4ª séries de uma amostra de 312 escolas em cinco grandes cidades brasileiras. Foram medidas as proficiências em Leitura e Matemática de 35.538 alunos e coletadas informações de contexto desses alunos, seus familiares, professores, diretores e escola. Após a redução do grande número de informações disponibilizadas pelo Projeto GERES, feita por meio da Análise Fatorial Exploratória (AFE), as variáveis resultantes foram reorganizadas em três arquivos usados para análise em modelos lineares hierárquicos de três níveis. Os resultados encontrados evidenciam uma significativa instabilidade nos efeitos que as variáveis têm sobre a proficiência, tanto em leitura quanto em matemática. Ao final da pesquisa, são encontrados alguns fatores que influenciam positivamente e negativamente a proficiência em Leitura e Matemática e outros que afetam especificamente cada uma dessas áreas, indicando que podem colaborar para o aumento da eficácia e da equidade das escolas. No entanto, constatam-se também algumas variáveis que têm comportamentos incoerentes com o esperado e outras com comportamentos opostos nas duas áreas. Assim, dos achados das pesquisas, comprova-se que, com base nos dados utilizados, procedimentos metodológicos e modelos estatísticos adotados, os modelos de valor agregado melhoram a confiabilidade das análises em comparação aos modelos que usam dados seccionais, mas ainda são inviáveis como ferramentas para a gestão do sistema educativo, sobretudo para o uso meritocrático de seus resultados. Dessa forma, esta pesquisa corrobora os achados de outras realizadas no âmbito internacional e permite afirmar que a qualidade da modelagem estatística depende da qualidade dos dados que busca modelar, podendo gerar distorções, estabelecer relações inesperadas ou levar a conclusões equivocadas. 
Em contrapartida, trata-se de recursos que podem ser usados no sistema educativo, fornecendo dados importantes para a orientação das políticas públicas numa perspectiva de avaliação formativa, com vistas ao melhoramento da qualidade de ensino oferecido pelas escolas e à melhor formação dos profissionais docentes e não-docentes que nelas trabalham / Abstract: This research is a study on school effectiveness and equity in Brazil, adding to a body of research that has drawn special attention in countries that use large-scale assessments in the service of education system management. Since the 1990s, Brazil has treated educational evaluation as a central resource in its national education policies, but it has relied on cross-sectional data, which are too fragile for that purpose. This fragility derives from the strong influence that out-of-school factors, particularly the students' socioeconomic status, exert on proficiency measures. Longitudinal data were therefore used, and the analysis was carried out with hierarchical linear models. The main objective of this research was to develop a statistical model to identify such factors in the Brazilian context, considering that learning is a complex process, i.e. it is simultaneously influenced by multiple factors. Value-added models were developed not only to identify such variables, but also to characterize their influence on students showing different proficiencies at the beginning of each schooling period. The database used in those models was provided by the GERES Project, which from 2005 to 2008 collected data on the same students from the 1st to the 4th grade in a sample of 312 schools in five large Brazilian cities. The Reading and Mathematics proficiencies of 35,538 students were measured, and contextual information about these students, their families, teachers, principals and schools was gathered. 
After the large amount of information made available by the GERES Project was reduced by means of Exploratory Factor Analysis (EFA), the resulting variables were reorganized into three files used for analysis in three-level hierarchical linear models. The results showed significant instability in the effects that the variables have on proficiency, both in Reading and in Mathematics. At the end of the research, some factors that influence Reading and Mathematics proficiency either positively or negatively, as well as other factors that specifically affect one of those areas, were found, indicating that they may contribute to increased school effectiveness and equity. However, some variables whose behavior was inconsistent with what was expected, and others with opposite behaviors in the two areas, were also found. Therefore, based on the data used, the methodological procedures and the statistical models adopted, the research findings show that value-added models improve the reliability of the analysis in comparison with models that use cross-sectional data, but they are still impracticable as tools for education system management, particularly for meritocratic use of their results. Hence, this research corroborates the findings of other studies carried out internationally and allows us to state that the quality of statistical modeling depends on the quality of the data it attempts to model, and that it may generate distortions, establish unexpected relationships or lead to mistaken conclusions. 
On the other hand, these resources may be used in the education system by providing important data for guiding public policies from a formative evaluation perspective, aiming at improving the quality of teaching offered by schools and the training of the teaching and non-teaching professionals who work in them / Doutorado / Ensino e Práticas Culturais / Doutor em Educação Avaliação educacional Eficácia no ensino Modelos de valor agregado Modelos lineares hierárquicos Método longitudinal Educational evaluation Teaching effectiveness Value added models Hierarchical linear models Longitudinal method
334	Diagnóstico em modelos de regressão linear e não-linear com erros simétricos / Diagnostic in linear and nonlinear regression models with symmetrical errors Reis, Sandra Santos dos, 1983- 24 August 2018 Orientador: Mauricio Enrique Zevallos Herencia / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Previous issue date: 2013 / Resumo: Neste trabalho discutimos a detecção de observações influentes em modelos simétricos lineares e não lineares. Em primeiro lugar é realizado um estudo de simulação para avaliar o desempenho de três métodos de estimação em dados gerados por quatro situações: sem observações influentes, com outliers na variável resposta, com observações influentes de média alavancagem e com observações influentes de alta alavancagem. São analisados dois métodos de máxima verossimilhança e um método robusto. Foram considerados modelos de regressão linear e não linear com erros logísticos tipo II e t-Student. Em segundo lugar é discutida detecção de observações influentes mediante a distância de Cook generalizada, a estatística de Peña e a estatística de Andrews-Pregibon. Em particular é discutida a conveniência de utilizar a metodologia de limiares para caracterizar uma observação como influente ou não influente, assim como o efeito da estimação de parâmetros na construção de limiares. Estas medidas foram aplicadas a conjuntos de dados reais e simulados considerando o ajuste de alguns modelos simétricos com uma adaptação no método de estimação scoring de Fisher / Abstract: We discuss the detection of influential observations in symmetrical linear and nonlinear regression models. 
First, a simulation study is conducted to evaluate the performance of three estimation methods on data generated under four scenarios: without influential observations, with outliers in the response variable, with influential observations of medium leverage, and with influential observations of high leverage. Two maximum likelihood methods and one robust method are analyzed. We considered linear and nonlinear regression models with logistic-II and Student-t errors. Second, the detection of influential observations by means of the generalized Cook's distance, Peña's statistic and the Andrews-Pregibon statistic is discussed. In particular, we discuss the convenience of using thresholds to classify an observation as influential or not, as well as the effect of parameter estimation on the construction of the thresholds. These measures were applied to real and simulated data sets, fitting some symmetrical regression models with an adaptation of the Fisher scoring estimation method / Mestrado / Estatistica / Mestra em Estatística Estimativa de parâmetro Modelos lineares (Estatistica) Modelos não lineares (Estatística) Observações influentes (Estatística) Parameter estimation Linear models (Statistics) Nonlinear models (Statistics) Influential observations
335	[en] APPLYING RISK CLASSIFICATION METHOD IN CAR INSURANCE MARKET / [pt] MÉTODO DE CLASSIFICAÇÃO DE RISCO APLICADO AO MERCADO DE SEGUROS DE AUTOMÓVEIS WILSON LINS MORGADO 14 February 2005 [pt] A estimação do risco em seguros de automóveis representa um difícil problema de regressão. As dificuldades vão desde a utilização de um grande número de variáveis discretas como explicativas, até a distribuição particular dos ruídos e uma quantidade expressiva de categorias com valores nulos e valores discrepantes. Supondo que os problemas de estimação estejam relacionados com a classificação do risco adotada pelo mercado, este trabalho propõe um método de classificação alternativo. O método desenvolvido foi baseado na técnica de análise fatorial, e no algoritmo de agrupamento de dados denominado fuzzy clustering system. Para avaliar a eficiência do método em solucionar os problemas de estimação, optou-se por utilizar o erro resultante da aplicação de modelos lineares generalizados. Ao final, o erro de estimação obtido diante da classificação proposta, foi comparado ao obtido diante da classificação usual de mercado. / [en] The estimation of car insurance risk is a difficult regression problem. The difficulties range from the use of a large number of discrete explanatory variables to the particular distribution of the errors and a substantial number of categories with null and outlying values. Assuming that these estimation problems are related to the risk classification adopted by the insurance market, this work proposes an alternative classification method. The method is based on factor analysis and on the data clustering algorithm known as Fuzzy Clustering System. To evaluate the efficiency of the method in solving the estimation problems, the error resulting from fitting generalized linear models was used. Finally, the estimation error obtained under the proposed classification was compared with the error obtained under the usual market classification. 
[pt] ANALISE FATORIAL [en] FACTOR ANALYSIS [pt] MODELOS LINEARES GENERALIZADOS [en] GENERALIZED LINEAR MODELS [en] CAR INSURANCE RATEMAKING [pt] CLASSIFICACAO DO RISCO [en] RISK CLASSIFICATION
336	Kernel-based machine learning for tracking and environmental monitoring in wireless sensor networks / Méthodes à noyaux pour le suivi de cibles et la surveillance de l'environnement dans les réseaux de capteurs Mahfouz, Sandy 14 October 2015 Cette thèse porte sur les problèmes de localisation et de surveillance de champ de gaz à l'aide de réseaux de capteurs sans fil. Nous nous intéressons d'abord à la géolocalisation des capteurs et au suivi de cibles. Nous proposons ainsi une approche exploitant la puissance des signaux échangés entre les capteurs et appliquant les méthodes à noyaux avec la technique de fingerprinting. Nous élaborons ensuite une méthode de suivi de cibles, en se basant sur l'approche de localisation proposée. Cette méthode permet d'améliorer la position estimée de la cible en tenant compte de ses accélérations, et cela à l'aide du filtre de Kalman. Nous proposons également un modèle semi-paramétrique estimant les distances inter-capteurs en se basant sur les puissances des signaux échangés entre ces capteurs. Ce modèle est une combinaison du modèle physique de propagation avec un terme non linéaire estimé par les méthodes à noyaux. Les données d'accélérations sont également utilisées ici avec les distances, pour localiser la cible, en s'appuyant sur un filtrage de Kalman et un filtrage particulaire. Dans un autre contexte, nous proposons une méthode pour la surveillance de la diffusion d'un gaz dans une zone d'intérêt, basée sur l'apprentissage par noyaux. Cette méthode permet de détecter la diffusion d'un gaz en utilisant des concentrations relevées régulièrement par des capteurs déployés dans la zone. Les concentrations mesurées sont ensuite traitées pour estimer les paramètres de la source de gaz, notamment sa position et la quantité du gaz libéré / This thesis focuses on the problems of localization and gas field monitoring using wireless sensor networks. First, we focus on the geolocalization of sensors and target tracking. 
Using the powers of the signals exchanged between sensors, we propose a localization method combining radio-location fingerprinting and kernel methods from statistical machine learning. Based on this localization method, we develop a target tracking method that refines the estimated position of the target by combining it with acceleration information using the Kalman filter. We also provide a semi-parametric model that estimates the distances separating sensors based on the powers of the signals exchanged between them. This semi-parametric model is a combination of the well-known log-distance propagation model with a non-linear fluctuation term estimated within the framework of kernel methods. The target's position is estimated by combining acceleration information with the distances separating the target from the sensors, using either the Kalman filter or the particle filter. In another context, we study gas diffusion in wireless sensor networks, also using machine learning. We propose a method that allows the detection of multiple gas diffusions based on concentration measurements regularly collected from the studied region. The method then estimates the parameters of the multiple gas sources, including the sources' locations and their release rates Réseaux de capteurs (technologie) Apprentissage automatique Modèles non linéaires (statistique) Traitement du signal Kalman, Filtrage de Sensor networks Machine learning Non-linear models Signal processing Kalman filtering 621.384
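The parametric part of the semi-parametric distance model in record 336, the log-distance propagation model, can be sketched as below. This is an illustration only: the function names, the reference power `p0_dbm`, the path-loss exponent and the reference distance `d0_m` are hypothetical values, and the kernel-estimated non-linear fluctuation term described in the abstract is deliberately left out. The model assumed here is RSSI(d) = P0 - 10 * n * log10(d / d0), which can be inverted to give a distance from a measured signal power.

```python
import math


def rssi_at_distance(distance_m, p0_dbm=-40.0, path_loss_exponent=2.5, d0_m=1.0):
    """Received power (dBm) predicted by the log-distance model at a given distance."""
    return p0_dbm - 10.0 * path_loss_exponent * math.log10(distance_m / d0_m)


def distance_from_rssi(rssi_dbm, p0_dbm=-40.0, path_loss_exponent=2.5, d0_m=1.0):
    """Invert the log-distance model to estimate distance (m) from received power (dBm)."""
    return d0_m * 10.0 ** ((p0_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


rssi = rssi_at_distance(10.0)        # predicted power 10 m from the transmitter
estimate = distance_from_rssi(rssi)  # the round trip recovers the distance
print(f"rssi={rssi:.1f} dBm  estimated distance={estimate:.2f} m")
```

In practice measured powers deviate from this curve, which is the gap that a learned correction term is meant to absorb.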
337	Sequential detection and isolation of cyber-physical attacks on SCADA systems / Détection et localisation séquentielle d’attaques cyber-physiques aux systèmes SCADA Do, Van Long 17 November 2015 Cette thèse s’inscrit dans le cadre du projet « SCALA » financé par l’ANR à travers le programme ANR-11-SECU-0005. Son objectif consiste à surveiller des systèmes de contrôle et d’acquisition de données (SCADA) contre des attaques cyber-physiques. Il s'agit de résoudre un problème de détection-localisation séquentielle de signaux transitoires dans des systèmes stochastiques et dynamiques en présence d'états inconnus et de bruits aléatoires. La solution proposée s'appuie sur une approche par redondance analytique composée de deux étapes : la génération de résidus, puis leur évaluation. Les résidus sont générés de deux façons distinctes, avec le filtre de Kalman ou par projection sur l’espace de parité. Ils sont ensuite évalués par des méthodes d’analyse séquentielle de rupture selon de nouveaux critères d’optimalité adaptés à la surveillance des systèmes à sécurité critique. Il s'agit donc de minimiser la pire probabilité de détection manquée sous la contrainte de niveaux acceptables pour la pire probabilité de fausse alarme et la pire probabilité de fausse localisation. Pour la tâche de détection, le problème d’optimisation est résolu dans deux cas : les paramètres du signal transitoire sont complètement connus ou seulement partiellement connus. Les propriétés statistiques des tests sous-optimaux obtenus sont analysées. Des résultats préliminaires pour la tâche de localisation sont également proposés. Les algorithmes développés sont appliqués à la détection et à la localisation d'actes malveillants dans un réseau d’eau potable / This PhD thesis was carried out within the “SCALA” project, funded by the ANR through the program ANR-11-SECU-0005. 
Its objective is the on-line monitoring of Supervisory Control And Data Acquisition (SCADA) systems against cyber-physical attacks. The problem is formulated as the sequential detection and isolation of transient signals in stochastic dynamical systems in the presence of unknown system states and random noises. It is solved using the analytical redundancy approach, which consists of two steps: residual generation and residual evaluation. The residuals are first generated by both the Kalman filter and parity space approaches. They are then evaluated using sequential analysis techniques that take certain optimality criteria into account. However, the classical criteria are not adequate for the surveillance of safety-critical infrastructures. For such applications, it is suggested to minimize the worst-case probability of missed detection subject to acceptable levels of the worst-case probabilities of false alarm and false isolation. For the detection task, the optimization problem is formulated and solved in two scenarios: exactly known and partially known parameters. Sub-optimal tests are obtained and their statistical properties are investigated. Preliminary results for the isolation task are also obtained. The proposed algorithms are applied to the detection and isolation of malicious attacks on a simple SCADA water network Analyse séquentielle Détection du signal Rupture (statistique) Modèles linéaires (statistique) Criminalité informatique Sequential analysis Signal detection Change-point problems Linear models (Statistics) Computer crimes 621.382 2
338	MODELOS DE PREVISÃO APLICADOS AO CONTROLE DE QUALIDADE COM DADOS AUTOCORRELACIONADOS / FORECAST MODEL APPLIED TO QUALITY CONTROL WITH AUTOCORRELATIONAL DATA Klidzio, Regiane 04 September 2009 This research addresses forecasting models applied to industrial production processes, with the objective of verifying process stability through control charts applied to the residuals of linear and non-linear models. Because the observations were autocorrelated, it was necessary to find a mathematical model that produces independent and identically distributed residuals. The investigation of process stability includes verifying the influence of volatility on the detection of sample points that can affect the performance of the production process. This demonstrates the existence of volatility in production processes, a feature that until now has been studied only in economic variables. The data analyzed come from three companies in different sectors. The mathematical models fitted were a multivariate dynamic regression model, an ARIMA model and an ARIMA-ARCH model. According to the control charts, the statistical techniques used to eliminate serial autocorrelation proved statistically adequate when compared with the classical model used by each company analyzed. In addition, the periods in which volatility occurs were found to correspond to the out-of-control periods detected by the Shewhart control charts. The mathematical models were able to represent the production processes, making it possible to understand the behavior of the variables, and they helped in forecasting and monitoring the process. / A presente pesquisa tem como tema a abordagem de modelos de previsão, aplicados a processos produtivos industriais, com o objetivo de verificar a estabilidade do processo por meio de gráficos de controle, aplicado aos resíduos oriundos de modelagem linear e não-linear. 
Como as observações eram autocorrelacionadas, foi necessário buscar um modelo matemático pelo qual foram obtidos resíduos independentes e normalmente distribuídos. A investigação da estabilidade do processo passa pela verificação da influência da volatilidade na detecção de pontos amostrais que são potenciais para afetar o desempenho do processo produtivo. Esse fato comprova a existência da volatilidade em processos produtivos que, até o momento, é trabalhada apenas em variáveis econômicas. Os dados utilizados para análise pertencem a três empresas de segmentos distintos. O modelo matemático foi ajustado utilizando modelo de regressão dinâmica multivariada, modelo ARIMA e modelo ARIMA-ARCH. De acordo com os gráficos de controle, as técnicas estatísticas empregadas para eliminar a autocorrelação serial dos dados mostraram-se adequadas estatisticamente, se comparadas com o modelo clássico utilizado por cada empresa analisada. Além disso, verificou-se que, no período que ocorre volatilidade corresponde a um período fora de controle detectado nos gráficos de controle de Shewhart. Os modelos matemáticos encontrados foram capazes de representar os processos produtivos, possibilitando compreender o comportamento das variáveis e auxiliaram na realização das previsões e na monitoração do processo. Séries temporais Modelos lineares e não-lineares Autocorrelação Previsão Gráficos de controle Time series Linear and non-linear models Autocorrelation Forecast Control charts
339	[en] IDENTIFICATION MECHANISMS OF SPURIOUS DIVISIONS IN THRESHOLD AUTOREGRESSIVE MODELS / [pt] MECANISMOS DE IDENTIFICAÇÃO DE DIVISÕES ESPÚRIAS EM MODELOS DE REGRESSÃO COM LIMIARES ANGELO SERGIO MILFONT PEREIRA 10 December 2002 (has links) [pt] O objetivo desta dissertação é propor um mecanismo de testes para a avaliação dos resultados obtidos em uma modelagem TS-TARX. A principal motivação é encontrar uma solução para um problema comum na modelagem TS-TARX: os modelos espúrios que são gerados durante o processo de divisão do espaço das variáveis independentes. O modelo é uma heurística baseada em análise de árvore de regressão, como discutido por Breiman (3, 1984). O modelo proposto para a análise de séries temporais é chamado TARX (Threshold Autoregressive with eXternal variables). A idéia central é encontrar limiares que separem regimes que podem ser explicados através de modelos lineares. Este processo é um algoritmo que preserva o método de regressão por mínimos quadrados recursivo (MQR). Combinando a árvore de decisão com a técnica de regressão (MQR), o modelo se tornou o TS-TARX (Tree Structured - Threshold AutoRegression with external variables). Será estendido aqui o trabalho iniciado por Aranha em (1, 2001), onde, a partir de uma base de dados conhecida, um algoritmo eficiente gera uma árvore de decisão por meio de regras, e as equações de regressão estimadas para cada um dos regimes encontrados. Este procedimento pode gerar alguns modelos espúrios ou por construção, devido à divisão binária da árvore, ou pelo fato de não existir neste momento uma metodologia de comparação dos modelos resultantes. Será proposta uma metodologia através de sucessivos testes de Chow (5, 1960) que identificará modelos espúrios e reduzirá a quantidade de regimes encontrados, e consequentemente de parâmetros a estimar. 
A complexidade do modelo final gerado é reduzida a partir da identificação de redundâncias, sem perder o poder preditivo dos modelos TS-TARX. O trabalho conclui com exemplos ilustrativos e algumas aplicações em bases de dados sintéticas, e casos reais que auxiliarão o entendimento. / [en] The goal of this dissertation is to propose a testing mechanism for evaluating the results obtained from the TS-TARX modeling procedure. The main motivation is to solve a common problem in TS-TARX modeling: spurious models are generated while dividing the space of the independent variables. The model is a heuristic based on regression tree analysis, as discussed by Breiman (3, 1984). The model used for time series analysis is called TARX (Threshold Autoregressive with eXternal variables). The main idea is to find thresholds that split the independent variable space into regimes, each of which can be described by a local linear model. This process preserves the recursive least squares regression method. Combining regression tree analysis with recursive least squares regression, the model becomes TS-TARX (Tree Structured - Threshold Autoregression with eXternal variables). The work initiated by Aranha in (1, 2001) will be extended. In that work, starting from a known database, an efficient algorithm generates a decision tree based on splitting rules, together with the regression equations estimated for each of the regimes found. Spurious models may arise either by construction, owing to the binary splitting of the tree, or because no methodology for comparing the resulting models has yet been proposed. To fill this gap, a methodology based on a series of consecutive Chow tests (5, 1960) will be proposed. The Chow tests provide the tools to identify spurious models and to reduce the number of regimes found. 
The complexity of the final model and the number of parameters to estimate are therefore reduced by identifying and eliminating redundancies, without compromising the predictive power of the TS-TARX model. The work concludes with illustrative examples and some applications to real data that will aid the reader's understanding. [en] NONLINEAR TIME SERIES ANALYSIS [pt] MODELOS LINEARES POR PARTES [en] PIECEWISE LINEAR MODELS [pt] TS-TARX [en] TS-TARX [pt] TESTE DE CHOW [en] CHOW TEST
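The Chow test named in entry 339 compares one pooled regression with separate regressions fitted to two regimes. The sketch below computes the standard Chow F statistic for the simplest case of one regressor plus an intercept (k = 2 parameters per regime). It is an illustration only, not the TS-TARX procedure from the dissertation, and `ols_rss` and `chow_f` are hypothetical names.

```python
def ols_rss(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for two regimes given as lists.

    k is the number of parameters per regime (intercept and slope here).
    """
    rss_pooled = ols_rss(x1 + x2, y1 + y2)            # one line for all data
    rss_split = ols_rss(x1, y1) + ols_rss(x2, y2)     # one line per regime
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)
```

The statistic would be compared against an F distribution with k and n1 + n2 - 2k degrees of freedom. A small value suggests that the two regimes are not distinguishable and could be merged, which is the sense in which the abstract speaks of identifying spurious divisions.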
340	Des poissons sous influence ? : une analyse à large échelle des relations entre les gradients abiotiques et l’ichtyofaune des estuaires tidaux européens / Fish under influence? : a large-scale analysis of relations between abiotic gradients and fish assemblages of European tidal estuaries Nicolas, Delphine 02 July 2010 (has links) Cette thèse cherche à déterminer l’influence de l’environnement abiotique sur la structure des assemblages de poissons dans les estuaires européens tidaux à partir d’une approche macroécologique. L’environnement abiotique de 135 estuaires, du Portugal à l’Ecosse, est caractérisé par une quinzaine de descripteurs en utilisant une approche écohydrologique. Les assemblages de poissons d’une centaine d’estuaires sont caractérisés par les données de pêche acquises au cours de campagnes scientifiques conduites dans le cadre de la Directive-Cadre européenne sur l’Eau (DCE). Néanmoins, ces données sont souvent hétérogènes du fait des différences entre les protocoles d’échantillonnage utilisés. Afin de limiter cette hétérogénéité, une sélection rigoureuse et une procédure de standardisation des données ont été effectuées. Les assemblages de poissons sont décrits à l’aide d’indices globaux ou fonctionnels relatifs à la richesse spécifique et à l’abondance. A l’aide de modèles linéaires généralisés, des relations sont établies entre des attributs de l’ichtyofaune et des gradients abiotiques à large échelle et au sein de l’estuaire. La richesse spécifique totale, et en particulier celle des espèces marines et migratrices amphihalines, augmente avec la taille de l’estuaire. De plus, elle apparaît plus élevée dans les estuaires associés à un large plateau continental. Les plus fortes densités totales et, en particulier, celles des espèces résidentes et marines, sont associées aux estuaires présentant une grande proportion en zones intertidales. 
Les assemblages de poissons estuariens apparaissent fortement structurés par le gradient de salinité à la fois en termes de richesse spécifique et de densité. En parallèle, cette thèse apporte des éléments témoignant d’un décalage vers le Nord de plusieurs espèces de poissons estuariens dans le contexte du réchauffement climatique global. Les résultats de cette thèse contribueront à l’amélioration des indicateurs biotiques basés sur l’ichtyofaune qui sont actuellement développés dans le contexte de la DCE. / Based on a macroecological approach, this thesis aims to determine the influence of the abiotic environment on the structure of fish assemblages in European tidal estuaries. The abiotic environment of 135 North-Eastern Atlantic estuaries, from Portugal to Scotland, was characterised by about fifteen descriptors using an ecohydrological approach. The fish assemblages of about a hundred estuaries were characterised using fish data collected during scientific surveys conducted in the context of the European Water Framework Directive (WFD). Nonetheless, differences among sampling protocols resulted in highly heterogeneous datasets. To limit this heterogeneity, rigorous data selection and standardisation procedures were carried out. Fish assemblages were described by total or functional indices related to species richness or abundance. Relationships between fish attributes and abiotic gradients, both large-scale and intra-estuarine, were identified by fitting generalised linear models. Results showed that the total number of species, and in particular of marine and diadromous species, increased with estuary size. Moreover, total species richness appeared higher in estuaries associated with a wide continental shelf. The greatest total densities, and in particular those of resident and marine species, were associated with estuaries with a large proportion of intertidal areas. 
Fish assemblages also appeared strongly structured by the salinity gradient in terms of both species richness and density. Furthermore, this thesis provides some evidence of a northward shift of several estuarine fish species in the context of global warming. The results of this thesis will contribute to improving the fish-based indicators that are currently being developed in the context of the European WFD. Assemblages de poissons Filtres environnementaux Estuaires tidaux Modèles linéaires généralisés Europe Macroécologie Fish assemblages Tidal estuaries Europe Macroecology Environmental filters Generalised linear models
