171

Impact of unbalancedness and heteroscedasticity on classic parametric significance tests of two-way fixed-effects ANOVA tests

Chaka, Lyson 31 October 2017 (has links)
Classic parametric statistical tests, such as the analysis of variance (ANOVA), are powerful tools for comparing population means. These tests produce accurate results provided the data satisfy underlying assumptions such as homoscedasticity and balancedness; otherwise, biased results are obtained. However, these assumptions are rarely satisfied in real life, so alternative procedures must be explored. This thesis investigates the impact of heteroscedasticity and unbalancedness on effect sizes in two-way fixed-effects ANOVA models. A real-life dataset, from which three different samples were simulated, was used to investigate the changes in effect sizes under the influence of unequal variances and unbalancedness. The parametric bootstrap approach was proposed for cases of unequal variances and non-normality. The results indicated that heteroscedasticity significantly inflates effect sizes, while unbalancedness has a non-significant impact on effect sizes in two-way ANOVA models. However, the impact worsens when the data are both unbalanced and heteroscedastic. / Statistics / M. Sc. (Statistics)
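The effect sizes discussed in this abstract can be illustrated with a minimal sketch. The abstract does not say which effect-size measure the thesis uses, so the code below assumes classical eta-squared (sum of squares of an effect divided by the total sum of squares) for a balanced two-way layout. The function names `eta_squared_two_way` and `simulate_cells` are hypothetical and only serve the illustration.

```python
import numpy as np

def eta_squared_two_way(y):
    """Eta-squared for a balanced two-way fixed-effects layout.

    y has shape (a, b, n): a levels of factor A, b levels of factor B,
    n replicates per cell. Assumption: classical eta-squared, which may
    differ from the measure used in the thesis.
    """
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    grand = y.mean()
    mean_a = y.mean(axis=(1, 2))   # marginal means of factor A
    mean_b = y.mean(axis=(0, 2))   # marginal means of factor B
    cell = y.mean(axis=2)          # cell means
    ss_a = b * n * np.sum((mean_a - grand) ** 2)
    ss_b = a * n * np.sum((mean_b - grand) ** 2)
    ss_ab = n * np.sum((cell - mean_a[:, None] - mean_b[None, :] + grand) ** 2)
    ss_total = np.sum((y - grand) ** 2)
    return {"A": ss_a / ss_total, "B": ss_b / ss_total, "AB": ss_ab / ss_total}

def simulate_cells(cell_means, cell_sds, n, seed=0):
    """Draw n normal replicates per cell; unequal cell_sds give heteroscedastic data."""
    rng = np.random.default_rng(seed)
    cell_means = np.asarray(cell_means, dtype=float)
    cell_sds = np.asarray(cell_sds, dtype=float)
    noise = rng.normal(size=cell_means.shape + (n,))
    return cell_means[..., None] + cell_sds[..., None] * noise

# Usage: same cell means, equal versus unequal cell standard deviations.
means = np.array([[0.0, 0.5], [1.0, 1.5]])
homo = eta_squared_two_way(simulate_cells(means, np.ones((2, 2)), n=20))
hetero = eta_squared_two_way(simulate_cells(means, [[0.5, 1.0], [2.0, 4.0]], n=20))
```

Comparing `homo` and `hetero` over many seeds is one simple way to see how unequal variances shift the estimated effect sizes. It does not reproduce the thesis's bootstrap procedure.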
172

Robust Nonparametric Sequential Distributed Spectrum Sensing under EMI and Fading

Sahasranand, K R January 2015 (has links) (PDF)
Opportunistic use of unused spectrum can be carried out efficiently using the paradigm of Cognitive Radio (CR). A spectrum band remains idle when the primary user (the licensee) is not using it. The secondary nodes must detect this spectral hole quickly, use it for data transmission during the idle interval, and stop transmitting once the primary user resumes. Detection of spectral holes by the secondary nodes is called spectrum sensing in the CR scenario. Spectrum sensing is formulated as a hypothesis testing problem in which the spectrum is free under H0 and occupied under H1. The samples have different probability distributions, P0 and P1, under H0 and H1 respectively. In the first part of the thesis, a new algorithm, the entropy test, is presented; it performs better than the available algorithms when P0 is known but P1 is not. It is also extended to a distributed setting, in which different secondary nodes collect samples independently and send their decisions over a noisy MAC to a Fusion Centre (FC), which then makes the final decision. The asymptotic optimality of the algorithm is also shown. In the second part, the spectrum sensing problem is tackled under impediments such as fading, electromagnetic interference and outliers. Here the detector does not have full knowledge of either P0 or P1, which is a more general and practically relevant setting. It is found that a recently developed algorithm (which we call the random walk test) works well under suitable modifications. The performance of the algorithm is demonstrated both theoretically and via simulations. The same algorithm is extended to the distributed setting described above.
173

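Estudo comparativo de gráficos de probabilidade normal para análise de experimentos fatoriais não replicados / Comparative study of normal probability plots for the analysis of unreplicated factorial experiments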

Nóbrega, Manassés Pereira 17 May 2010 (has links)
Two-level factorial designs are widely used in industrial experimentation. However, the more factors such a design considers, the more runs are needed to perform the experiment, and replicating the treatments may not be feasible given limited resources and time, which makes the experiment expensive. In these cases, unreplicated designs are used. But with only one replicate there is no internal estimate of experimental error with which to judge the significance of the observed effects. One possible solution to this problem is to use normal plots or half-normal plots of the effects. Many experimenters use the normal plot, while others prefer the half-normal plot, and often, in both cases, without justification. The controversy about the use of these two graphical techniques motivates this work, since there is no record of a formal procedure or statistical test that indicates "which one is best". The choice between the two plots seems to be a subjective issue. The central objective of this master's thesis is, then, to perform an experimental comparative study of the normal plot and the half-normal plot in the context of the analysis of unreplicated $2^k$ factorial experiments. This study involves the construction of simulated scenarios in which the performance of the plots in detecting significant effects and identifying outliers is evaluated, in order to answer the following questions: Can one plot be better than the other? In which situations? What information does one plot add to the analysis of the experiment that might complement the information provided by the other? What are the restrictions on the use of each plot?
This work thus sets out to confront the two techniques and to examine them simultaneously in order to identify similarities, differences or relationships that contribute to a theoretical reference for justifying, or aiding, the experimenter's decision about which of the two graphical techniques to use and why. The simulation results show that the half-normal plot is better for judging the effects, while the normal plot is recommended for detecting outliers in the data.
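The quantities behind a half-normal plot of effects can be sketched as follows. This is a minimal illustration, not the procedure of the thesis: it assumes the runs are listed in the order produced by `itertools.product([-1, 1], repeat=k)`, labels factors A, B, C, ..., and uses the common plotting position Φ⁻¹(0.5 + 0.5·(i − 0.5)/m) for the i-th smallest absolute effect among m. The function names are hypothetical.

```python
from itertools import combinations, product
from statistics import NormalDist

import numpy as np

def factorial_effects(y, k):
    """Estimate all 2^k - 1 effects of an unreplicated 2^k design.

    Assumes y is ordered like product([-1, 1], repeat=k), with the last
    factor changing fastest.
    """
    design = np.array(list(product([-1, 1], repeat=k)), dtype=float)
    y = np.asarray(y, dtype=float)
    letters = "ABCDEFGHIJ"[:k]
    effects = {}
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            # Contrast column of a main effect or interaction.
            contrast = design[:, list(subset)].prod(axis=1)
            label = "".join(letters[i] for i in subset)
            effects[label] = float(contrast @ y) / 2 ** (k - 1)
    return effects

def half_normal_points(effects):
    """Return (label, |effect|, half-normal quantile) sorted by |effect|."""
    ordered = sorted(effects.items(), key=lambda item: abs(item[1]))
    m = len(ordered)
    nd = NormalDist()
    return [
        (label, abs(value), nd.inv_cdf(0.5 + 0.5 * (i - 0.5) / m))
        for i, (label, value) in enumerate(ordered, start=1)
    ]

# Usage: a 2^2 design in which only factor A is active.
points = half_normal_points(factorial_effects([7, 7, 13, 13], k=2))
```

Plotting the absolute effects against the quantiles gives the half-normal plot; effects that fall far from the line through the small effects are candidates for being active.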
174

O uso de quase U-estatísticas para séries temporais uni e multivaridas / The use of quasi U-statistics for univariate and multivariate time series

Valk, Marcio 17 August 2018 (has links)
Advisor: Aluísio de Souza Pinheiro / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Previous issue date: 2011 / Abstract: Classification and clustering of time series are problems widely explored in the current literature, and many techniques have been presented to solve them. However, the restrictions these techniques require generally make the procedures specific and applicable only to a certain class of time series. Moreover, many of these approaches are empirical. We present methods for classification and clustering of time series based on quasi U-statistics (Pinheiro et al. (2009) and Pinheiro et al. (2010)).
The kernels of the U-statistics are metrics based on tools well known in the time series literature, including the sample autocorrelation and the periodogram. Three main situations are considered: univariate time series, multivariate time series, and time series with outliers. The asymptotic normality of the proposed tests is demonstrated for a wide class of metrics and models. The methods are also studied by simulation and illustrated by application to real data. / Doctorate / Statistics / Doctor of Statistics
175

Extensões dos modelos de regressão quantílica bayesianos / Extensions of bayesian quantile regression models

Bruno Ramos dos Santos 29 April 2016 (has links)
This thesis proposes extensions of Bayesian quantile regression models for proportion data with zero inflation and for data censored at zero. Initially, an analysis of influential observations is suggested, based on the location-scale mixture representation of the asymmetric Laplace distribution, in which the posterior distributions of the latent variables are compared in order to identify possible outlying observations.
Next, a two-part model is proposed to analyze proportion data with zero or one inflation, studying the conditional quantiles and the probability of the response variable being equal to zero. In addition, Bayesian quantile regression models are proposed for continuous data with a discrete component at zero, where part of these observations is assumed to be censored. These models may be considered more complete for the analysis of this type of data, since the censoring probability is assessed for each quantile of interest. Finally, an application of these models with spatial correlation is considered, in order to study data from the 2014 presidential election in Brazil. In this case, the quantile regression models are able to incorporate spatial dependence through the asymmetric Laplace process. An R package was developed for all the proposed models, and it is exemplified in the appendix.
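The link between quantile regression and the asymmetric Laplace distribution mentioned in this abstract can be sketched briefly. The code assumes the standard parameterisation with density τ(1 − τ)/σ · exp(−ρ_τ((y − μ)/σ)), where ρ_τ(u) = u(τ − 1{u < 0}) is the check loss; maximising this likelihood in μ is then equivalent to minimising the check loss. The function names are hypothetical and are not taken from the R package described in the thesis.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0).astype(float))

def asymmetric_laplace_logpdf(y, mu, sigma, tau):
    """Log-density of the asymmetric Laplace distribution (assumed standard form)."""
    y = np.asarray(y, dtype=float)
    return np.log(tau * (1.0 - tau) / sigma) - check_loss((y - mu) / sigma, tau)

def best_constant_quantile(y, tau):
    """Data point that minimises the total check loss, i.e. a sample tau-quantile."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - q, tau).sum() for q in y]
    return float(y[int(np.argmin(losses))])

# Usage: the median (tau = 0.5) of a small sample with one large value.
median_fit = best_constant_quantile([1, 2, 3, 4, 10], tau=0.5)
```

In a regression setting μ would be replaced by a linear predictor, and the log-density above would serve as a working likelihood inside a Bayesian sampler.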
176

Algoritmy pro detekci anomálií v datech z klinických studií a zdravotnických registrů / Algorithms for anomaly detection in data from clinical trials and health registries

Bondarenko, Maxim January 2018 (has links)
This master's thesis deals with anomaly detection in data from clinical trials and medical registries. The purpose of this work is to review the literature on data quality in clinical trials and to design our own algorithm, based on machine learning methods, for detecting anomalous records in real clinical data from current or completed clinical trials or medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: import of data from the information system; preprocessing and transformation of the imported data records, whose variables have different data types, into numerical vectors; application of well-known statistical methods for outlier detection; and evaluation of the quality and accuracy of the algorithm. The output of the algorithm is a vector of parameters containing anomalies, which is intended to make the data manager's work easier. The algorithm is designed to extend the functions of the CLADE-IS information system with automatic monitoring of data quality through the detection of anomalous records.
178

Dolovací modul systému pro dolování z dat na platformě NetBeans / Data Mining Module of a Data Mining System on NetBeans Platform

Výtvar, Jaromír January 2010 (has links)
The aim of this work is to give a basic overview of the process of obtaining knowledge from databases (data mining) and to analyze the data mining system developed at FIT BUT on the NetBeans platform in order to create a new mining module. We decided to implement a module for mining outliers and to extend the existing regression module with multiple linear regression using generalized linear models. The new methods build on existing methods of Oracle Data Mining.
179

Uncertainty in radar emitter classification and clustering / Gestion des incertitudes en identification des modes radar

Revillon, Guillaume 18 April 2019 (has links)
In Electronic Warfare, radar signal identification is a major asset for decision making in military tactical situations. By providing information about the presence of threats, classification and clustering of radar signals play a significant role in ensuring that countermeasures against those threats are well chosen and in enabling the detection of unknown radar signals to update databases. Most of the time, Electronic Support Measures systems receive mixtures of signals from different radar emitters in the electromagnetic environment. Hence a radar signal, described by a pulse-to-pulse modulation pattern, is often only partially observed because of missing measurements and measurement errors. The identification process relies on statistical analysis of basic measurable parameters of a radar signal, which constitute both quantitative and qualitative data. Many general and practical approaches based on data fusion and machine learning have been developed; they traditionally proceed through feature extraction, dimensionality reduction, and classification or clustering. However, these algorithms cannot handle missing data, and imputation methods are required to generate data before they can be used. Hence, the main objective of this work is to define a classification and clustering framework that handles both outliers and missing values for any type of data. An approach based on mixture models is developed, since mixture models provide a mathematically grounded, flexible and meaningful framework for a wide variety of classification and clustering requirements.
The proposed approach focuses on the introduction of latent variables that make it possible to handle the sensitivity of the model to outliers and to allow a less restrictive modelling of missing data. A Bayesian treatment is adopted for model learning, supervised classification and clustering, and inference is carried out through a variational Bayesian approximation, since the joint posterior distribution of the latent variables and parameters is intractable. Numerical experiments on synthetic and real data show that the proposed method provides more accurate results than standard algorithms.
180

Robust methods in multivariate time series / Méthodes robustes dans les séries chronologiques multivariées / Métodos robustos em séries temporais multivariadas

Aranda Cotta, Higor Henrique 22 August 2019 (has links)
This manuscript proposes new robust estimation methods for the autocovariance and autocorrelation matrix functions of stationary multivariate time series that may contain random additive outliers. These functions play an important role in the identification and estimation of time series model parameters. We first propose new estimators of the autocovariance and autocorrelation matrix functions, constructed with a spectral approach based on the periodogram matrix, which is the natural estimator of the spectral density matrix. As with the classical estimators of the autocovariance and autocorrelation matrix functions, these estimators are affected by aberrant observations. Thus, any identification or estimation procedure that uses them is directly affected, which leads to erroneous conclusions. To mitigate this problem, we propose the use of robust statistical techniques to create estimators that are resistant to random aberrant observations.
As a first step, we propose new estimators of the autocovariance and autocorrelation functions of univariate time series. The time and frequency domains are linked by the relationship between the autocovariance function and the spectral density. As the periodogram is sensitive to aberrant data, we obtain a robust estimator by replacing it with the $M$-periodogram. The $M$-periodogram is obtained by replacing the Fourier coefficients of the periodogram computed by standard least squares regression with those computed by robust $M$-regression. The asymptotic properties of the estimators are established. Their performance is studied by means of numerical simulations for different sample sizes and different contamination scenarios. The empirical results indicate that the proposed methods give values close to those obtained by the classical autocorrelation function when the data are not contaminated, and that they are resistant to different contamination scenarios. Thus, the estimators proposed in this thesis are alternative methods that can be used for time series with or without outliers.
The estimators obtained for univariate time series are then extended to the case of multivariate series. This extension is simplified by the fact that the calculation of the cross-periodogram involves only the Fourier coefficients of each component of the series. Thus, the $M$-periodogram matrix is a robust alternative to the periodogram matrix for building robust estimators of the autocovariance and autocorrelation matrix functions. The asymptotic properties are studied and numerical experiments are performed. As an example of an application with real data, we use the proposed functions to fit an autoregressive model by the Yule-Walker method to pollution data collected in the Vitória region of Brazil.
Finally, the robust estimation of the number of factors in high-dimensional factor models is considered in order to reduce dimensionality. It is well known that random additive outliers affect the covariance and correlation matrices, and that techniques which depend on the calculation of their eigenvalues and eigenvectors, such as principal component analysis and factor analysis, are affected as well. Thus, in the presence of outliers, the information criteria proposed by Bai & Ng (2002) tend to overestimate the number of factors. To alleviate this problem, we propose to replace the standard covariance matrix with the robust covariance matrix proposed in this manuscript. Our Monte Carlo simulations show that, in the absence of contamination, the standard and robust methods are equivalent. In the presence of outliers, the estimated number of factors increases with the non-robust methods, while it remains the same with the robust methods. As an application with real data, we study PM$_{10}$ pollutant concentrations measured in the Île-de-France region of France.
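The construction of the $M$-periodogram described in this abstract can be sketched for a univariate series. At each Fourier frequency the series is regressed on a cosine and a sine; with least squares coefficients the quantity n/(8π)·(β₁² + β₂²) equals the classical periodogram, and replacing least squares by an M-regression gives a robust version. The abstract does not specify the M-estimator, so the sketch assumes a Huber loss fitted by iteratively reweighted least squares with a MAD scale estimate; the tuning constant `c=1.345` and all function names are assumptions for illustration.

```python
import numpy as np

def _fourier_design(n, j):
    """Cosine and sine regressors at the Fourier frequency 2*pi*j/n."""
    t = np.arange(n)
    w = 2.0 * np.pi * j / n
    return np.column_stack([np.cos(w * t), np.sin(w * t)])

def _huber_fit(X, y, c=1.345, n_iter=50):
    """Huber M-regression by iteratively reweighted least squares (assumed choice)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale from the median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        w = np.minimum(1.0, c * scale / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

def m_periodogram(x, c=1.345):
    """Robust periodogram at the Fourier frequencies j = 1, ..., (n - 1) // 2."""
    x = np.asarray(x, dtype=float)
    x = x - np.median(x)           # robust centring
    n = x.size
    js = np.arange(1, (n - 1) // 2 + 1)
    values = np.empty(js.size)
    for i, j in enumerate(js):
        beta = _huber_fit(_fourier_design(n, j), x, c)
        values[i] = n / (8.0 * np.pi) * np.sum(beta ** 2)
    return js, values

# Usage: white noise with a few additive outliers.
rng = np.random.default_rng(1)
series = rng.normal(size=128)
series[[10, 50, 90]] += 15.0
freq_index, robust_values = m_periodogram(series)
```

Setting `c=np.inf` makes every weight equal to one, so the function then returns the classical periodogram; this gives a simple consistency check of the scaling. A robust autocovariance estimate would follow by inverting the spectral relationship, which is not shown here.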
