Global ETD Search

181	Uncertainty in radar emitter classification and clustering / Gestion des incertitudes en identification des modes radar Revillon, Guillaume 18 April 2019
In Electronic Warfare, radar signal identification is a major asset for tactical decision making in military operations. By providing information about the presence of threats, classification and clustering of radar signals play a crucial role: they ensure that countermeasures against those threats are well chosen, and they enable the detection of unknown radar signals so that databases can be updated. Most of the time, Electronic Support Measures systems receive mixtures of signals from different radar emitters in the electromagnetic environment. Hence a radar signal, described by a pulse-to-pulse modulation pattern, is often only partially observed because of missing measurements and measurement errors. The identification process relies on statistical analysis of the basic measurable parameters of a radar signal, which constitute both quantitative and qualitative data. Many general and practical approaches based on data fusion and machine learning have been developed; they traditionally proceed through feature extraction, dimensionality reduction, and classification or clustering. However, these algorithms cannot handle missing data, so imputation methods are required before they can be used. The main objective of this work is therefore to define a classification and clustering framework that handles both outliers and missing values for any type of data. An approach based on mixture models is developed, since mixture models provide a mathematically grounded and flexible framework for a wide variety of classification and clustering requirements. The proposed approach introduces latent variables that reduce the sensitivity of the model to outliers and allow a less restrictive modelling of missing data. A Bayesian treatment is adopted for model learning, supervised classification and clustering. Because the joint posterior distribution of latent variables and parameters is intractable, inference is carried out through a variational Bayesian approximation. Numerical experiments on synthetic and real data show that the proposed method provides more accurate results than standard algorithms.
Signal processing; Bayesian methods; Uncertainty; Radar emitter; Classification; Clustering; Outliers; Missing data; Mixture models
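The following is a minimal sketch of why mixture models accommodate missing measurements, not the variational Bayesian method of the thesis. It assumes Gaussian components with independent features (an assumption made here for simplicity), so a missing coordinate can be integrated out simply by skipping it. The function names and the toy parameters are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def responsibilities(x, weights, means, variances):
    """Posterior class probabilities for x; None marks a missing entry."""
    log_post = []
    for w, mu, var in zip(weights, means, variances):
        lp = math.log(w)
        for xi, m, v in zip(x, mu, var):
            if xi is not None:  # missing features are marginalised out
                lp += log_gauss(xi, m, v)
        log_post.append(lp)
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Two hypothetical emitter classes described by two features each.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

full = responsibilities([4.8, 5.1], weights, means, variances)
partial = responsibilities([4.8, None], weights, means, variances)
none_obs = responsibilities([None, None], weights, means, variances)
```

With one feature missing, the observation is still assigned to the second class; with nothing observed, the posterior falls back to the prior weights. No imputation step is needed.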
182	Robust methods in multivariate time series / Méthodes robustes dans les séries chronologiques multivariées / Métodos robustos em séries temporais multivariadas Aranda Cotta, Higor Henrique 22 August 2019
This manuscript proposes new robust estimation methods for the autocovariance and autocorrelation matrix functions of stationary multivariate time series that may contain random additive outliers. These functions play an important role in the identification and estimation of time series model parameters. We first propose new estimators of the autocovariance and autocorrelation matrix functions, constructed with a spectral approach based on the periodogram matrix, which is the natural estimator of the spectral density matrix. Like the classical estimators of the autocovariance and autocorrelation matrix functions, these estimators are affected by aberrant observations. Any identification or estimation procedure that uses them is therefore directly affected, which leads to erroneous conclusions. To mitigate this problem, we propose the use of robust statistical techniques to create estimators that are resistant to random aberrant observations.
As a first step, we propose new estimators of the autocovariance and autocorrelation functions of univariate time series. The time and frequency domains are linked by the relationship between the autocovariance function and the spectral density. Because the periodogram is sensitive to aberrant data, we obtain a robust estimator by replacing it with the $M$-periodogram. The $M$-periodogram is obtained by replacing the Fourier coefficients of the periodogram, computed by standard least squares regression, with those computed by robust $M$-regression. The asymptotic properties of the estimators are established. Their performance is studied by means of numerical simulations for different sample sizes and different contamination scenarios. The empirical results indicate that the proposed methods give values close to those of the classical autocorrelation function when the data are not contaminated, and that they are resistant to different contamination scenarios. Thus, the estimators proposed in this thesis are alternative methods that can be used for time series with or without outliers.
The estimators obtained for univariate time series are then extended to the multivariate case. This extension is simplified by the fact that the calculation of the cross-periodogram involves only the Fourier coefficients of each univariate component of the series. The $M$-periodogram matrix thus appears as a robust alternative to the periodogram matrix for building robust estimators of the autocovariance and autocorrelation matrix functions. The asymptotic properties are studied and numerical experiments are performed. As an example of an application with real data, we use the proposed functions to fit an autoregressive model by the Yule-Walker method to PM$_{10}$ pollution data collected in the Vitória region, Brazil.
Finally, the robust estimation of the number of factors in high-dimensional factor models is considered in order to reduce the dimensionality. It is well known that random additive outliers affect the covariance and correlation matrices, and that techniques which depend on the calculation of their eigenvalues and eigenvectors, such as principal component analysis and factor analysis, are affected as well. Thus, in the presence of outliers, the information criteria proposed by Bai & Ng (2002) tend to overestimate the number of factors. To alleviate this problem, we propose to replace the standard covariance matrix with the robust covariance matrix proposed in this manuscript. Our Monte Carlo simulations show that, in the absence of contamination, the standard and robust methods are equivalent. In the presence of outliers, the number of estimated factors increases with the non-robust methods while it remains the same with the robust methods. As an application with real data, we study PM$_{10}$ pollutant concentrations measured in the Île-de-France region of France.
Multivariate time series; Robustness; Outliers; Time domain; Frequency domain
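The idea of the $M$-periodogram can be sketched for a single univariate series as follows. At a Fourier frequency $\omega$, least squares regression of $x_t$ on $\cos(\omega t)$ and $\sin(\omega t)$ gives coefficients $(a, b)$ with classical periodogram $I(\omega) = n(a^2 + b^2)/(8\pi)$; the robust version swaps in coefficients from an $M$-regression. The sketch assumes a Huber loss fitted by iteratively reweighted least squares with a median-based scale; the abstract does not specify these choices, so they, the tuning constant 1.345, and all function names are assumptions.

```python
import math
import random
import statistics

def fourier_coeffs(x, omega, weights=None):
    """Weighted least squares fit of x_t on cos(omega t) and sin(omega t)."""
    n = len(x)
    w = weights or [1.0] * n
    c = [math.cos(omega * t) for t in range(n)]
    s = [math.sin(omega * t) for t in range(n)]
    scc = sum(wi * ci * ci for wi, ci in zip(w, c))
    sss = sum(wi * si * si for wi, si in zip(w, s))
    scs = sum(wi * ci * si for wi, ci, si in zip(w, c, s))
    scx = sum(wi * ci * xi for wi, ci, xi in zip(w, c, x))
    ssx = sum(wi * si * xi for wi, si, xi in zip(w, s, x))
    det = scc * sss - scs * scs  # 2x2 normal equations solved in closed form
    a = (scx * sss - ssx * scs) / det
    b = (ssx * scc - scx * scs) / det
    return a, b, c, s

def m_fourier_coeffs(x, omega, k=1.345, iters=50):
    """Huber M-regression via iteratively reweighted least squares."""
    a, b, c, s = fourier_coeffs(x, omega)
    for _ in range(iters):
        res = [xi - a * ci - b * si for xi, ci, si in zip(x, c, s)]
        scale = 1.4826 * statistics.median(abs(r) for r in res)
        if scale == 0:
            break
        # Huber weights: 1 for small residuals, downweighted beyond k * scale
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in res]
        a, b, _, _ = fourier_coeffs(x, omega, w)
    return a, b

def periodogram(x, omega, robust=False):
    """Classical or M-periodogram ordinate at a Fourier frequency 0 < omega < pi."""
    if robust:
        a, b = m_fourier_coeffs(x, omega)
    else:
        a, b, _, _ = fourier_coeffs(x, omega)
    return len(x) * (a * a + b * b) / (8 * math.pi)

random.seed(0)
n = 128
signal_freq = 2 * math.pi * 10 / n
other_freq = 2 * math.pi * 30 / n
clean = [2 * math.cos(signal_freq * t) + random.gauss(0, 0.1) for t in range(n)]
dirty = list(clean)
dirty[40] += 50.0  # a single additive outlier

ls_clean = periodogram(clean, signal_freq)
m_clean = periodogram(clean, signal_freq, robust=True)
ls_dirty = periodogram(dirty, other_freq)
m_dirty = periodogram(dirty, other_freq, robust=True)
```

On the clean series the two ordinates nearly coincide at the signal frequency. On the contaminated series the classical ordinate at an unrelated frequency is inflated by the outlier, while the robust ordinate stays small.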
183	Inférence robuste à la présence des valeurs aberrantes dans les enquêtes Dongmo Jiongo, Valéry 12 1900
This thesis comprises three articles, one published and two in preparation. It focuses on the treatment of representative outliers in two important aspects of surveys: small area estimation and imputation for item non-response.
Concerning small area estimation, robust estimators in unit-level models have been studied. Sinha & Rao (2009) proposed estimation procedures for small area means based on robustified maximum likelihood estimates of the parameters of the linear mixed model and robust empirical best linear unbiased predictors of the random effects of the underlying model. Their robust methods for estimating area means are of the plug-in type and, in view of the results of Chambers (1986), the resulting robust estimators may be biased in some situations. Bias-corrected estimators have been proposed by Chambers et al. (2014). In addition, these robust small area estimators have been accompanied by estimators of the mean squared error (MSE). Sinha & Rao (2009) proposed a parametric bootstrap procedure, based on the robust estimates of the parameters of the underlying linear mixed model, to estimate the MSE. Analytical procedures for the estimation of the MSE have been proposed in Chambers et al. (2014). However, their theoretical validity has not been formally established and their empirical performance is not fully satisfactory. Here, we investigate two new approaches for obtaining a robust version of the empirical best linear unbiased predictor: the first relies on the work of Chambers (1986), while the second uses the concept of conditional bias as a measure of the influence of a population unit. These two classes of robust small area estimators also include a correction term for the bias. However, they are both fully bias-corrected, in the sense that the correction term takes into account the potential impact of the other domains on the small area of interest, unlike that of Chambers et al. (2014), which uses only the information in the domain of interest. Under certain conditions, non-negligible bias is expected for the Sinha-Rao method, while the proposed methods exhibit significant bias reduction, controlled by appropriate choices of the influence function and tuning constants. Monte Carlo simulations are conducted, and comparisons are made between the new robust estimators, the Sinha-Rao estimator and the bias-corrected estimator of Chambers et al. (2014). Empirical results suggest that the Sinha-Rao method and the bias-corrected estimator of Chambers et al. (2014) may exhibit a large bias, while the new procedures often perform better in terms of bias and mean squared error.
In addition, we propose a new bootstrap procedure for MSE estimation of robust small area predictors. Unlike existing approaches, we formally prove the asymptotic validity of the proposed bootstrap method. Moreover, the proposed method is semi-parametric, i.e., it does not rely on specific distributional assumptions about the errors and random effects of the unit-level model underlying the small area estimation; it is thus particularly attractive and more widely applicable. We assess the finite-sample performance of our bootstrap estimator through Monte Carlo simulations. The results show that our procedure performs well and outperforms the existing ones. Application of the proposed method is illustrated by analyzing the well-known outlier-contaminated county crop area data of Battese, Harter & Fuller (1988), drawn from North-Central Iowa farms and Landsat satellite images.
Concerning imputation in the presence of item non-response, some single imputation methods have been studied. Deterministic regression imputation, which includes ratio imputation and mean imputation, is often used in surveys. These imputation methods may lead to biased imputed estimators if the imputation model or the non-response model is not properly specified. Recently, doubly robust imputed estimators have been developed; they are unbiased if at least one of the imputation model and the non-response model is correctly specified. However, in the presence of outliers, doubly robust imputed estimators can be very unstable. Using the concept of conditional bias as a measure of influence (Beaumont, Haziza and Ruiz-Gazen, 2013), we propose an outlier-robust version of the doubly robust imputed estimator, which we call a triple robust imputed estimator. The results of simulation studies show that the proposed estimator performs well for an appropriate choice of the tuning constant.
Corrected-bias estimator; Conditional bias; Outliers; Model-based inference; Sampling-based inference; Small-area; Bootstrap; Linear mixed model; Robustness; Imputation
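The two deterministic imputation methods named in the abstract, ratio imputation and mean imputation, can be sketched as below. This is only an illustration of the baseline methods, not of the doubly or triple robust estimators; survey weights are omitted for brevity (an assumption made here), and the function names and data are hypothetical.

```python
def ratio_impute(y, x):
    """Replace each missing y_i (None) by R * x_i, with R = sum(y_r) / sum(x_r)
    computed over the respondents."""
    resp = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    ratio = sum(yi for yi, _ in resp) / sum(xi for _, xi in resp)
    return [yi if yi is not None else ratio * xi for yi, xi in zip(y, x)]

def mean_impute(y):
    """Mean imputation is ratio imputation with a constant auxiliary variable."""
    return ratio_impute(y, [1.0] * len(y))

x = [10.0, 20.0, 30.0, 40.0]   # auxiliary variable, known for every unit
y = [21.0, 39.0, None, 80.0]   # study variable with one item non-response

completed_ratio = ratio_impute(y, x)
completed_mean = mean_impute(y)
```

Because the ratio is a plain sum over respondents, a single extreme respondent value shifts every imputed value, which is the instability that motivates an outlier-robust version.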
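184	變數轉換之穩健迴歸分析 [Robust regression analysis with variable transformation] 張嘉璁 Unknown Date
In traditional linear regression analysis, when the basic assumptions are not satisfied, a transformation of variables can sometimes be considered so that the data conform better to those assumptions. Among the many transformation methods, the power transformation proposed by Box and Cox (1964), the Box-Cox power transformation, is the most commonly used; it can transform certain complex systems into a linear normal model. However, when the data contain outliers, the Box-Cox transformation is affected and is therefore not a robust method. In this thesis, we use the forward search algorithm to obtain the least trimmed squares estimator and, in the process, estimate a robust transformation parameter.
Box-Cox power transformation; Forward search algorithm; Least median of squares; Least trimmed squares; Score statistic; Robust regression; Breakdown point; Outliers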
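Two building blocks mentioned in this abstract have standard definitions that can be written down directly: the Box-Cox power transformation, $(y^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\log y$ for $\lambda = 0$, and the least trimmed squares criterion, the sum of the $h$ smallest squared residuals. The sketch below shows only these definitions; it does not implement the forward search, and the function names are hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transformation of positive data."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def lts_objective(residuals, h):
    """Least trimmed squares criterion: sum of the h smallest squared residuals."""
    return sum(sorted(r * r for r in residuals)[:h])

transformed = box_cox([4.0, 9.0], 0.5)           # square-root type transform
trimmed = lts_objective([1.0, -2.0, 10.0], h=2)  # the residual 10.0 is trimmed
```

Trimming the largest squared residuals is what keeps a few outliers from dominating the fit, and hence from dominating the choice of the transformation parameter.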
185	Sélection de modèles robuste : régression linéaire et algorithme à sauts réversibles Gagnon, Philippe 10 1900
No description available.
Bayesian inference; Markov chain Monte Carlo methods; Outliers; Principal component analysis; Random walk Metropolis algorithm; Robustness; Super heavy-tailed distributions
186	Seguro contra risco de downside de uma carteira: uma proposta híbrida frequentista-Bayesiana com uso de derivativos Pérgola, Gabriel Campos 23 January 2013
Portfolio insurance allows a manager to limit downside risk while still participating in upside markets. The purpose of this dissertation is to introduce a framework for portfolio insurance optimization based on a hybrid frequentist-Bayesian approach that uses derivatives. We obtain the joint distribution of regular returns with a frequentist statistical method, once the outliers have been identified and removed from the data sample. The joint distribution of extreme returns, in turn, is modelled by a Bayesian network whose topology reflects the events that the manager considers critical to the performance of the portfolio. Once the regular and extreme distributions of returns are linked, we simulate future scenarios for the portfolio value. The insurance subportfolio is then optimized with the Differential Evolution algorithm. We demonstrate the framework in a step-by-step example for a long portfolio of stocks in the Bovespa Index (Ibovespa), using market data from 2008 to 2012.
Portfolio insurance; Hybrid methods; Multivariate return models; Outlier identification; Minimum Covariance Determinant; Generalized hyperbolic distribution; Bayesian nets; Simulation; Optimization; Differential Evolution algorithm; Economics; Derivatives (Finance); Algorithms
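The optimizer named in this abstract, Differential Evolution, can be sketched in its common rand/1/bin form as follows. The objective here is a toy quadratic standing in for the insurance cost function, which the abstract does not specify; the control parameters (population size, differential weight, crossover rate) and all names are assumptions for illustration.

```python
import random

def differential_evolution(f, bounds, pop_size=20, f_weight=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise f over a box using the rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # three distinct members, all different from the target i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f_weight * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = f(trial)
            if trial_cost <= cost[i]:  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

def toy_cost(v):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2

best_x, best_cost = differential_evolution(toy_cost, [(-5.0, 5.0), (-5.0, 5.0)])
```

The method needs only objective evaluations, which suits an objective computed from simulated scenarios rather than from a closed-form expression.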
187	異常住宅價格檢測與處理之研究－以個別估價觀點分析 / The study of singular residential price detection and management - with the valuations by appraisers' perspective 高裕政, Kao, Yu Cheng Unknown Date (has links) 國內近年來有許多文獻在進行特徵價格模型預測時，避免樣本中存在異常點會造成模型估計值產生偏差，會使用統計軟體進行異常點檢測，但皆是直接將檢測出的異常點刪除，未加以著墨探究這些異常點的特徵結構、成因及特色等。因此，本研究透過統計檢定方法，探討刪除異常點前後整體樣本的特徵結構變化，並以個別估價觀點加以探討住宅交易樣本異常點的成因與特色，藉此歸納出實價登錄資料未揭露的重要特徵，以及迴歸模型搜尋疑似申報不實案件之可行性。透過敘述統計及樣本結構差異檢定結果發現，異常樣本的離散程度相對原始樣本與正常樣本較大，且經過刪除異常點的正常樣本特徵結構差異程度縮小；異常點的形成可能受到區位變數無法反映實際情況及樣本群聚程度影響，也可能因模型未納入某些重要的特徵變數，而使隱含該變數的樣本被判斷為異常點；異常樣本與正常樣本的成交總價、土地坪數、建物坪數、總樓層、所在樓層及屋齡等變數平均數、變異數及中位數有顯著差異。藉由個案分析結果歸納，可能因異常個案的住宅屬性存在整幢大樓住商混合使用、特殊鄰居、附屬建物占比過高、高總價豪宅產品、都更效益、增建效益、裝潢效益、約定專用空間效益、樓層高度挑高、獨特視野景觀或特殊區位條件；外部環境存在鄰近嫌惡設施或迎毗設施；交易情況存在買方身分特殊之影響，但受限於實價登錄未要求登載並揭露這些特徵，故模型未考量這些因素對價格的影響，使得模型可能將隱含這些特徵的樣本判斷為異常點，並進而影響模型預測結果。另外也發現，實價登錄資料存在登載錯誤及價格申報不實的情況，且可能被模型判斷為異常點。 / Many literatures use statistics-way to detect outliers in preventing any extreme deviation in hedonic price model prediction. Nevertheless, deleting the outliers instead of investigation into the structures, causes and features. Hence, this thesis studies the feature structures variation of the sample before and after deleting the outliers and with the valuations by appraisers’ perspective to inquire into the factors and features of the outliers in residential transactions. Thereby to summarize the significant features that are not disclosed by real price registration and feasibility in searching the possible false declaration of price by regression. Through descriptive statistics and sample structural difference parametric and nonparametric test shows the discreteness level of singular (outliers only) samples is greater than the primary (outliers including) and normal (outliers deleting) samples and the feature structure variation lessened after deleting the outliers in normal samples. The formation of outliers may be influenced by location variable not able to reflect actual circumstances and level of clustering in samples. 
Alternatively, some significant variables may be omitted from the model, so that samples carrying those variables are judged to be outliers. The mean, variance and median of total transaction price, land size, building size, total floors, floor level and house age differ significantly between the singular and normal samples. The case analysis suggests several possible causes: mixed residential and commercial use within a building, peculiar neighbors, a high proportion of accessory building area, luxury houses, urban renewal benefits, building addition benefits, interior decoration benefits, agreed space benefits, high-ceiling benefits, a unique view or location, nearby YIMBY or NIMBY facilities, and a special relationship between the buyer and seller. Because real price registration does not disclose these features, the model does not take them into consideration; it may therefore judge samples with these features to be outliers, which in turn affects its predictions. Registration errors and false price declarations also occur and may likewise be judged as outliers. 異常點特徵價格模型個別估價實價登錄不動產資訊透明住宅價格申報不實 Outliers Hedonic price model Appraiser Real price registration Transparency of real estate information False declaration of house price
188	AUTOMATED OPTIMAL FORECASTING OF UNIVARIATE MONITORING PROCESSES : Employing a novel optimal forecast methodology to define four classes of forecast approaches and testing them on real-life monitoring processes Razroev, Stanislav January 2019 (has links) This work explores practical one-step-ahead forecasting of structurally changing data, an unstable behaviour that real-life data connected to human activity often exhibit. This setting can be characterized as a monitoring process. Forecast models, methods and approaches range from simple and computationally "cheap" to very sophisticated and computationally "expensive". Moreover, different forecast methods handle different data patterns and structural changes differently: for particular data types or data intervals some forecast methods are better than others, which is usually not known beforehand. This raises a question: "Can one design a forecast procedure that effectively and optimally switches between various forecast methods, adapting their usage to changes in the incoming data flow?" The thesis answers this question by introducing an optimality concept that allows optimal switching between simultaneously executed forecast methods, thus "tailoring" the forecast methods to the changes in the data. It also shows how another forecast approach, combinational forecasting, in which forecast methods are combined using a weighted average, can be utilized by the optimality principle and can therefore benefit from it. Thus, four classes of forecast results can be considered and compared: basic forecast methods, basic optimality, combinational forecasting, and combinational optimality. The thesis shows that, most of the time, optimality performs no worse than, or better than, the best of the forecast methods on which it is based. 
Optimality also reduces the scatter of a multitude of forecast suggestions to a single number or only a few numbers (in a controllable fashion). In addition, optimality gives a lower bound for optimal forecasting: the hypothetically best achievable forecast result. The main conclusion is that the optimality approach makes the traditional way of treating monitoring processes, namely trying to find the single best forecast method for structurally changing data, more or less obsolete. Such a search can still be carried out, of course, but it is best done within the optimality approach as its innate component. All this makes the proposed optimality approach to forecasting a valid "representative" of the broader ensemble approach (which likewise motivated the development of the now popular Ensemble Learning concept as a valid part of the Machine Learning framework). / Denna avhandling syftar till undersöka en praktisk ett-steg-i-taget prediktering av strukturmässigt skiftande data, ett icke-stabilt beteende som verkliga data kopplade till människoaktiviteter ofta demonstrerar. Denna uppsättning kan alltså karakteriseras som övervakningsprocess eller monitoringsprocess. Olika prediktionsmodeller, metoder och tillvägagångssätt kan variera från att vara enkla och "beräkningsbilliga" till sofistikerade och "beräkningsdyra". Olika prediktionsmetoder hanterar dessutom olika mönster eller strukturförändringar i data på olika sätt: för vissa typer av data eller vissa dataintervall är vissa prediktionsmetoder bättre än andra, vilket inte brukar vara känt i förväg. Detta väcker en fråga: "Kan man skapa en predictionsprocedur, som effektivt och på ett optimalt sätt skulle byta mellan olika prediktionsmetoder och för att adaptera dess användning till ändringar i inkommande dataflöde?" 
Avhandlingen svarar på frågan genom att introducera optimalitetskoncept eller optimalitet, något som tillåter ett optimalbyte mellan parallellt utförda prediktionsmetoder, för att på så sätt skräddarsy prediktionsmetoder till förändringar i data. Det visas också, hur ett annat prediktionstillvägagångssätt: kombinationsprediktering, där olika prediktionsmetoder kombineras med hjälp av viktat medelvärde, kan utnyttjas av optimalitetsprincipen och därmed få nytta av den. Alltså, fyra klasser av prediktionsresultat kan betraktas och jämföras: basprediktionsmetoder, basoptimalitet, kombinationsprediktering och kombinationsoptimalitet. Denna avhandling visar, att användning av optimalitet ger resultat, där optimaliteten för det mesta inte är sämre eller bättre än den bästa av enskilda prediktionsmetoder, som själva optimaliteten är baserad på. Optimalitet reducerar också spridningen från mängden av olika prediktionsförslag till ett tal eller bara några enstaka tal (på ett kontrollerat sätt). Optimalitet producerar ytterligare en nedre gräns för optimalprediktion: det hypotetiskt bästa uppnåeliga prediktionsresultatet. Huvudslutsatsen är följande: optimalitetstillvägagångssätt gör att andra traditionella sätt att ta hand om övervakningsprocesser blir mer eller mindre föråldrade: att leta bara efter den enda bästa enskilda prediktionsmetoden för data med strukturskift. Sådan sökning kan fortfarande göras, men det är bäst att göra den inom optimalitetstillvägagångssättet, där den ingår som en naturlig komponent. Allt detta gör det föreslagna optimalitetstillvägagångssättetet för prediktionsändamål till en giltig "representant" för det mer allmäna ensembletillvägagångssättet (något som också motiverade utvecklingen av numera populär Ensembleinlärning som en giltig del av Maskininlärning). 
predictions forecasting optimal forecasting forecast classes optimality rules ensemble forecasting state switching combinational forecasting optimality framework exponential smoothing ARIMA ARMA SARIMA Double-Seasonal Holt-Winters time series Wikipedia Wikimedia Wikipedia data Twitter Twitter data electricity data monitoring processes monitoring process monitoring error metrics outliers missing values Mathematics Matematik
189	Statistical Machine Learning in Biomedical Engineering González Cebrián, Alba 15 April 2024 (has links) [ES] Esta tesis, desarrollada bajo una beca de formación de personal investigador de la Universitat Politècnica de València, tiene como objetivo proponer y aplicar metodologías de Statistical Machine Learning en contextos de Ingeniería Biomédica. Este concepto pretende aunar el uso de modelos de aprendizaje automático junto con la búsqueda de comprensión e interpretabilidad clásica del razonamiento estadístico, dando lugar a soluciones tecnológicas de problemas biomédicos que no pasen únicamente por el objetivo de optimizar el desempeño predictivo de los modelos. Para ello, se han dibujado dos objetivos principales que vertebran además el documento: proponer metodologías novedosas dentro del paraguas del Statistical Machine Learning, y aplicar soluciones a problemas biomédicos reales manteniendo esta filosofía en mente. Estos objetivos se han materializado en contribuciones metodológicas para la simulación de valores atípicos y la imputación de datos faltantes en presencia de datos atípicos, y en contribuciones aplicadas a casos reales para la mejora de procesos de atención médica, la mejora en el diagnóstico y pronóstico de enfermedades, y la estandarización de procedimientos de medición en entornos biotecnológicos. Dichas contribuciones se han artículado en capítulos correspondientes a las dos partes principales ya mencionadas. Finalmente, las conclusiones y líneas futuras cierran el documento, recalcando los mensajes principales de las contribuciones de la tesis doctoral en general, y sentando además las bases para líneas futuras que se han dibujado a consecuencia del trabajo realizado a lo largo del doctorado. / [CA] Aquesta tesi, desenvolupada sota una beca de formació de personal investigador de la Universitat Politècnica de València, té com a objectiu proposar i aplicar metodologies de Statistical Machine Learning en contextos d'Enginyeria Biomèdica. 
Aquest concepte pretén unir l'ús de models d'aprenentatge automàtic juntament amb la cerca de comprensió i interpretació clàssica del raonament estadístic, donant lloc a solucions tecnològiques de problemes biomèdics que no passen únicament per l'objectiu d'optimitzar el rendiment predictiu dels models. Per a això, s'han dibuixat dos objectius principals que vertebren a més el document: proposar metodologies noves dins del paraigua del Statistical Machine Learning, i aplicar solucions a problemes biomèdics reals mantenint aquesta filosofia en ment. Aquests objectius s'han materialitzat en contribucions metodològiques per a la simulació de valors atípics i la imputació de dades mancants en presència de valors atípics, i en contribucions aplicades a casos reals per a la millora de processos d'atenció mèdica, la millora en el diagnòstic i pronòstic de malalties, i l'estandardització de procediments de mesurament en entorns biotecnològics. Aquestes contribucions s'han articulat en capítols corresponents a les dues parts principals ja esmentades. Finalment, les conclusions i línies futures tanquen el document, recalant els missatges principals de les contribucions, de la tesi doctoral en general, i assentant a més les bases per a línies futures que s'han dibuixat com a consequència del treball realitzat al llarg del doctorat. / [EN] This thesis, developed under a research personnel formation grant from the Universitat Politècnica de València, aims to propose and apply methodologies of Statistical Machine Learning in Biomedical Engineering contexts. This concept seeks to combine machine learning models with the classic understanding and interpretability of statistical reasoning, resulting in technological solutions for biomedical problems that go beyond solely optimizing the predictive performance of models. 
To achieve this, two main objectives have been outlined, which also structure the document: proposing novel methodologies within the umbrella of Statistical Machine Learning and applying solutions to real biomedical problems while keeping this philosophy in mind. These objectives have materialized into methodological contributions for simulating outliers and imputing missing data in the presence of outliers and applied contributions to real cases for improving healthcare processes, enhancing disease diagnosis and prognosis, and standardizing measurement procedures in biotechnological environments. These contributions are articulated in chapters corresponding to the aforementioned two main parts. Finally, the conclusions and future lines of research conclude the document, reiterating the main messages of the contributions and the overall doctoral thesis and laying the groundwork for future lines of inquiry stemming from the work conducted throughout the doctorate. / González Cebrián, A. (2024). Statistical Machine Learning in Biomedical Engineering [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/203529 Biomedical engineering Statistical Machine Learning Latent Variable-based Models Multivariate Analysis Principal Component Analysis (PCA) Data Science Outliers Missing Data Medicine 4.0 Modelos de Variables Latentes Medicina 4.0 Ciencia de datos Datos anómalos Datos faltantes Análisis multivariante Ingeniería biomédica ESTADISTICA E INVESTIGACION OPERATIVA
190	Nestandardní úlohy v odstranění rozmazání obrazu / Image Deblurring in Demanding Conditions Kotera, Jan January 2020 (has links) Title: Image Deblurring in Demanding Conditions Author: Jan Kotera Department: Institute of Information Theory and Automation, Czech Academy of Sciences Supervisor: Doc. Ing. Filip Šroubek, Ph.D., DSc., Institute of Information Theory and Automation, Czech Academy of Sciences Abstract: Image deblurring is a computer vision task that consists of removing blur from an image; the objective is to recover the sharp image corresponding to the blurred input. If the nature and shape of the blur are unknown and must be estimated from the input image, image deblurring is called blind and naturally presents a more difficult problem. This thesis focuses on two primary topics related to blind image deblurring. In the first part we work with standard image deblurring based on the common convolution blur model and present a method of increasing the robustness of the deblurring to phenomena violating the linear acquisition model, such as intensity clipping caused by sensor saturation in overexposed pixels. If not properly taken care of, these effects significantly decrease the accuracy of the blur estimation and the visual quality of the restored image. Rather than tailoring the deblurring method explicitly for each particular type of acquisition model violation, we present a general approach based on flexible automatic...
