1 |
Mixtures of Skew-t Factor Analyzers
Murray, Paula 11 1900
Model-based clustering allows for the identification of subgroups in a data set through the use of finite mixture models. When applied to high-dimensional microarray data, we can discover groups of genes characterized by their gene expression profiles. In this thesis, a mixture of skew-t factor analyzers is introduced for the clustering of high-dimensional data. Notably, we make use of a version of the skew-t distribution which has not previously appeared in mixture-modelling literature. Allowing a constraint on the factor loading matrix leads to two mixtures of skew-t factor analyzers models. These models are implemented using the alternating expectation-conditional maximization algorithm for parameter estimation with an Aitken's acceleration stopping criterion used to determine convergence. The Bayesian information criterion is used for model selection and the performance of each model is assessed using the adjusted Rand index. The models are applied to both real and simulated data, obtaining clustering results which are equivalent or superior to those of established clustering methods.
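The abstract does not say which form of the Aitken's acceleration stopping criterion is used. As a minimal sketch, assuming the commonly used variant that compares an asymptotic log-likelihood estimate with the current log-likelihood, it could look like this (the function names and the tolerance are illustrative, not taken from the thesis):

```python
def aitken_asymptote(l_prev, l_curr, l_next):
    """Aitken estimate of the limiting log-likelihood from three successive values."""
    denom = l_curr - l_prev
    if denom == 0.0:
        return l_next                      # no change between iterations
    a = (l_next - l_curr) / denom          # Aitken acceleration factor
    if a == 1.0:
        return float("inf")                # linear growth: no finite limit estimate
    return l_curr + (l_next - l_curr) / (1.0 - a)


def aitken_converged(loglik, tol=1e-2):
    """Check convergence from a list of log-likelihood values, one per iteration.

    Assumed rule: stop when 0 <= l_inf - l_curr < tol, where l_inf is the
    Aitken asymptotic estimate. Other published variants compare with l_next.
    """
    if len(loglik) < 3:
        return False
    l_prev, l_curr, l_next = loglik[-3:]
    l_inf = aitken_asymptote(l_prev, l_curr, l_next)
    return 0.0 <= l_inf - l_curr < tol
```

In an estimation loop, the log-likelihood would be appended to a list after every iteration and `aitken_converged` called on that list; for a geometrically converging sequence the asymptotic estimate equals the true limit.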
|
2 |
Topics on Regularization of Parameters in Multivariate Linear Regression
Chen, Lianfu 2011 December 1900
My dissertation mainly focuses on the regularization of parameters in the multivariate linear regression under different assumptions on the distribution of the errors. It consists of two topics where we develop iterative procedures to construct sparse estimators for both the regression coefficient and scale matrices simultaneously, and a third topic where we develop a method for testing if the skewness parameter in the skew-normal distribution is parallel to one of the eigenvectors of the scale matrix.
In the first project, we propose a robust procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for the correlations among the response variables. Robustness to outliers is achieved by using heavy-tailed t distributions for the multivariate response, and shrinkage is introduced by adding L1 penalties on the entries of both the regression coefficient matrix and the precision matrix of the responses to the negative log-likelihood. Taking advantage of the hierarchical representation of a multivariate t distribution as a scale mixture of normal distributions and of the EM algorithm, the optimization problem is solved iteratively: at each EM iteration, suitably modified versions of the multivariate regression with covariance estimation (MRCE) algorithm proposed by Rothman, Levina and Zhu are used. We propose two new optimization algorithms for the penalized likelihood, called MRCEI and MRCEII, which differ from MRCE in the way the tuning parameters for the two matrices are selected. Estimating the degrees of freedom when penalizing the entries of the matrices presents new computational challenges. A simulation study and a real data analysis demonstrate that MRCEII, which selects the tuning parameter of the precision matrix of the multiple responses using the Cp criterion, generally performs best among all methods considered in terms of prediction error, and that MRCEI outperforms the MRCE methods when the regression coefficient matrix is less sparse.
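The scale-mixture representation mentioned above leads to a simple E-step. The following is a minimal sketch of the standard EM weights for a multivariate t regression with known degrees of freedom; it is not the MRCEI or MRCEII algorithm, whose details are not given in the abstract, and the names `Y`, `X`, `B`, `Omega` and `nu` are illustrative assumptions.

```python
import numpy as np


def t_estep_weights(Y, X, B, Omega, nu):
    """E-step weights for y_i | tau_i ~ N(B^T x_i, Sigma / tau_i), tau_i ~ Gamma(nu/2, nu/2).

    Y: (n, q) responses, X: (n, p) predictors, B: (p, q) coefficients,
    Omega: (q, q) precision matrix (inverse of Sigma), nu: degrees of freedom.
    Returns E[tau_i | y_i] = (nu + q) / (nu + delta_i), where delta_i is the
    squared Mahalanobis distance of the i-th residual.
    """
    R = Y - X @ B                                    # (n, q) residuals
    delta = np.einsum("ij,jk,ik->i", R, Omega, R)    # r_i^T Omega r_i for each row
    q = Y.shape[1]
    return (nu + q) / (nu + delta)
```

Observations with large residuals receive small weights, which is the source of the robustness to outliers; in the M-step these weights would rescale each row before a penalized Gaussian fit.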
The second project is motivated by skewness in the data, for which the assumption of symmetrically distributed errors does not hold. We extend the proposed procedure to the case where the errors in the multivariate linear regression follow a multivariate skew-normal or skew-t distribution. Based on convenient representations of the skew-normal and skew-t distributions and on the EM algorithm, we develop an optimization algorithm, called MRST, to iteratively minimize the negative penalized log-likelihood. We also carry out a simulation study to assess the performance of the method and illustrate its application with a real data example.
In the third project, we discuss the asymptotic distributions of the eigenvalues and eigenvectors of the MLE of the scale matrix in a multivariate skew-normal distribution. Based on the likelihood ratio, we propose a statistic for testing whether the skewness vector is proportional to one of the eigenvectors of the scale matrix. Under the alternative, the likelihood is maximized numerically using two different parametrizations of the scale matrix: the Modified Cholesky Decomposition (MCD) and Givens angles. We conduct a simulation study and show that the statistic obtained using the Givens angle parametrization performs well and is more reliable than that obtained using MCD.
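To illustrate what a Givens angle parametrization of a scale matrix can look like, here is a minimal sketch that builds an orthogonal matrix from p(p-1)/2 rotation angles and combines it with positive eigenvalues. The ordering of the rotations and the function names are assumptions made for this example; the dissertation's exact construction is not stated in the abstract.

```python
import numpy as np
from itertools import combinations


def givens_orthogonal(angles, p):
    """Product of plane rotations, one angle per coordinate pair (i, j) with i < j."""
    pairs = list(combinations(range(p), 2))
    if len(angles) != len(pairs):
        raise ValueError("need p * (p - 1) / 2 angles")
    Q = np.eye(p)
    for theta, (i, j) in zip(angles, pairs):
        G = np.eye(p)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i] = c
        G[j, j] = c
        G[i, j] = -s
        G[j, i] = s
        Q = Q @ G            # accumulate the rotation in the (i, j) plane
    return Q


def scale_matrix_from_givens(angles, eigenvalues):
    """Symmetric positive definite matrix Q diag(eigenvalues) Q^T."""
    Q = givens_orthogonal(angles, len(eigenvalues))
    return Q @ np.diag(eigenvalues) @ Q.T
```

Because the angles are unconstrained and the eigenvalues only need to be positive, a numerical optimizer can search over them freely, and the columns of the orthogonal matrix are the eigenvectors to which the skewness vector is compared.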
|
3 |
A Matrix Variate Generalization of the Skew Pearson Type VII and Skew T Distribution
Zheng, Shimin, Gupta, A. K., Liu, Xuefeng 01 January 2012
We define and study multivariate and matrix variate skew Pearson type VII and skew t-distributions. We derive the marginal and conditional distributions, the linear transformation, and the stochastic representations of the multivariate and matrix variate skew Pearson type VII distributions and skew t-distributions. Also, we study the limiting distributions.
|
4 |
MELHORAMENTOS INFERENCIAIS NO MODELO BETA-SKEW-T-EGARCH / INFERENTIAL IMPROVEMENTS OF BETA-SKEW-T-EGARCH MODEL
Muller, Fernanda Maria 25 February 2016
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The Beta-Skew-t-EGARCH model was recently proposed in the literature to model the volatility of financial returns. Inference on the model parameters is based on the maximum likelihood method. The maximum likelihood estimators have good asymptotic properties; however, in finite samples they can be considerably biased. Monte Carlo simulations were used to evaluate the finite-sample performance of the point estimators. The numerical results indicated that the maximum likelihood estimators of some parameters are biased in samples smaller than 3,000. Bootstrap bias correction procedures were therefore considered to obtain more accurate estimators in small samples. The corrected estimators showed smaller percentage relative bias, and forecast quality improved when the model with bias-corrected estimators was used. In addition, we propose a likelihood ratio test to assist in choosing between the Beta-Skew-t-EGARCH model with one or two volatility components. The numerical evaluation of the two-component test showed distorted null rejection rates in sample sizes smaller than or equal to 1,000. To improve the performance of the proposed test in small samples, the bootstrap-based likelihood ratio test and the bootstrap Bartlett correction were considered. The bootstrap-based test exhibited the null rejection rates closest to the nominal values. The evaluation results showed the practical usefulness of the two-component tests. Finally, the two-component tests and the Beta-Skew-t-EGARCH model, together with its corrected versions, were applied to the log-returns of the German stock index.
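The bootstrap bias correction idea can be sketched on a toy problem. The example below assumes a parametric bootstrap and the simple correction "estimate minus estimated bias"; it uses the maximum likelihood estimator of a normal variance rather than the Beta-Skew-t-EGARCH model, and all names are illustrative.

```python
import numpy as np


def bootstrap_bias_correct(theta_hat, estimator, simulate, n_boot=2000, seed=0):
    """Return theta_hat - (mean of bootstrap estimates - theta_hat).

    simulate(theta, rng) draws a synthetic sample from the fitted model and
    estimator(sample) re-estimates the parameter on that sample.
    """
    rng = np.random.default_rng(seed)
    boot = np.array([estimator(simulate(theta_hat, rng)) for _ in range(n_boot)])
    bias_hat = boot.mean(axis=0) - theta_hat
    return theta_hat - bias_hat


# Toy illustration: the variance MLE (divisor n) is biased downward in small samples.
n = 10

def var_mle(x):
    return np.mean((x - x.mean()) ** 2)

def simulate_normal(var, rng):
    return rng.normal(0.0, np.sqrt(var), size=n)

data = np.random.default_rng(1).normal(0.0, 2.0, size=n)
theta_hat = var_mle(data)
theta_bc = bootstrap_bias_correct(theta_hat, var_mle, simulate_normal)
```

For a volatility model, `simulate` would generate a return series from the fitted model and `estimator` would refit it by maximum likelihood, which is far more expensive but follows the same pattern.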
|
6 |
Multivariate Skew-t Distributions in Econometrics and Environmetrics
Marchenko, Yulia V. 2010 December 1900
This dissertation is composed of three articles describing novel approaches for
analysis and modeling using multivariate skew-normal and skew-t distributions in
econometrics and environmetrics.
In the first article we introduce the Heckman selection-t model. Sample selection
arises often as a result of the partial observability of the outcome of interest in
a study. In the presence of sample selection, the observed data do not represent a
random sample from the population, even after controlling for explanatory variables.
Heckman introduced a sample-selection model to analyze such data and proposed a
full maximum likelihood estimation method under the assumption of normality. The
method was criticized in the literature because of its sensitivity to the normality assumption.
In practice, data, such as income or expenditure data, often violate the
normality assumption because of heavier tails. We first establish a new link between
sample-selection models and recently studied families of extended skew-elliptical distributions.
This then allows us to introduce a selection-t model, which models the
error distribution using a Student’s t distribution. We study its properties and investigate
the finite-sample performance of the maximum likelihood estimators for
this model. We compare the performance of the selection-t model to the Heckman
selection model and apply it to analyze ambulatory expenditures.
In the second article we introduce a family of multivariate log-skew-elliptical distributions,
extending the list of multivariate distributions with positive support. We
investigate their probabilistic properties such as stochastic representations, marginal
and conditional distributions, and existence of moments, as well as inferential properties.
We demonstrate, for example, that as for the log-t distribution, the positive
moments of the log-skew-t distribution do not exist. Our emphasis is on two special
cases, the log-skew-normal and log-skew-t distributions, which we use to analyze U.S.
precipitation data.
Many commonly used statistical methods assume that data are normally distributed.
This assumption is often violated in practice, which prompted the development
of more flexible distributions. In the third article we describe two such multivariate
distributions, the skew-normal and the skew-t, and present commands for
fitting univariate and multivariate skew-normal and skew-t regressions in the statistical
software package Stata.
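As a small illustration of the skew-normal and skew-t distributions discussed in this abstract, the sketch below draws univariate samples through a stochastic representation. It assumes the Azzalini-type parameterization with shape parameter `alpha`; whether the dissertation uses exactly this form is not stated, and the function names are illustrative.

```python
import numpy as np


def sample_skew_normal(alpha, size, rng):
    """Draw from SN(alpha) via Z = delta * |U0| + sqrt(1 - delta^2) * U1."""
    delta = alpha / np.sqrt(1.0 + alpha ** 2)
    u0 = np.abs(rng.standard_normal(size))   # half-normal component creates skewness
    u1 = rng.standard_normal(size)
    return delta * u0 + np.sqrt(1.0 - delta ** 2) * u1


def sample_skew_t(alpha, nu, size, rng):
    """Draw from a skew-t by dividing a skew-normal draw by sqrt(chi2_nu / nu)."""
    z = sample_skew_normal(alpha, size, rng)
    w = rng.chisquare(nu, size)
    return z / np.sqrt(w / nu)
```

Setting `alpha = 0` recovers the normal and Student's t distributions, and exponentiating the draws gives samples from the corresponding log-skew-normal and log-skew-t distributions with positive support.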
|
7 |
不對稱分配於風險值之應用 - 以台灣股市為例 / An application of asymmetric distribution in value at risk - taking Taiwan stock market as an example
沈之元, Shen, Chih-Yuan Unknown Date
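This thesis fits an AR(3)-GJR-GARCH(1,1) model to the Taiwan weighted stock index, with the white-noise term assumed to follow one of seven distributions: normal, skew-normal, Student t, skew-t, EPD, SEPD, and AEPD. It focuses on two parts: (1) a comparison of the Student t family and the EPD family in terms of model fit and value-at-risk (VaR) estimation; and (2) a division of the VaR forecasting period into a low-volatility interval and a high-volatility interval, comparing how the different distributions forecast VaR in each.
The empirical analysis shows that, whether only kurtosis is considered (the t and EPD distributions) or a parameter affecting skewness is added (the skew-t and SEPD distributions), the t family fits better than the EPD family. The AEPD distribution, which additionally allows the two tails to differ in thickness, gives the best fit of the seven distributions.
For VaR estimation in the low-volatility interval, the normal distribution and the heavy-tailed distributions all pass the backtests, so adopting a heavy-tailed distribution brings little benefit. In the high-volatility interval, neither the normal distribution nor the heavy-tailed distributions pass all of the left-tail VaR backtests, but the AEPD distribution still performs best. Finally, a comparison of loss functions shows that the AEPD distribution is best for left-tail VaR estimation, while the right-tail results are not consistent. We therefore consider the AEPD distribution a useful tool for risk management.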
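The conditional variance part of the GJR-GARCH(1,1) model named in this abstract can be sketched as follows. This is the textbook recursion applied to a residual series (the AR(3) mean equation is omitted); the parameter names and the choice of the sample variance as the starting value are assumptions for illustration.

```python
import numpy as np


def gjr_garch_variance(eps, omega, alpha, gamma, beta, sigma2_0=None):
    """Conditional variances of a GJR-GARCH(1,1) model.

    sigma2[t] = omega + (alpha + gamma * 1[eps[t-1] < 0]) * eps[t-1]**2
                + beta * sigma2[t-1]
    Negative shocks get the extra weight gamma (the leverage effect).
    """
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty_like(eps)
    # Assumed initialization: sample variance unless a starting value is given.
    sigma2[0] = eps.var() if sigma2_0 is None else sigma2_0
    for t in range(1, len(eps)):
        leverage = gamma if eps[t - 1] < 0 else 0.0
        sigma2[t] = omega + (alpha + leverage) * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2
```

A one-step-ahead VaR forecast then multiplies the square root of the conditional variance by a quantile of the assumed innovation distribution, which is where the choice among the seven distributions enters.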
|
8 |
Some extensions in measurement error models / Algumas extensões em modelos com erros de medição
Tomaya, Lorena Yanet Cáceres 14 December 2018
In this dissertation, we present three different contributions to measurement error models (MEMs). Initially, we carry out maximum penalized likelihood inference in MEMs under the normality assumption. The methodology is based on the method proposed by Firth (1993), which can be used to improve some asymptotic properties of the maximum likelihood estimators. In the second contribution, we develop two new estimation methods based on generalized fiducial inference for the precision parameters and the variability product under the Grubbs model in the two-instrument case. One method is based on a fiducial generalized pivotal quantity and the other is built on the method of the generalized fiducial distribution. Comparisons with two existing approaches are reported. Finally, we study inference in a heteroscedastic MEM with known error variances. Instead of the normal distribution for the random components, we develop a model that assumes a skew-t distribution for the true covariate and a centered Student's t distribution for the error terms. The proposed model makes it possible to accommodate skewness and heavy-tailedness in the data, while the degrees of freedom of the distributions can be different. We use the maximum likelihood method to estimate the model parameters and compute the estimates via an EM-type algorithm. All proposed methodologies are assessed numerically through simulation studies and illustrated with real datasets extracted from the literature.
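For reference, the penalized likelihood associated with Firth (1993) is usually written in the following form, where $\ell(\theta)$ is the log-likelihood and $I(\theta)$ the Fisher information matrix. This is the commonly cited general expression, given here as an assumption about the starting point; the specific form derived for MEMs in the dissertation is not stated in the abstract.

```latex
\ell^{*}(\theta) \;=\; \ell(\theta) \;+\; \tfrac{1}{2}\,\log\bigl|I(\theta)\bigr|
```

The maximum penalized likelihood estimate maximizes $\ell^{*}(\theta)$ instead of $\ell(\theta)$; the penalty term is the logarithm of the Jeffreys invariant prior.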
|
9 |
Diagnóstico de influência bayesiano em modelos de regressão da família t-assimétrica / Bayesian influence diagnostic in skew-t family linear regression models
Silva, Diego Wesllen da 05 May 2017
The linear regression model with errors in the skew-t family, which includes the normal, Student-t and skew-normal distributions as particular cases, has been considered a robust alternative to the normal model. To conclude which model is in fact more robust, it is important to have a method both to identify an observation as an outlier and to assess the influence of this observation on the estimates. In Bayesian regression models, one of the best-known measures for identifying an outlier is the conditional predictive ordinate (CPO). We analyze the influence of these observations on the estimates both globally, that is, on the complete parameter vector of the model, and marginally, on the regression parameters only. We consider the L1 norm and the Kullback-Leibler divergence as measures of the influence of the observations on the parameter estimates. Using the Bayesian approach, we find the full conditional distributions of all the models for use in the Gibbs sampler, thus obtaining samples from the posterior distribution of the parameters. These samples are used to calculate the CPO and the divergence measures studied. The major contribution of this work is to present the global and marginal influence measures calculated for the Student-t, skew-normal and skew-t models. In the application to original and contaminated real data, we observed that, in general, the Student-t model is a robust alternative to the normal model. However, the skew-t model is not, in general, a robust alternative to the normal model. The robustification capability of the skew-t model is directly linked to the position of the residual of the outlier relative to the distribution of the residuals.
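The CPO can be estimated from posterior draws with the usual harmonic-mean identity, CPO_i = 1 / E[1 / f(y_i | theta)], where the expectation is over the posterior. The sketch below assumes the pointwise log-likelihoods have already been evaluated at each Gibbs draw; the function name and array layout are illustrative, not taken from the thesis.

```python
import numpy as np


def log_cpo(loglik):
    """Log CPO for each observation from an (M, n) array of log-likelihoods.

    loglik[m, i] = log f(y_i | theta^(m)) for posterior draw m.
    Uses a log-sum-exp step so that very small densities do not underflow.
    """
    loglik = np.asarray(loglik, dtype=float)
    n_draws = loglik.shape[0]
    neg = -loglik
    mx = neg.max(axis=0)
    # log of the posterior mean of 1 / f(y_i | theta)
    log_mean_inv = mx + np.log(np.exp(neg - mx).sum(axis=0)) - np.log(n_draws)
    return -log_mean_inv
```

Observations with unusually low CPO values are poorly predicted by the model fitted to the remaining data and are the natural candidates for the influence analysis described above.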
|