Global ETD Search

51	Daugiamačių Gauso skirstinių mišinio statistinė analizė, taikant duomenų projektavimą / The Projection-based Statistical Analysis of the Multivariate Gaussian Distribution Mixture Kavaliauskas, Mindaugas 21 January 2005 Problem of the dissertation. Gaussian random variables are very common in practice: if a random variable depends on many additive factors then, by the Central Limit Theorem (under suitable conditions), their sum is approximately Gaussian. If the observed random variable belongs to one of several classes, it follows a Gaussian mixture model. Mixtures of Gaussian distributions arise in many fields, including biology, medicine, astronomy and military science. The most important statistical problems are mixture identification and data clustering. For high-dimensional data these tasks are not completely solved. The dissertation proposes and analyses new methods for estimating the parameters of the multivariate Gaussian mixture model and for clustering data. Because these problems are much easier to solve in the univariate case, a projection-based approach is used. The aim of the dissertation. The aim of this work is to develop constructive algorithms for distribution analysis and clustering of data from a Gaussian mixture model. Mathematics Clustering EM algorithm Gaussian distribution mixture Projection pursuit Projection
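The projection idea in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the fixed projection direction, the quartile-based initialisation, the two-component restriction and the synthetic data are all assumptions made for illustration. The sketch projects two-dimensional points onto one direction and then runs a plain EM fit of a univariate two-component Gaussian mixture on the projected values.

```python
import math
import random


def project(points, direction):
    """Project each multivariate point onto the normalised direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    return [sum(x * u for x, u in zip(p, unit)) for p in points]


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(xs, iters=100):
    """Plain EM for a univariate two-component Gaussian mixture."""
    ordered = sorted(xs)
    # Crude initialisation (an assumption): quartiles as starting means.
    means = [ordered[len(xs) // 4], ordered[3 * len(xs) // 4]]
    centre = sum(xs) / len(xs)
    spread = sum((x - centre) ** 2 for x in xs) / len(xs)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior probability that each value came from component 0.
        resp = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], variances[0])
            p1 = weights[1] * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: re-estimate weight, mean and variance of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1 - v for v in resp]
            total = sum(r)
            weights[k] = total / len(xs)
            means[k] = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, xs)) / total
            variances[k] = max(var, 1e-6)  # guard against a collapsing component
    return weights, means, variances


# Synthetic data (hypothetical): two well-separated 2-D Gaussian clusters.
rng = random.Random(42)
cluster_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
cluster_b = [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(300)]
projected = project(cluster_a + cluster_b, direction=(1.0, 1.0))
weights, means, variances = em_two_gaussians(projected)
print(weights, means, variances)
```

With this direction the two cluster centres project to 0 and 10/√2, so the fitted means should land near those values. Choosing a good direction is the hard part and is outside the scope of this sketch.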
53	Econometric Models of Crop Yields: Two Essays Tolhurst, Tor 17 May 2013 This thesis is an investigation of econometric crop yield models divided into two essays. In the first essay, I propose estimating a single heteroscedasticity coefficient for all counties within a crop-reporting district by pooling county-level crop yield data in a two-stage estimation process. In the context of crop insurance, where heteroscedasticity has significant economic implications, I demonstrate that the pooling approach provides economically and statistically significant improvements in rating crop insurance contracts over contemporary methods. In the second essay, I propose a new method for measuring the rate of technological change in crop yields. To date the agricultural economics literature has measured technological change exclusively at the mean; in contrast, the proposed model can measure the rate of technological change in endogenously-defined yield subpopulations. I find evidence of different rates of technological change in yield subpopulations, which leads to interesting questions about the effect of technological change on agricultural production. / Ontario Ministry of Agriculture and Food
54	Topics on Regularization of Parameters in Multivariate Linear Regression Chen, Lianfu December 2011 My dissertation mainly focuses on the regularization of parameters in multivariate linear regression under different assumptions on the distribution of the errors. It consists of two topics where we develop iterative procedures to construct sparse estimators for both the regression coefficient and scale matrices simultaneously, and a third topic where we develop a method for testing whether the skewness parameter in the skew-normal distribution is parallel to one of the eigenvectors of the scale matrix. In the first project, we propose a robust procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for the correlations of the response variables. Robustness to outliers is achieved using heavy-tailed t distributions for the multivariate response, and shrinkage is introduced by adding to the negative log-likelihood l1 penalties on the entries of both the regression coefficient matrix and the precision matrix of the responses. Taking advantage of the hierarchical representation of a multivariate t distribution as a scale mixture of normal distributions and of the EM algorithm, the optimization problem is solved iteratively, where at each EM iteration suitably modified multivariate regression with covariance estimation (MRCE) algorithms proposed by Rothman, Levina and Zhu are used. We propose two new optimization algorithms for the penalized likelihood, called MRCEI and MRCEII, which differ from MRCE in the way that the tuning parameters for the two matrices are selected. Estimating the degrees of freedom when penalizing the entries of the matrices presents new computational challenges. 
A simulation study and real data analysis demonstrate that the MRCEII, which selects the tuning parameter of the precision matrix of the multiple responses using the Cp criterion, generally does the best among all methods considered in terms of the prediction error, and MRCEI outperforms the MRCE methods when the regression coefficient matrix is less sparse. The second project is motivated by the existence of the skewness in the data for which the symmetric distribution assumption on the errors does not hold. We extend the procedure we have proposed to the case where the errors in the multivariate linear regression follow a multivariate skew-normal or skew-t distribution. Based on the convenient representation of skew-normal and skew-t as well as the EM algorithm, we develop an optimization algorithm, called MRST, to iteratively minimize the negative penalized log-likelihood. We also carry out a simulation study to assess the performance of the method and illustrate its application with one real data example. In the third project, we discuss the asymptotic distributions of the eigenvalues and eigenvectors for the MLE of the scale matrix in a multivariate skew-normal distribution. We propose a statistic for testing whether the skewness vector is proportional to one of the eigenvectors of the scale matrix based on the likelihood ratio. Under the alternative, the likelihood is maximized numerically with two different ways of parametrization for the scale matrix: Modified Cholesky Decomposition (MCD) and Givens Angle. We conduct a simulation study and show that the statistic obtained using Givens Angle parametrization performs well and is more reliable than that obtained using MCD. Eigenvector EM algorithm Lasso regression Regularization.
55	Session Clustering Using Mixtures of Proportional Hazards Models Mair, Patrick, Hudec, Marcus January 2008 Starting from classical Weibull mixture models, we propose a framework for clustering survival data with various proportionality restrictions imposed. By introducing mixtures of Weibull proportional hazards models on a multivariate data set, a parametric cluster approach based on the EM algorithm is carried out. The problem of non-response in the data is considered. The application example is a real-life data set stemming from the analysis of an eCommerce application that operates world-wide. Sessions are clustered according to the dwell times a user spends on certain page areas. The solution allows the navigation behavior to be interpreted in terms of survival and hazard functions. A software implementation in the form of an R package is provided. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
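The survival and hazard functions mentioned in the abstract above can be written down for a plain Weibull mixture. The following is a minimal sketch, not the authors' R package: it ignores covariates and the proportionality restrictions, and the two components ("short" and "long" dwell times), their parameters and the equal weights are hypothetical. It computes the Weibull survival and hazard functions and the posterior probability that a dwell time belongs to each component, which is the quantity an EM E-step would use.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_pdf(t, shape, scale):
    """Density as hazard times survival."""
    return weibull_hazard(t, shape, scale) * weibull_survival(t, shape, scale)


def posterior(t, weights, params):
    """Posterior component probabilities for one dwell time t."""
    dens = [w * weibull_pdf(t, k, lam) for w, (k, lam) in zip(weights, params)]
    total = sum(dens)
    return [d / total for d in dens]


# Hypothetical components: (shape, scale in seconds).
params = [(1.0, 5.0), (2.0, 60.0)]   # "short" and "long" dwell-time clusters
weights = [0.5, 0.5]

post_short_visit = posterior(2.0, weights, params)
post_long_visit = posterior(80.0, weights, params)
print(post_short_visit, post_long_visit)
```

A shape of 1 gives a constant hazard (the exponential case), while a shape above 1 gives a hazard that grows with dwell time; this is what makes the fitted clusters interpretable in hazard terms.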
56	Contaminated Chi-square Modeling and Its Application in Microarray Data Analysis Zhou, Feng 01 January 2014 Mixture modeling has numerous applications. One of particular interest is microarray data analysis. My dissertation research focuses on Contaminated Chi-Square (CCS) modeling and its application to microarray data. A moment-based method and two likelihood-based methods, the Modified Likelihood Ratio Test (MLRT) and the Expectation-Maximization (EM) test, are developed for testing the omnibus null hypothesis of no contamination of a central chi-square distribution by a non-central chi-square distribution. When the omnibus null hypothesis is rejected, we further develop the moment-based test and the EM test for testing an extra component added to the Contaminated Chi-Square model (CCS+EC). The moment-based approach is simple, and it needs no re-sampling or random field theory to obtain critical values. When the statistical models are complicated, such as large mixtures of dimensional distributions, the MLRT and EM test may have better power than moment-based approaches, and the MLRT and EM tests developed herein enjoy an elegant asymptotic theory. Contaminated Chi-Square Model EM Algorithm MLRT Microarray Mixture Model Microarrays
57	Uma metodologia para a detecção de mudanças em imagens multitemporais de sensoriamento remoto empregando Support Vector Machines / A methodology for change detection in multitemporal remote sensing images using Support Vector Machines Ferreira, Rute Henrique da Silva January 2014 In this thesis, we investigate a supervised approach to change detection in multitemporal remote sensing image data that applies Support Vector Machines (SVM) with polynomial and Gaussian (RBF) kernels. The methodology is based on the difference of the fraction images produced for each of the two dates. In natural scenes, the differences in the soil and vegetation fractions between the two dates tend to be distributed symmetrically around the origin. This fact can be used to model two multivariate normal distributions: change and no-change. The Expectation-Maximization (EM) algorithm is implemented to estimate the parameters (mean vector, covariance matrix and a priori probability) associated with these two distributions. Random samples are drawn from these distributions and used to train the SVM classifier in this supervised approach. The proposed methodology is tested on multitemporal sets of multispectral Landsat TM images covering the same scene on two different dates. The results are compared with other procedures, including previous work, a synthetic data set and the One-Class SVM. Remote sensing Change detection Kernel methods Fraction-images EM algorithm
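The sample-generation step described in the abstract above (drawing labeled random samples from the two estimated normal distributions before training the classifier) can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the means, covariance matrices and sample size are hypothetical stand-ins for EM estimates, the features are two-dimensional (soil and vegetation fraction differences), and the SVM training itself is left out to keep the sketch dependency-free.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular factor L of a 2x2 covariance matrix, cov = L L^T."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def draw_samples(mean, cov, n, rng):
    """Draw n points from a bivariate normal via mean + L z, z ~ N(0, I)."""
    l11, l21, l22 = cholesky_2x2(cov)
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return out


# Hypothetical EM estimates: both classes centred at the origin,
# with the change class far more spread out than the no-change class.
no_change = {"mean": (0.0, 0.0), "cov": [[0.01, 0.002], [0.002, 0.01]]}
change = {"mean": (0.0, 0.0), "cov": [[0.09, -0.03], [-0.03, 0.09]]}

rng = random.Random(7)
n_per_class = 2000
training_set = (
    [(x, 0) for x in draw_samples(no_change["mean"], no_change["cov"], n_per_class, rng)]
    + [(x, 1) for x in draw_samples(change["mean"], change["cov"], n_per_class, rng)]
)
print(len(training_set), training_set[0])
```

Each element of `training_set` is a `(features, label)` pair with label 0 for no-change and 1 for change, which is the form a kernel classifier would take as input.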
58	Uma abordagem para a detecção de mudanças em imagens multitemporais de sensoriamento remoto empregando Support Vector Machines com uma nova métrica de pertinência / An approach to change detection in multitemporal remote sensing images using Support Vector Machines with a new pertinence metric Angelo, Neide Pizzolato January 2014 This thesis investigates an unsupervised approach to the problem of change detection in multispectral and multitemporal remote sensing images using Support Vector Machines (SVM) with polynomial and RBF kernels and a new pixel pertinence metric. The methodology is based on the difference of the fraction images produced for each date. In images of natural scenes, this difference in the fractions of bare soil and vegetation tends to have a symmetric distribution close to the origin. This feature can be used to model the multivariate normal distributions of the change and no-change classes. The Expectation-Maximization (EM) algorithm is implemented to estimate the parameters (mean vector, covariance matrix and a priori probability) associated with these two distributions. Random, normally distributed samples are then drawn from these distributions and labeled according to their pertinence to one of the classes. These samples are used to train the SVM classifier, and a new pixel pertinence metric is estimated from the resulting classification. The proposed methodology is tested on multitemporal sets of multispectral Landsat-TM images that cover the same scene on two different dates. The proposed pertinence metric is validated with controlled test samples obtained by the Change Vector Analysis technique. In addition, the pertinence results obtained for the original image with the new metric are compared with those obtained for the same image with the pertinence metric proposed in (Zanotta, 2010). The results presented here show that the pertinence metric is valid and consistent with another pertinence technique published in the literature. Considering that these results were obtained with few training samples, the proposed metric is expected to give better results than parametric classifiers when applied to multitemporal and hyperspectral images. Change detection Landsat Kernel methods Fraction-images EM algorithm Metric of pertinence
59	Robust multivariate mixture regression models Li, Xiongya January 1900 Doctor of Philosophy / Department of Statistics / Weixing Song / In this dissertation, we propose a new robust estimation procedure for two multivariate mixture regression models and apply this novel method to functional mapping of dynamic traits. In the first part, a robust estimation procedure for the mixture of classical multivariate linear regression models is discussed, assuming that the error terms follow a multivariate Laplace distribution. An EM algorithm is developed based on the fact that the multivariate Laplace distribution is a scale mixture of the multivariate standard normal distribution. The performance of the proposed algorithm is thoroughly evaluated in simulation and comparison studies. In the second part, a similar idea is extended to the mixture of linear mixed regression models by assuming that the random effect and the regression error jointly follow a multivariate Laplace distribution. Simulation studies indicate that the finite-sample performance of the proposed estimation procedure is better than, or at least comparable to, that of the existing robust t procedure in the literature. Unlike the t procedure, the new procedure does not need the degrees of freedom to be determined, so it is computationally more efficient. The ascent property of both EM algorithms is also proved. In the third part, the proposed robust method is applied to identify quantitative trait loci (QTL) underlying a functional mapping framework with dynamic traits of agricultural or biomedical interest. A robust multivariate Laplace mapping framework is proposed to replace the normality assumption. Simulation studies show that the proposed method is comparable to the robust multivariate t-distribution approach developed in the literature and outperforms the normal procedure. 
As an illustration, the proposed method is also applied to a real data set. Finite mixtures Multivariate regression Robust estimation Multivariate Laplace distribution EM algorithm Quantitative trait loci
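The scale-mixture fact that the EM algorithm above relies on can be checked numerically in the univariate case. The following is a minimal sketch, not the dissertation's procedure: it assumes the standard representation X = sqrt(2W) Z with W exponential with mean 1 and Z standard normal, which yields a Laplace variable with location 0 and scale 1; the sample size and seed are arbitrary choices.

```python
import math
import random

rng = random.Random(0)
n = 50_000

draws = []
for _ in range(n):
    w = rng.expovariate(1.0)           # latent scale, W ~ Exponential(rate 1)
    z = rng.gauss(0.0, 1.0)            # Z ~ N(0, 1), independent of W
    draws.append(math.sqrt(2.0 * w) * z)

# For a Laplace(0, 1) variable, E|X| = 1 and Var(X) = 2.
mean_abs = sum(abs(x) for x in draws) / n
variance = sum(x * x for x in draws) / n
print(mean_abs, variance)
```

The printed values should be close to 1 and 2. Treating W as missing data is what lets each EM iteration reduce to weighted normal-theory updates, with no degrees-of-freedom parameter to estimate.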
