61 | Mixture models for estimating operation time distributions (Chen, Yi-Ling, 12 July 2005)
Surgeon operation time is useful and important information for hospital management, which involves estimating operation times for patients under different diagnoses, scheduling operating rooms, improving operating room utilization and so on. In this work we focus on the operation time distributions of thirteen operations performed in the gynecology (GYN) department of a major teaching hospital in southern Taiwan. We first investigate empirically which types of distributions are suitable for describing these operation times; log-normal and mixture log-normal distributions are found to be statistically acceptable. We then compare the operations and characterize them into categories based on the estimated operation time distributions. Next we illustrate a possible reason why the distributions for some operations with large data sets turn out to be mixtures of log-normal distributions. We end with a discussion of possible future work.
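The log-normal and mixture log-normal fits described above can be reproduced, at least in spirit, by running EM on the logarithm of the recorded operation times. The following is a minimal sketch, not the thesis code: it uses scikit-learn's GaussianMixture, picks the number of components by BIC, and the function name, the `max_components` cut-off and the simulated example data are assumptions of the sketch.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_lognormal_mixture(times, max_components=3, seed=0):
    """Fit log-normal / mixture log-normal models to operation times.

    A k-component Gaussian mixture estimated by EM on log(times) is a
    k-component log-normal mixture on the original scale; k = 1 is the
    plain log-normal model.  The number of components is chosen by BIC.
    """
    z = np.log(np.asarray(times, dtype=float)).reshape(-1, 1)
    fits = {k: GaussianMixture(n_components=k, random_state=seed).fit(z)
            for k in range(1, max_components + 1)}
    best_k = min(fits, key=lambda k: fits[k].bic(z))
    return best_k, fits[best_k]

# Illustrative data: operation times (minutes) drawn from two surgical sub-groups.
rng = np.random.default_rng(1)
times = np.concatenate([rng.lognormal(4.0, 0.3, 300),   # roughly 55-minute cases
                        rng.lognormal(4.8, 0.3, 150)])  # roughly 120-minute cases
k, model = fit_lognormal_mixture(times)
print(k, model.weights_.round(2))
```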
62 | On adaptive transmission, signal detection and channel estimation for multiple antenna systems (Xie, Yongzhe, 15 November 2004)
This research concerns analysis of system capacity, development of adaptive transmission schemes with known channel state information at the transmitter (CSIT), and design of new signal detection and channel estimation schemes with low complexity in multiple antenna systems. We first analyze the sum-rate capacity of the downlink of a cellular system with multiple transmit antennas and multiple receive antennas assuming perfect CSIT. We evaluate the ergodic sum-rate capacity and show how the sum-rate capacity increases as the number of users and the number of receive antennas increase. We develop upper and lower bounds on the sum-rate capacity and study various adaptive MIMO schemes to achieve, or approach, the sum-rate capacity. Next, we study minimum outage probability transmission schemes in a multiple-input-single-output (MISO) flat fading channel assuming partial CSIT. Considering two special cases, mean feedback and covariance feedback, we derive the optimum spatial transmission directions and show that the associated optimum power allocation scheme, which minimizes the outage probability, is closely related to the target rate and the accuracy of the CSIT. Since CSIT is obtained at the cost of feedback bandwidth, we also consider optimal allocation of bandwidth between the data channel and the feedback channel in order to maximize the average throughput of the data channel in MISO, flat fading, frequency division duplex (FDD) systems. We show that beamforming based on feedback CSI can achieve an average rate larger than the capacity without CSIT under a wide range of mobility conditions. We next study a SAGE-aided List-BLAST detection scheme for MIMO systems which achieves performance close to that of the maximum-likelihood detector at low complexity. Finally, we apply the EM and SAGE algorithms to channel estimation for OFDM systems with multiple transmit antennas and compare them with a recently proposed least-squares based estimation algorithm. The EM and SAGE algorithms partition the problem of estimating a multi-input channel into independent channel estimation for each transmit-receive antenna pair, thereby avoiding the matrix inversion encountered in the joint least-squares estimation.
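The last point, that EM decouples the multi-input channel estimate into per-antenna sub-problems, can be pictured with the classical EM recursion for superimposed signals. The sketch below is a hedged toy illustration, not the dissertation's algorithm: the dimensions (`N`, `T`, `L`), the BPSK-like pilots, the equal noise-splitting weights `beta` and the iteration count are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: N pilot tones, T transmit antennas, L channel taps.
# pilots[t] maps the taps of antenna t to its contribution at the pilot tones.
N, T, L, sigma2 = 64, 2, 4, 0.01
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N)   # N x L DFT block
pilots = [np.diag(rng.choice([1.0, -1.0], size=N)) @ F for _ in range(T)]
h_true = [(rng.normal(size=L) + 1j * rng.normal(size=L)) / np.sqrt(2 * L) for _ in range(T)]
noise = np.sqrt(sigma2 / 2) * (rng.normal(size=N) + 1j * rng.normal(size=N))
y = sum(X @ h for X, h in zip(pilots, h_true)) + noise

# EM for superimposed signals: split the current residual among the antennas
# (weights beta sum to one), then re-estimate each channel by a small
# per-antenna least-squares fit, so no joint (T*L x T*L) inversion is formed.
beta = np.full(T, 1.0 / T)
h_est = [np.zeros(L, dtype=complex) for _ in range(T)]
for _ in range(30):
    h_old = [h.copy() for h in h_est]
    residual = y - sum(X @ h for X, h in zip(pilots, h_old))
    for t in range(T):
        y_t = pilots[t] @ h_old[t] + beta[t] * residual            # E-step
        h_est[t] = np.linalg.lstsq(pilots[t], y_t, rcond=None)[0]  # M-step
        # A SAGE-style variant would update one antenna at a time with beta = 1,
        # recomputing the residual after every update.

for t in range(T):
    print(f"antenna {t}: estimation error {np.linalg.norm(h_est[t] - h_true[t]):.4f}")
```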
63 | Hardware Utilization Measurement and Optimization: A Statistical Investigation and Simulation Study (Wang, Zhizheng, January 2015)
It is essential for managers to base hardware investment decisions on equipment utilization information. Since December 2014, one of the software testing sections at Ericsson has operated a pool of hardware together with a scheduling and resource-sharing system. To monitor the efficiency of this equipment and the workflow, a non-homogeneous M/M/c queueing model is developed that captures the main aspects of the system. The model is decomposed into arrival, service and failure processes, and each part is estimated; a mixture-exponential service distribution is fitted with the EM algorithm, and the impact of a scheduling change is also examined. Finally, the workflow is simulated in Python and an optimized amount of hardware is proposed based on this M/M/c queueing system.
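As a hedged illustration of the mixture-exponential fit mentioned above (not the thesis code), the EM updates for a K-component exponential mixture have closed forms; the quantile-based initialisation, the iteration count and the simulated two-regime data are arbitrary choices for the sketch.

```python
import numpy as np

def em_exponential_mixture(x, n_components=2, n_iter=500):
    """EM for a mixture of exponential distributions (e.g. service times)."""
    x = np.asarray(x, dtype=float)
    rates = 1.0 / np.quantile(x, np.linspace(0.2, 0.8, n_components))  # crude init
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        dens = weights * rates * np.exp(-np.outer(x, rates))   # E-step: component densities
        resp = dens / dens.sum(axis=1, keepdims=True)          # responsibilities
        weights = resp.mean(axis=0)                            # M-step (closed form)
        rates = resp.sum(axis=0) / (resp * x[:, None]).sum(axis=0)
    return weights, rates

# Example: two service regimes, quick jobs and long-running jobs.
rng = np.random.default_rng(2)
service = np.concatenate([rng.exponential(5.0, 800), rng.exponential(60.0, 200)])
print(em_exponential_mixture(service))
```

The fitted weights and rates can then drive the service-time leg of an M/M/c-style workflow simulation.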
64 | Daugiamačių Gauso skirstinių mišinio statistinė analizė, taikant duomenų projektavimą / The Projection-based Statistical Analysis of the Multivariate Gaussian Distribution Mixture (Kavaliauskas, Mindaugas, 21 January 2005)
Problem of the dissertation. Gaussian random values are very common in practice because, when a random value depends on many additive factors, the Central Limit Theorem (under suitable conditions) implies that the sum is approximately Gaussian. If the observed random value belongs to one of several classes, it follows a Gaussian mixture model. Mixtures of Gaussian distributions arise in many fields: biology, medicine, astronomy, military science and others. The most important statistical problems are mixture identification and data clustering, and in the case of high data dimension these tasks are not completely solved. New parameter estimation and data clustering methods for the multivariate Gaussian mixture model are proposed and analysed in the dissertation. Since these problems are much easier to solve in the univariate case, a projection-based approach is used. The aim of the dissertation. The aim of this work is the development of constructive algorithms for distribution analysis and clustering of data from a mixture of Gaussian distributions.
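A rough sketch of the projection idea follows (the dissertation's actual procedure is more elaborate): project the data onto a handful of candidate directions, fit a univariate Gaussian mixture by EM on each projection, keep the direction where the mixture improves most over a single Gaussian, and cluster by its posterior probabilities. The random directions, the BIC comparison and the function name are assumptions of this sketch.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def projection_cluster(X, n_components=2, n_directions=10, seed=0):
    """Cluster multivariate data via univariate Gaussian-mixture fits on projections."""
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n_directions, X.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    best = None
    for d in directions:
        z = (X @ d).reshape(-1, 1)                      # univariate projected sample
        gm = GaussianMixture(n_components=n_components, random_state=seed).fit(z)
        g1 = GaussianMixture(n_components=1, random_state=seed).fit(z)
        gain = g1.bic(z) - gm.bic(z)                    # mixture's improvement over one Gaussian
        if best is None or gain > best[0]:
            best = (gain, gm, d)
    _, gm, d = best
    return gm.predict((X @ d).reshape(-1, 1)), d
```

Solving only univariate mixture problems and then combining information across projections is what keeps this kind of approach tractable in high dimension.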
66 | Econometric Models of Crop Yields: Two Essays (Tolhurst, Tor, 17 May 2013)
This thesis is an investigation of econometric crop yield models, divided into two essays. In the first essay, I propose estimating a single heteroscedasticity coefficient for all counties within a crop-reporting district by pooling county-level crop yield data in a two-stage estimation process. In the context of crop insurance, where heteroscedasticity has significant economic implications, I demonstrate that the pooling approach provides economically and statistically significant improvements over contemporary methods in rating crop insurance contracts. In the second essay, I propose a new method for measuring the rate of technological change in crop yields. To date the agricultural economics literature has measured technological change exclusively at the mean; in contrast, the proposed model can measure the rate of technological change in endogenously defined yield subpopulations. I find evidence of different rates of technological change across yield subpopulations, which raises interesting questions about the effect of technological change on agricultural production. / Ontario Ministry of Agriculture and Food
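The idea of measuring technological change in endogenously defined subpopulations can be pictured with a two-component normal mixture whose component means are linear in the year, so that each subpopulation carries its own trend. The sketch below is an illustrative EM for such a mixture of trend regressions, not the essay's estimator; the initialisation, iteration count and variable names are assumptions.

```python
import numpy as np
from scipy.stats import norm

def em_mixture_of_trends(year, yld, n_iter=300, seed=1):
    """Two-component normal mixture with component-specific linear time trends."""
    rng = np.random.default_rng(seed)
    year = np.asarray(year, dtype=float)
    yld = np.asarray(yld, dtype=float)
    X = np.column_stack([np.ones_like(year), year])
    resp = rng.uniform(0.3, 0.7, size=len(yld))       # responsibility of component 1
    betas = np.zeros((2, 2))
    sigmas = np.ones(2)
    pi1 = 0.5
    for _ in range(n_iter):
        # M-step: weighted least squares and weighted residual variance per component
        for j, w in enumerate([resp, 1.0 - resp]):
            betas[j] = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * yld))
            res = yld - X @ betas[j]
            sigmas[j] = np.sqrt(np.sum(w * res ** 2) / np.sum(w))
        pi1 = resp.mean()
        # E-step: posterior probability that an observation belongs to component 1
        d1 = pi1 * norm.pdf(yld, X @ betas[0], sigmas[0])
        d2 = (1.0 - pi1) * norm.pdf(yld, X @ betas[1], sigmas[1])
        resp = d1 / (d1 + d2)
    return pi1, betas, sigmas   # betas[:, 1] are the two subpopulation trend rates
```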
67 | Topics on Regularization of Parameters in Multivariate Linear Regression (Chen, Lianfu, December 2011)
My dissertation mainly focuses on the regularization of parameters in multivariate linear regression under different assumptions on the distribution of the errors. It consists of two topics in which we develop iterative procedures for constructing sparse estimators of both the regression coefficient and scale matrices simultaneously, and a third topic in which we develop a method for testing whether the skewness parameter of the skew-normal distribution is parallel to one of the eigenvectors of the scale matrix.
In the first project, we propose a robust procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for the correlations of the response variables. Robustness to outliers is achieved by using heavy-tailed t distributions for the multivariate response, and shrinkage is introduced by adding to the negative log-likelihood l1 penalties on the entries of both the regression coefficient matrix and the precision matrix of the responses. Taking advantage of the hierarchical representation of the multivariate t distribution as a scale mixture of normal distributions and of the EM algorithm, the optimization problem is solved iteratively, where at each EM iteration suitably modified multivariate regression with covariance estimation (MRCE) algorithms proposed by Rothman, Levina and Zhu are used. We propose two new optimization algorithms for the penalized likelihood, called MRCEI and MRCEII, which differ from MRCE in the way the tuning parameters for the two matrices are selected. Estimating the degrees of freedom when penalizing the entries of the matrices presents new computational challenges. A simulation study and a real data analysis demonstrate that MRCEII, which selects the tuning parameter of the precision matrix of the multiple responses using the Cp criterion, generally performs best among all methods considered in terms of prediction error, while MRCEI outperforms the MRCE methods when the regression coefficient matrix is less sparse.
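The scale-mixture trick behind the robustness is compact enough to show. In the E-step each observation receives a weight that shrinks with its Mahalanobis distance, and those weights enter the subsequent penalized, weighted M-step. A minimal sketch of that E-step only, with function and variable names chosen for illustration:

```python
import numpy as np

def t_em_weights(resid, Omega, nu):
    """E-step weights for the multivariate t written as a scale mixture of normals.

    resid : (n, p) residual matrix; Omega : (p, p) precision matrix of the
    responses; nu : degrees of freedom.  tau_i = (nu + p) / (nu + delta_i),
    where delta_i is the Mahalanobis distance of residual i; a small tau_i
    flags a likely outlier and down-weights it in the M-step.
    """
    p = resid.shape[1]
    delta = np.einsum("ij,jk,ik->i", resid, Omega, resid)
    return (nu + p) / (nu + delta)
```

In the full procedure these weights multiply the observations inside each MRCE-type update of the coefficient and precision matrices.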
The second project is motivated by the presence of skewness in data for which the symmetric-distribution assumption on the errors does not hold. We extend the proposed procedure to the case where the errors in the multivariate linear regression follow a multivariate skew-normal or skew-t distribution. Based on convenient representations of the skew-normal and skew-t distributions and on the EM algorithm, we develop an optimization algorithm, called MRST, to iteratively minimize the penalized negative log-likelihood. We also carry out a simulation study to assess the performance of the method and illustrate its application with a real data example.
In the third project, we discuss the asymptotic distributions of the eigenvalues and eigenvectors of the MLE of the scale matrix in a multivariate skew-normal distribution. We propose a likelihood-ratio statistic for testing whether the skewness vector is proportional to one of the eigenvectors of the scale matrix. Under the alternative, the likelihood is maximized numerically using two different parametrizations of the scale matrix: the Modified Cholesky Decomposition (MCD) and Givens angles. A simulation study shows that the statistic obtained with the Givens-angle parametrization performs well and is more reliable than the one obtained with the MCD.
68 | Session Clustering Using Mixtures of Proportional Hazards Models (Mair, Patrick and Hudec, Marcus, January 2008)
Starting from classical Weibull mixture models, we propose a framework for clustering survival data with various proportionality restrictions imposed. By introducing mixtures of Weibull proportional hazards models on a multivariate data set, a parametric clustering approach based on the EM algorithm is carried out. The problem of non-response in the data is considered. The application example is a real-life data set stemming from the analysis of a worldwide eCommerce application. Sessions are clustered according to the dwell times a user spends on certain page areas. The solution allows the navigation behavior to be interpreted in terms of survival and hazard functions. A software implementation in the form of an R package is provided. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
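Ignoring censoring and covariates for brevity (the paper's handling of non-response and proportionality restrictions is richer), a Weibull mixture for dwell times can be fitted by an EM whose M-step maximizes a weighted Weibull log-likelihood numerically. A hedged sketch; the initialisation, the Nelder-Mead optimizer and the function name are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import weibull_min
from scipy.optimize import minimize

def fit_weibull_mixture(t, n_components=2, n_iter=100):
    """EM for a mixture of Weibull dwell-time distributions (no censoring)."""
    t = np.asarray(t, dtype=float)
    shapes = np.ones(n_components)
    scales = np.quantile(t, np.linspace(0.25, 0.75, n_components))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior component probabilities for every session
        dens = np.column_stack([w * weibull_min.pdf(t, c=k, scale=s)
                                for w, k, s in zip(weights, shapes, scales)])
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: mixing weights in closed form, Weibull parameters numerically
        weights = resp.mean(axis=0)
        for j in range(n_components):
            r = resp[:, j]
            nll = lambda p, r=r: -np.sum(r * weibull_min.logpdf(t, c=np.exp(p[0]),
                                                                scale=np.exp(p[1])))
            opt = minimize(nll, x0=np.log([shapes[j], scales[j]]), method="Nelder-Mead")
            shapes[j], scales[j] = np.exp(opt.x)
    return weights, shapes, scales
```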
69 | Contaminated Chi-square Modeling and Its Application in Microarray Data Analysis (Zhou, Feng, 1 January 2014)
Mixture modeling has numerous applications; one of particular interest is microarray data analysis. My dissertation research focuses on contaminated chi-square (CCS) modeling and its application to microarray data. A moment-based method and two likelihood-based methods, the modified likelihood ratio test (MLRT) and the expectation-maximization (EM) test, are developed for testing the omnibus null hypothesis of no contamination of a central chi-square distribution by a non-central chi-square distribution. When the omnibus null hypothesis is rejected, we further develop the moment-based test and the EM test for an extra component in the contaminated chi-square model (CCS+EC). The moment-based approach is simple, requiring neither re-sampling nor random field theory to obtain critical values. When the statistical models are more complicated, such as large mixtures of multidimensional distributions, the MLRT and EM tests may have better power than the moment-based approaches, and the MLRT and EM tests developed herein enjoy an elegant asymptotic theory.
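For the basic contaminated chi-square model, a mixture of a central and a non-central chi-square with common degrees of freedom, an EM iteration can alternate a closed-form update of the contamination weight with a one-dimensional numerical update of the non-centrality. The sketch below is a plain EM fit under those assumptions, not the MLRT or EM test statistics themselves; the starting values, bounds and function name are illustrative.

```python
import numpy as np
from scipy.stats import chi2, ncx2
from scipy.optimize import minimize_scalar

def fit_contaminated_chisq(x, df, n_iter=200):
    """Fit p * chi2(df) + (1 - p) * ncx2(df, nc) by EM."""
    x = np.asarray(x, dtype=float)
    p, nc = 0.9, max(x.mean() - df, 1.0)       # crude start, since E[ncx2] = df + nc
    for _ in range(n_iter):
        f0 = p * chi2.pdf(x, df)
        f1 = (1.0 - p) * ncx2.pdf(x, df, nc)
        w = f1 / (f0 + f1)                     # posterior probability of contamination
        p = 1.0 - w.mean()
        # Weighted M-step for the non-centrality parameter: 1-D numerical search.
        opt = minimize_scalar(lambda l: -np.sum(w * ncx2.logpdf(x, df, l)),
                              bounds=(1e-6, 10.0 * max(nc, 1.0)), method="bounded")
        nc = opt.x
    return p, nc
```

In a microarray setting one would typically take x to be per-gene test statistics, with 1 - p interpreted as the estimated contaminated (e.g. differentially expressed) fraction.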
70 | Uma metodologia para a detecção de mudanças em imagens multitemporais de sensoriamento remoto empregando Support Vector Machines / A methodology for change detection in multitemporal remote sensing images using Support Vector Machines (Ferreira, Rute Henrique da Silva, January 2014)
In this thesis, we investigate a supervised approach to change detection in multi-temporal remote sensing image data by applying the Support Vector Machine (SVM) technique with polynomial and Gaussian (RBF) kernels. The methodology is based on the difference of the fraction images produced for the two dates. In natural scenes, the differences in fractions such as vegetation and bare soil between two dates tend to present a distribution that is symmetric around the origin of the coordinate system. This fact can be used to model two multivariate normal distributions: change and no-change. The Expectation-Maximization (EM) algorithm is implemented to estimate the parameters (mean vector, covariance matrix and a priori probability) associated with these two distributions. Random samples are drawn from these distributions and used to train the SVM classifier in this supervised approach. The proposed methodology is tested on multi-temporal TM-Landsat multispectral image data covering the same scene on two different dates, and the results are compared with other procedures, including previous work, a synthetic data set and a one-class SVM.
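A minimal sketch of the chain described above using scikit-learn (an illustration of the approach, not the author's implementation; the component-labelling heuristic, sample sizes and function name are assumptions): fit a two-component Gaussian mixture by EM to the pixel-wise difference of the fraction images, draw synthetic training samples from the two estimated Gaussians, and train an RBF SVM on them.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def train_change_detector(diff_fractions, n_train=2000, seed=0):
    """diff_fractions: (n_pixels, n_fractions) differences between the two dates."""
    gm = GaussianMixture(n_components=2, covariance_type="full",
                         random_state=seed).fit(diff_fractions)
    # Heuristic: the component whose mean lies farther from the origin is "change".
    change = int(np.argmax(np.linalg.norm(gm.means_, axis=1)))
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [], []
    for comp in (0, 1):
        X_parts.append(rng.multivariate_normal(gm.means_[comp],
                                               gm.covariances_[comp], size=n_train))
        y_parts.append(np.full(n_train, int(comp == change)))
    svm = SVC(kernel="rbf").fit(np.vstack(X_parts), np.concatenate(y_parts))
    return gm, svm   # svm.predict(diff_fractions) then yields a change map
```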