41 |
Semiparametric regression analysis of zero-inflated data
Liu, Hai 01 July 2009
Zero-inflated data abound in ecological studies as well as in other scientific and quantitative fields. Nonparametric regression with zero-inflated response may be studied via the zero-inflated generalized additive model (ZIGAM). ZIGAM assumes that the conditional distribution of the response variable belongs to the zero-inflated 1-parameter exponential family which is a probabilistic mixture of the zero atom and the 1-parameter exponential family, where the zero atom accounts for an excess of zeroes in the data. We propose the constrained zero-inflated generalized additive model (COZIGAM) for analyzing zero-inflated data, with the further assumption that the probability of non-zero-inflation is some monotone function of the (non-zero-inflated) exponential family distribution mean. When the latter assumption obtains, the new approach provides a unified framework for modeling zero-inflated data, which is more parsimonious and efficient than the unconstrained ZIGAM. We develop an iterative algorithm for model estimation based on the penalized likelihood approach, and derive formulas for constructing confidence intervals of the maximum penalized likelihood estimator. Some asymptotic properties including the consistency of the regression function estimator and the limiting distribution of the parametric estimator are derived. We also propose a Bayesian model selection criterion for choosing between the unconstrained and the constrained ZIGAMs. We consider several useful extensions of the COZIGAM, including imposing additive-component-specific proportional and partial constraints, and incorporating threshold effects to account for regime shift phenomena. The new methods are illustrated with both simulated data and real applications. An R package COZIGAM has been developed for model fitting and model selection with zero-inflated data.
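To make the mixture structure described above concrete, here is a minimal Python sketch of a zero-inflated Poisson response, one member of the zero-inflated 1-parameter exponential family. The function names and the particular constraint `logit(p) = alpha + delta * log(mu)` are assumptions chosen for illustration; the abstract only states that the non-zero-inflation probability is some monotone function of the mean, and this is not the interface of the COZIGAM package.

```python
import math

def zip_pmf(k, mu, p):
    """P(Y = k) for a zero-inflated Poisson response.

    p  : probability that the observation comes from the Poisson part
         (the "non-zero-inflation" probability)
    mu : mean of the Poisson part
    """
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    if k == 0:
        # zero atom plus the zeros the Poisson part produces on its own
        return (1.0 - p) + p * poisson
    return p * poisson

def constrained_p(mu, alpha, delta):
    """Hypothetical constraint (an assumption): logit(p) = alpha + delta * log(mu)."""
    eta = alpha + delta * math.log(mu)
    return 1.0 / (1.0 + math.exp(-eta))

mu = 2.0
p = constrained_p(mu, alpha=0.5, delta=1.0)
probs = [zip_pmf(k, mu, p) for k in range(50)]
```

With `delta > 0`, a larger mean implies a smaller chance of an excess zero, so one smooth function drives both parts of the model. That is the sense in which the constrained model is more parsimonious than fitting the two parts separately.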
|
42 |
NIG distribution in modelling stock returns with assumption about stochastic volatility: Estimation of parameters and application to VaR and ETL
Kucharska, Magdalena, Pielaszkiewicz, Jolanta January 2009
We model Normal Inverse Gaussian distributed log-returns with the assumption of stochastic volatility. We consider different methods of parametrization of returns and, following Lindberg [21], we assume that the volatility is a linear function of the number of trades. Extending Lindberg's paper, we suggest daily stock volumes and amounts as alternative measures of the volatility. As an application of the models, we perform Value-at-Risk and Expected Tail Loss predictions with Lindberg's volatility model and with our own suggested model. These applications are new and not described in the literature. For a better understanding of our calculations, programmes and simulations, basic information on and properties of the Normal Inverse Gaussian and Inverse Gaussian distributions are provided. The models are applied to Nasdaq-OMX data, where we calculate Value-at-Risk and Expected Tail Loss for the Ericsson B stock over the period 1999 to 2004.
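The two risk measures named above can be illustrated with a short Python sketch. It uses a plain empirical (historical) estimator on a made-up list of returns, not the NIG-based model of the thesis; the function name `var_etl`, the tail convention and the sample data are assumptions for illustration only.

```python
def var_etl(returns, alpha):
    """Empirical Value-at-Risk and Expected Tail Loss at tail level alpha.

    Both are reported as positive loss numbers. The tail is taken to be the
    worst floor(alpha * n) returns (at least one observation).
    """
    ordered = sorted(returns)               # worst (most negative) first
    k = max(1, int(alpha * len(ordered)))   # number of observations in the tail
    tail = ordered[:k]
    value_at_risk = -tail[-1]               # loss at the edge of the tail
    expected_tail_loss = -sum(tail) / k     # average loss inside the tail
    return value_at_risk, expected_tail_loss

# made-up daily log-returns, for illustration only
log_returns = [-0.05, -0.02, 0.01, 0.03, -0.01, 0.02, 0.00, 0.04, -0.03, 0.01]
var_20, etl_20 = var_etl(log_returns, alpha=0.2)
```

Expected Tail Loss averages over the whole tail, so it is never smaller than Value-at-Risk at the same level. A model-based approach would replace the sorted sample with quantiles and tail expectations of the fitted return distribution.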
|
43 |
Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks
Jaakkola, Tommi S., Saul, Lawrence K., Jordan, Michael I. 09 February 1996
Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well.
|
44 |
Learning from Incomplete Data
Ghahramani, Zoubin, Jordan, Michael I. 24 January 1995
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
|
45 |
On adaptive transmission, signal detection and channel estimation for multiple antenna systems
Xie, Yongzhe 15 November 2004
This research concerns analysis of system capacity, development of adaptive transmission schemes with known channel state information at the transmitter (CSIT) and design of new signal detection and channel estimation schemes with low complexity in some multiple antenna systems. We first analyze the sum-rate capacity of the downlink of a cellular system with multiple transmit antennas and multiple receive antennas assuming perfect CSIT. We evaluate the ergodic sum-rate capacity and show how the sum-rate capacity increases as the number of users and the number of receive antennas increase. We develop upper and lower bounds on the sum-rate capacity and study various adaptive MIMO schemes to achieve, or approach, the sum-rate capacity. Next, we study the minimum outage probability transmission schemes in a multiple-input-single-output (MISO) flat fading channel assuming partial CSIT. Considering two special cases, mean feedback and covariance feedback, we derive the optimum spatial transmission directions and show that the associated optimum power allocation scheme, which minimizes the outage probability, is closely related to the target rate and the accuracy of the CSIT. Since CSIT is obtained at the cost of feedback bandwidth, we also consider optimal allocation of bandwidth between the data channel and the feedback channel in order to maximize the average throughput of the data channel in MISO, flat fading, frequency division duplex (FDD) systems. We show that beamforming based on feedback CSI can achieve an average rate larger than the capacity without CSIT under a wide range of mobility conditions. We next study a SAGE-aided List-BLAST detection scheme for MIMO systems which can achieve performance close to that of the maximum-likelihood detector with low complexity.
Finally, we apply the EM and SAGE algorithms in channel estimation for OFDM systems with multiple transmit antennas and compare them with a recently proposed least-squares based estimation algorithm. The EM and SAGE algorithms partition the problem of estimating a multi-input channel into independent channel estimation for each transmit-receive antenna pair, therefore avoiding the matrix inversion encountered in the joint least-squares estimation.
|
47 |
Monte Carlo Statistical Methods: Integration and Optimization
Pan, Tian-Tian 10 July 2012
This paper is based on Chapters 1 to 5 (except Chapter 4) of the book Monte Carlo Statistical Methods (second edition) by Robert and Casella (2004). The goal is to translate these chapters into Chinese, correct the mistakes, add details to the examples, translate the algorithms of the examples into Mathematica 7 code, use the simulated annealing method to estimate parameters from rounded data, and discuss the results. This paper provides Mathematica 7 code for almost every example and shows the actual results, so it can serve as a handbook for readers of the book or for those who want to solve problems related to these examples.
|
48 |
Mixture models for estimating operation time distributions
Chen, Yi-Ling 12 July 2005
Surgeon operation time is useful and important information for hospital management, which involves operation time estimation for patients under different diagnoses, operating room scheduling, operating room utilization improvements, and so on. In this work, we focus on the operation time distributions of thirteen operations performed in the gynecology (GYN) department of a major teaching hospital in southern Taiwan. We first investigate empirically which types of distributions are suitable for describing these operation times; log-normal and mixture log-normal distributions are found to be statistically acceptable. We then compare the operations and group them into categories based on the estimated operation time distributions. Next, we try to explain why the distributions for some operations with large data sets turn out to be mixtures of log-normal distributions. Finally, we end with a discussion of possible future work.
|
50 |
Hardware Utilization Measurement and Optimization: A Statistical Investigation and Simulation Study
Wang, Zhizheng January 2015
Managers need utilization information about equipment in order to make hardware investment decisions. Since December 2014, a pool of hardware and a scheduling and resource sharing system have been in use in one of the software testing sections at Ericsson. To monitor the efficiency of this equipment and the workflow, a non-homogeneous M/M/c queue model is developed that successfully captures the main aspects of the system. The model is decomposed into arrival, service, and failure components, and each part is estimated. A mixture of exponentials is estimated with the EM algorithm, and the impact of a scheduling change is also examined. Finally, the workflow is simulated with a Python module, and an optimized number of hardware units is proposed based on this M/M/c queue system.
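As an illustration of the estimation step mentioned above, the following Python sketch runs the EM algorithm on a two-component mixture of exponential distributions. It is a generic textbook version under stated assumptions, not the code of the thesis: the function name, the starting values, and the synthetic data (rates 5 and 0.5) are all hypothetical.

```python
import math
import random

def em_exponential_mixture(data, weights, rates, n_iter=100):
    """EM for a finite mixture of exponential densities.

    Returns the final weights, rates, and the log-likelihood evaluated at the
    parameters entering each iteration.
    """
    n = len(data)
    history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        resp = []
        loglik = 0.0
        for x in data:
            dens = [w * r * math.exp(-r * x) for w, r in zip(weights, rates)]
            total = sum(dens)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        history.append(loglik)
        # M-step: closed-form updates for weights and rates
        mass = [sum(row[j] for row in resp) for j in range(len(weights))]
        weights = [m / n for m in mass]
        rates = [
            mass[j] / sum(row[j] * x for row, x in zip(resp, data))
            for j in range(len(rates))
        ]
    return weights, rates, history

random.seed(0)
# synthetic durations (assumed): half "fast" jobs (rate 5), half "slow" jobs (rate 0.5)
data = [random.expovariate(5.0) for _ in range(500)]
data += [random.expovariate(0.5) for _ in range(500)]
weights, rates, history = em_exponential_mixture(data, [0.5, 0.5], [2.0, 1.0])
```

Each iteration cannot decrease the log-likelihood, so tracking `history` is a convenient check that the updates are implemented correctly. In practice the loop would stop once the improvement falls below a tolerance rather than after a fixed number of iterations.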
|