Global ETD Search

51	A Flexible Zero-Inflated Poisson Regression Model Roemmele, Eric S. 01 January 2019 A practical problem often encountered with observed count data is the presence of excess zeros. Zero-inflation in count data can easily be handled by zero-inflated models, which are two-component mixtures of a point mass at zero and a discrete distribution for the count data. In the presence of predictors, zero-inflated Poisson (ZIP) regression models are, perhaps, the most commonly used. However, the fully parametric ZIP regression model could sometimes be restrictive, especially with respect to the mixing proportions. Taking inspiration from some of the recent literature on semiparametric mixtures of regressions models for flexible mixture modeling, we propose a semiparametric ZIP regression model. We present an "EM-like" algorithm for estimation and a summary of asymptotic properties of the estimators. The proposed semiparametric models are then applied to a data set involving clandestine methamphetamine laboratories and Alzheimer's disease. Bootstrap Count data EM Algorithm zero-inflation semiparametric model Statistical Models Statistical Theory
52	Semiparametric regression analysis of zero-inflated data Liu, Hai 01 July 2009 Zero-inflated data abound in ecological studies as well as in other scientific and quantitative fields. Nonparametric regression with zero-inflated response may be studied via the zero-inflated generalized additive model (ZIGAM). ZIGAM assumes that the conditional distribution of the response variable belongs to the zero-inflated 1-parameter exponential family which is a probabilistic mixture of the zero atom and the 1-parameter exponential family, where the zero atom accounts for an excess of zeroes in the data. We propose the constrained zero-inflated generalized additive model (COZIGAM) for analyzing zero-inflated data, with the further assumption that the probability of non-zero-inflation is some monotone function of the (non-zero-inflated) exponential family distribution mean. When the latter assumption obtains, the new approach provides a unified framework for modeling zero-inflated data, which is more parsimonious and efficient than the unconstrained ZIGAM. We develop an iterative algorithm for model estimation based on the penalized likelihood approach, and derive formulas for constructing confidence intervals of the maximum penalized likelihood estimator. Some asymptotic properties including the consistency of the regression function estimator and the limiting distribution of the parametric estimator are derived. We also propose a Bayesian model selection criterion for choosing between the unconstrained and the constrained ZIGAMs. We consider several useful extensions of the COZIGAM, including imposing additive-component-specific proportional and partial constraints, and incorporating threshold effects to account for regime shift phenomena. The new methods are illustrated with both simulated data and real applications. An R package COZIGAM has been developed for model fitting and model selection with zero-inflated data.
Asymptotic normality Constrained model EM algorithm Model selection Penalized likelihood Threshold model Statistics and Probability
53	NIG distribution in modelling stock returns with assumption about stochastic volatility : Estimation of parameters and application to VaR and ETL. Kucharska, Magdalena, Pielaszkiewicz, Jolanta January 2009 We model Normal Inverse Gaussian distributed log-returns with the assumption of stochastic volatility. We consider different methods of parametrization of returns and, following the paper of Lindberg [21], we assume that the volatility is a linear function of the number of trades. In addition to Lindberg’s paper, we suggest daily stock volumes and amounts as alternative measures of the volatility. As an application of the models, we perform Value-at-Risk and Expected Tail Loss predictions by Lindberg’s volatility model and by our own suggested model. These applications are new and not described in the literature. For better understanding of our calculations, programmes and simulations, basic information and properties of the Normal Inverse Gaussian and Inverse Gaussian distributions are provided. Practical applications of the models are implemented on the Nasdaq-OMX, where we have calculated Value-at-Risk and Expected Tail Loss for the Ericsson B stock data during the period 1999 to 2004. NIG distribution Value at Risk Expected Tail Loss Lindberg method EM algorithm risk analysis ETL VaR
54	Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks Jaakkola, Tommi S., Saul, Lawrence K., Jordan, Michael I. 09 February 1996 Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well. AI MIT Artificial Intelligence Belief networks Probabilistic networks EM algorithm Density estimation Likelihood bounds
55	Learning from Incomplete Data Ghahramani, Zoubin, Jordan, Michael I. 24 January 1995 Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data. AI MIT Artificial Intelligence missing data mixture models statistical learning EM algorithm maximum likelihood neural networks
56	On adaptive transmission, signal detection and channel estimation for multiple antenna systems Xie, Yongzhe 15 November 2004 This research concerns analysis of system capacity, development of adaptive transmission schemes with known channel state information at the transmitter (CSIT) and design of new signal detection and channel estimation schemes with low complexity in some multiple antenna systems. We first analyze the sum-rate capacity of the downlink of a cellular system with multiple transmit antennas and multiple receive antennas assuming perfect CSIT. We evaluate the ergodic sum-rate capacity and show how the sum-rate capacity increases as the number of users and the number of receive antennas increase. We develop upper and lower bounds on the sum-rate capacity and study various adaptive MIMO schemes to achieve, or approach, the sum-rate capacity. Next, we study the minimum outage probability transmission schemes in a multiple-input-single-output (MISO) flat fading channel assuming partial CSIT. Considering two special cases, the mean feedback and the covariance feedback, we derive the optimum spatial transmission directions and show that the associated optimum power allocation scheme, which minimizes the outage probability, is closely related to the target rate and the accuracy of the CSIT. Since CSIT is obtained at the cost of feedback bandwidth, we also consider optimal allocation of bandwidth between the data channel and the feedback channel in order to maximize the average throughput of the data channel in MISO, flat fading, frequency division duplex (FDD) systems. We show that beamforming based on feedback CSI can achieve an average rate larger than the capacity without CSIT under a wide range of mobility conditions. We next study a SAGE-aided List-BLAST detection scheme for MIMO systems which can achieve performance close to that of the maximum-likelihood detector with low complexity.
Finally, we apply the EM and SAGE algorithms in channel estimation for OFDM systems with multiple transmit antennas and compare them with a recently proposed least-squares based estimation algorithm. The EM and SAGE algorithms partition the problem of estimating a multi-input channel into independent channel estimation for each transmit-receive antenna pair, therefore avoiding the matrix inversion encountered in the joint least-squares estimation. adaptive transmission MIMO EM algorithm signal detection channel state information OFDM
57	NIG distribution in modelling stock returns with assumption about stochastic volatility : Estimation of parameters and application to VaR and ETL. Kucharska, Magdalena, Pielaszkiewicz, Jolanta January 2009 We model Normal Inverse Gaussian distributed log-returns with the assumption of stochastic volatility. We consider different methods of parametrization of returns and, following the paper of Lindberg [21], we assume that the volatility is a linear function of the number of trades. In addition to Lindberg’s paper, we suggest daily stock volumes and amounts as alternative measures of the volatility. As an application of the models, we perform Value-at-Risk and Expected Tail Loss predictions by Lindberg’s volatility model and by our own suggested model. These applications are new and not described in the literature. For better understanding of our calculations, programmes and simulations, basic information and properties of the Normal Inverse Gaussian and Inverse Gaussian distributions are provided. Practical applications of the models are implemented on the Nasdaq-OMX, where we have calculated Value-at-Risk and Expected Tail Loss for the Ericsson B stock data during the period 1999 to 2004. NIG distribution Value at Risk Expected Tail Loss Lindberg method EM algorithm risk analysis ETL VaR
58	Multivariate Poisson hidden Markov models for analysis of spatial counts Karunanayake, Chandima Piyadharshani 08 June 2007 Multivariate count data are found in a variety of fields. For modeling such data, one may consider the multivariate Poisson distribution. Overdispersion is a problem when modeling the data with the multivariate Poisson distribution. Therefore, in this thesis we propose a new multivariate Poisson hidden Markov model based on the extension of independent multivariate Poisson finite mixture models, as a solution to this problem. This model, which can take into account the spatial nature of weed counts, is applied to weed species counts in an agricultural field. The distribution of counts depends on the underlying sequence of states, which are unobserved or hidden. These hidden states represent the regions where weed counts are relatively homogeneous. Analysis of these data involves the estimation of the number of hidden states, Poisson means and covariances. Parameter estimation is done using a modified EM algorithm for maximum likelihood estimation. We extend the univariate Markov-dependent Poisson finite mixture model to the multivariate Poisson case (bivariate and trivariate) to model counts of two or three species. Also, we contribute to the hidden Markov model research area by developing Splus/R codes for the analysis of the multivariate Poisson hidden Markov model. Splus/R codes are written for the estimation of multivariate Poisson hidden Markov model using the EM algorithm and the forward-backward procedure and the bootstrap estimation of standard errors. The estimated parameters are used to calculate the goodness of fit measures of the models. Results suggest that the multivariate Poisson hidden Markov model, with five states and an independent covariance structure, gives a reasonable fit to this dataset.
Since this model deals with overdispersion and spatial information, it will help provide insight into weed distribution for herbicide applications. This model may lead researchers to find other factors, such as soil moisture and fertilizer level, that determine the states governing the distribution of the weed counts. EM algorithm multivariate Poisson hidden Markov model Weed species counts Multivariate Poisson distribution
59	Multivariate Poisson hidden Markov models for analysis of spatial counts Karunanayake, Chandima Piyadharshani 08 June 2007 Multivariate count data are found in a variety of fields. For modeling such data, one may consider the multivariate Poisson distribution. Overdispersion is a problem when modeling the data with the multivariate Poisson distribution. Therefore, in this thesis we propose a new multivariate Poisson hidden Markov model based on the extension of independent multivariate Poisson finite mixture models, as a solution to this problem. This model, which can take into account the spatial nature of weed counts, is applied to weed species counts in an agricultural field. The distribution of counts depends on the underlying sequence of states, which are unobserved or hidden. These hidden states represent the regions where weed counts are relatively homogeneous. Analysis of these data involves the estimation of the number of hidden states, Poisson means and covariances. Parameter estimation is done using a modified EM algorithm for maximum likelihood estimation. We extend the univariate Markov-dependent Poisson finite mixture model to the multivariate Poisson case (bivariate and trivariate) to model counts of two or three species. Also, we contribute to the hidden Markov model research area by developing Splus/R codes for the analysis of the multivariate Poisson hidden Markov model. Splus/R codes are written for the estimation of multivariate Poisson hidden Markov model using the EM algorithm and the forward-backward procedure and the bootstrap estimation of standard errors. The estimated parameters are used to calculate the goodness of fit measures of the models. Results suggest that the multivariate Poisson hidden Markov model, with five states and an independent covariance structure, gives a reasonable fit to this dataset.
Since this model deals with overdispersion and spatial information, it will help provide insight into weed distribution for herbicide applications. This model may lead researchers to find other factors, such as soil moisture and fertilizer level, that determine the states governing the distribution of the weed counts. EM algorithm multivariate Poisson hidden Markov model Weed species counts Multivariate Poisson distribution
60	Monte Carlo Statistical Methods: Integration and Optimization Pan, Tian-Tian 10 July 2012 This paper is based on chapters 1 to 5 (except chapter 4) of the book Monte Carlo Statistical Methods (second edition) by Robert and Casella (2004). The goal is to translate the contents of these chapters into Chinese, correct the mistakes, add details to the examples, translate the algorithms of the examples into Mathematica (version 7) code, use the Simulated Annealing method to estimate parameters from rounded data, and discuss the results. This paper provides Mathematica (version 7) code for almost every example and shows the actual results, so it can serve as a handbook for people who are interested in reading the book or who need to solve problems related to these examples. Importance Sampling Simulated Annealing Rounding Error EM Algorithm Envelope Accept-Reject Methods Accept-Reject Methods

Search results