11 |
Detecting underlying emotional sensitivity in bereaved children via a multivariate normal mixture distribution
Kelbick, Nicole DePriest, 07 November 2003 (has links)
No description available.
|
12 |
Clustering Discrete Valued Time Series
Roick, Tyler, January 2017 (has links)
There is a need for models that account for discreteness in data along with its time series properties and correlation. The application of thinning operators to adapt the ARMA recursion to the integer-valued case is first reviewed; a class of integer-valued ARMA (INARMA) models arises from this application. Our focus falls on INteger-valued AutoRegressive (INAR) type models. INAR-type models can be used in conjunction with existing model-based clustering techniques to cluster discrete-valued time series data, and this approach is then illustrated with the addition of autocorrelations. With the use of a finite mixture model, several existing techniques, such as selection of the number of clusters, estimation via expectation-maximization, and model selection, are applicable. The proposed model is then demonstrated on real data to illustrate its clustering applications. / Thesis / Master of Science (MSc)
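As a rough illustration of the binomial thinning operator that underlies INAR models, here is a minimal Python sketch simulating Poisson INAR(1) series; the parameter values and the two simulated groups are purely illustrative and are not taken from the thesis.

```python
import numpy as np

def simulate_inar1(alpha, lam, n, seed=None):
    """Simulate a Poisson INAR(1) series: X_t = alpha o X_{t-1} + eps_t,
    where 'o' is binomial thinning and eps_t ~ Poisson(lam)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=int)
    x[0] = rng.poisson(lam / (1 - alpha))  # stationary distribution is Poisson(lam / (1 - alpha))
    for t in range(1, n):
        survivors = rng.binomial(x[t - 1], alpha)  # binomial thinning of the previous count
        x[t] = survivors + rng.poisson(lam)        # plus Poisson innovations
    return x

# Two groups of count series with different dynamics that a finite mixture could separate
group_a = [simulate_inar1(0.3, 1.0, 200, seed=i) for i in range(5)]
group_b = [simulate_inar1(0.8, 0.5, 200, seed=100 + i) for i in range(5)]
```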
|
13 |
Clustering Matrix Variate Data Using Finite Mixture Models with Component-Wise Regularization
Tait, Peter A, 11 1900 (has links)
Matrix variate distributions present an innate way to model random matrices. Realizations of random matrices are created by concurrently observing variables in different locations or at different time points. We use a finite mixture model composed of matrix variate normal densities to cluster matrix variate data. The matrix variate data were generated by accelerometers worn by children in a clinical study conducted at McMaster. Their acceleration along the three planes of motion over the course of seven days forms their matrix variate data. We use the resulting clusters to verify existing group membership labels derived from a test of motor-skills proficiency used to assess the children's locomotion. / Thesis / Master of Science (MSc)
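For reference, the matrix variate normal log-density used in such mixtures can be written down directly; the sketch below (with hypothetical 3 planes x 7 days dimensions) checks a hand-rolled version against scipy's matrix_normal.

```python
import numpy as np
from scipy.stats import matrix_normal

def matrix_normal_logpdf(X, M, U, V):
    """Log-density of a matrix variate normal with mean M (p x q),
    row covariance U (p x p), and column covariance V (q x q)."""
    p, q = X.shape
    R = X - M
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))  # tr(V^-1 R' U^-1 R)
    _, logdet_u = np.linalg.slogdet(U)
    _, logdet_v = np.linalg.slogdet(V)
    return -0.5 * (p * q * np.log(2 * np.pi) + q * logdet_u + p * logdet_v + quad)

# Hypothetical accelerometer-style dimensions: 3 planes of motion x 7 days
X = np.random.default_rng(1).normal(size=(3, 7))
M, U, V = np.zeros((3, 7)), np.eye(3), np.eye(7)
print(matrix_normal_logpdf(X, M, U, V))
print(matrix_normal.logpdf(X, mean=M, rowcov=U, colcov=V))  # should agree
```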
|
14 |
Analysis of Three-Way Data and Other Topics in Clustering and Classification
Gallaugher, Michael Patrick Brian, January 2020 (has links)
Clustering and classification aim to find underlying group structure in heterogeneous data. With the rise of the “big data” phenomenon, more complex data structures mean that traditional clustering methods are often not advisable or feasible. This thesis presents methodology for analyzing three examples of these more complex data types. The first is three-way (matrix variate) data, i.e., data that come in the form of matrices; a large emphasis is placed on clustering skewed three-way data and high-dimensional three-way data. The second is clickstream data, which captures a user's internet search patterns. Finally, co-clustering methodology is discussed for very high-dimensional two-way (multivariate) data. Parameter estimation for all of these methods is based on the expectation-maximization (EM) algorithm. Both simulated and real data are used for illustration. / Thesis / Doctor of Philosophy (PhD)
|
15 |
Outlier Detection in Gaussian Mixture Models
Clark, Katharine, January 2020 (has links)
Unsupervised classification is a problem often plagued by outliers, yet there is a paucity of work on handling outliers in unsupervised classification. Mixtures of Gaussian distributions are a popular choice in model-based clustering. A single outlier can affect parameter estimation and, as such, must be accounted for. This issue is further complicated by the presence of multiple outliers. Correctly predicting the proportion of outliers is paramount, as it minimizes misclassification error. It is proved that, for a finite Gaussian mixture model, the log-likelihoods of the subset models are distributed according to a mixture of beta-type distributions. This relationship is leveraged in two ways. First, an algorithm is proposed that predicts the proportion of outliers by measuring the adherence of a set of subset log-likelihoods to a beta-type mixture reference distribution. This algorithm removes the least likely points, which are deemed outliers, until model assumptions are met. Second, a hypothesis test is developed which, at a chosen significance level, can test whether a dataset contains a single outlier. / Thesis / Master of Science (MSc)
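A crude sketch of the trimming idea (remove the point the fitted mixture explains worst, then refit) is given below using scikit-learn's GaussianMixture; it does not implement the beta-type reference distribution or the hypothesis test described in the thesis, and the stopping rule is left to the user.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def trim_lowest_density_points(X, n_components, n_remove):
    """Iteratively remove the observation with the lowest fitted mixture
    log-density and refit. Only a stand-in for the thesis's likelihood-based
    procedure; the number of points to remove is supplied by the user."""
    X = np.asarray(X, dtype=float)
    keep = np.arange(len(X))
    removed = []
    for _ in range(n_remove):
        gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X[keep])
        log_dens = gmm.score_samples(X[keep])     # per-point log-density under the current fit
        worst = keep[np.argmin(log_dens)]         # candidate outlier
        removed.append(int(worst))
        keep = keep[keep != worst]
    return keep, removed
```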
|
16 |
Random Forest Analogues for Mixture Discriminant Analysis
Mallo, Muz, 09 June 2022 (has links)
Finite mixture modelling is a powerful and well-developed paradigm, having proven useful in unsupervised learning and, to a lesser extent, supervised learning (mixture discriminant analysis), especially for data with local variation and/or latent variables. The aim of this thesis is to improve upon mixture discriminant analysis by introducing two types of random forest analogues, called MixForests. The first MixForest is based on Gaussian mixture models from the famous family of Gaussian parsimonious clustering models and is useful for classifying lower-dimensional data. The second MixForest extends the technique to higher-dimensional data via mixtures of factor analyzers from the well-known family of parsimonious Gaussian mixture models. MixForests are utilized in the analysis of real data to demonstrate potential increases in classification accuracy as well as inferential procedures such as generalization error estimation and variable importance measures. / Thesis / Doctor of Philosophy (PhD)
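The thesis's MixForest algorithm is not reproduced here, but the general flavour of a "random forest analogue" of mixture discriminant analysis can be sketched as a bagged ensemble of per-class Gaussian mixtures fit on random feature subsets. Everything below (class name, ensemble size, voting rule) is an illustrative approximation of that idea, not the method from the thesis.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

class ToyMixtureForest:
    """Bagged ensemble of simple mixture discriminant analyzers: each member
    fits one Gaussian mixture per class on a bootstrap sample restricted to a
    random feature subset, and members vote by maximum class log-likelihood.
    Assumes every class appears in each bootstrap sample."""

    def __init__(self, n_estimators=25, n_components=2, random_state=0):
        self.n_estimators = n_estimators
        self.n_components = n_components
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        n, p = X.shape
        m = max(1, int(np.sqrt(p)))                      # random-forest-style feature subsampling
        self.members_ = []
        for _ in range(self.n_estimators):
            rows = self.rng.integers(0, n, n)            # bootstrap sample
            cols = self.rng.choice(p, m, replace=False)  # random feature subset
            fits = {}
            for c in self.classes_:
                Xc = X[rows][y[rows] == c][:, cols]
                k = min(self.n_components, len(Xc))      # guard small bootstrap classes
                fits[c] = GaussianMixture(k, random_state=0).fit(Xc)
            self.members_.append((cols, fits))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        votes = np.zeros((len(X), len(self.classes_)))
        for cols, fits in self.members_:
            ll = np.column_stack([fits[c].score_samples(X[:, cols]) for c in self.classes_])
            votes[np.arange(len(X)), ll.argmax(axis=1)] += 1   # each member casts one vote per point
        return self.classes_[votes.argmax(axis=1)]
```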
|
17 |
Mixture models for clustering and dimension reduction
Verbeek, Jakob, 08 December 2004 (has links) (PDF)
In Chapter 1 we give a general introduction and motivate the need for clustering and dimension reduction methods. We continue in Chapter 2 with a review of different types of existing clustering and dimension reduction methods.

In Chapter 3 we introduce mixture densities and the expectation-maximization (EM) algorithm to estimate their parameters. Although the EM algorithm has many attractive properties, it is not guaranteed to return optimal parameter estimates. We present greedy EM parameter estimation algorithms which start with a one-component mixture and then iteratively add a component to the mixture and re-estimate the parameters of the current mixture. Experimentally, we demonstrate that our algorithms avoid many of the sub-optimal estimates returned by the EM algorithm. Finally, we present an approach to accelerate mixture density estimation from many data points. We apply this approach to both the standard EM algorithm and our greedy EM algorithm.

In Chapter 4 we present a non-linear dimension reduction method that uses a constrained EM algorithm for parameter estimation. Our approach is similar to Kohonen's self-organizing map, but in contrast to the self-organizing map, our parameter estimation algorithm is guaranteed to converge and optimizes a well-defined objective function. In addition, our method allows data with missing values to be used for parameter estimation, and it is readily applied to data that are not specified by real numbers but, for example, by discrete variables. We present the results of several experiments to demonstrate our method and to compare it with Kohonen's self-organizing map.

In Chapter 5 we consider an approach for non-linear dimension reduction which is based on a combination of clustering and linear dimension reduction. This approach forms one global non-linear low-dimensional data representation by combining multiple, locally valid, linear low-dimensional representations. We derive an improvement of the original parameter estimation algorithm, which requires less computation and leads to better parameter estimates. We experimentally compare this approach to several other dimension reduction methods. We also apply this approach to a setting where high-dimensional 'outputs' have to be predicted from high-dimensional 'inputs'. Experimentally, we show that the considered non-linear approach leads to better predictions than a similar approach which also combines several local linear representations but does not combine them into one global non-linear representation.

In Chapter 6 we summarize our conclusions and discuss directions for further research.
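A compressed sketch of the greedy idea from Chapter 3 (grow the mixture one component at a time, placing the new component where the current fit is poorest) is given below using scikit-learn; the unit-covariance initialization of the new component and the BIC stopping rule are simplifications of mine, not the chapter's algorithm.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def greedy_gmm(X, max_components=10, seed=0):
    """Grow a Gaussian mixture one component at a time. Each new component is
    initialized at the point the current mixture explains worst, then all
    parameters are re-estimated by EM; stop when the BIC no longer improves."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    best = GaussianMixture(1, random_state=seed).fit(X)
    for k in range(2, max_components + 1):
        worst = X[np.argmin(best.score_samples(X))]            # poorly explained point
        weights = np.append(best.weights_ * (1 - 1.0 / k), 1.0 / k)
        means = np.vstack([best.means_, worst])
        precisions = np.concatenate([best.precisions_, np.eye(d)[None]], axis=0)
        cand = GaussianMixture(k, weights_init=weights, means_init=means,
                               precisions_init=precisions, random_state=seed).fit(X)
        if cand.bic(X) >= best.bic(X):                         # no improvement: stop adding
            break
        best = cand
    return best
```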
|
18 |
Semiparametric mixture models
Xiang, Sijia, January 1900 (has links)
Doctor of Philosophy / Department of Statistics / Weixin Yao / This dissertation consists of three parts that are related to semiparametric mixture models.
In Part I, we construct the minimum profile Hellinger distance (MPHD) estimator for a class of semiparametric mixture models where one component has a known distribution with possibly unknown parameters, while the other component density and the mixing proportion are unknown. Such semiparametric mixture models have often been used in biology and in sequential clustering algorithms.
In Part II, we propose a new class of semiparametric mixture of regression models, where the mixing proportions and variances are constants, but the component regression functions are smooth functions of a covariate. A one-step backfitting estimate and two EM-type algorithms have been proposed to achieve the optimal convergence rate for both the global parameters and nonparametric regression functions. We derive the asymptotic property of the proposed estimates and show that both proposed EM-type algorithms preserve the asymptotic ascent property.
In Part III, we apply the idea of single-index models to the mixture of regression models and propose three new classes of models: the mixture of single-index models (MSIM), the mixture of regression models with varying single-index proportions (MRSIP), and the mixture of regression models with varying single-index proportions and variances (MRSIPV). Backfitting estimates and the corresponding algorithms have been proposed for the new models to achieve the optimal convergence rate for both the parameters and the nonparametric functions. We show that the nonparametric functions can be estimated as if the parameters were known and that the parameters can be estimated with the same rate of convergence, n^(-1/2), that is achieved in a parametric model.
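The fully parametric backbone of the Part II model (two linear regression components, constant mixing proportion and variances, no kernel smoothing) can be fit with a standard EM algorithm; the sketch below is that generic algorithm, not the one-step backfitting or single-index procedures developed in the dissertation.

```python
import numpy as np

def em_mixture_of_regressions(x, y, n_iter=200, seed=0):
    """EM for a two-component mixture of linear regressions with constant
    mixing proportion pi and component variances."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    X = np.column_stack([np.ones_like(x), x])        # design matrix with intercept
    beta = rng.normal(size=(2, 2))                   # one (intercept, slope) pair per component
    sigma2 = np.array([y.var(), y.var()])
    pi = 0.5
    for _ in range(n_iter):
        # E-step: responsibilities of component 1 vs component 2
        resid = y[:, None] - X @ beta.T
        dens = np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        joint = dens * np.array([pi, 1 - pi])
        tau = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: weighted least squares per component, then proportion and variances
        for j in range(2):
            w = tau[:, j]
            XtW = X * w[:, None]
            beta[j] = np.linalg.solve(XtW.T @ X, XtW.T @ y)
            sigma2[j] = (w * (y - X @ beta[j])**2).sum() / w.sum()
        pi = tau[:, 0].mean()
    return beta, sigma2, pi, tau

# Toy usage: two regression lines with equal mixing
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 400)
z = rng.random(400) < 0.5
y = np.where(z, 1 + 2 * x, 4 - 3 * x) + rng.normal(0, 0.3, 400)
beta, sigma2, pi, tau = em_mixture_of_regressions(x, y)
```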
|
19 |
Two component semiparametric density mixture models with a known component
Zhou Shen (5930258), 17 January 2019 (has links)
Finite mixture models have been successfully used in many applications, such as classification, clustering, and many others. As opposed to classical parametric mixture models, nonparametric and semiparametric mixture models often provide more flexible approaches to the description of inhomogeneous populations. As an example, in the last decade a particular two-component semiparametric density mixture model with a known component has attracted substantial research interest. Our thesis provides an innovative way of estimation for this model based on minimization of a smoothed objective functional, conceptually similar to the log-likelihood. The minimization is performed with the help of an EM-like algorithm. We show that the algorithm is convergent and that the minimizers of the objective functional, viewed as estimators of the model parameters, are consistent.

More specifically, in our thesis, a semiparametric mixture of two density functions is considered where one of them is known while the weight and the other function are unknown. For the first part, a new sufficient identifiability condition for this model is derived, and a specific class of distributions describing the unknown component is given for which this condition is mostly satisfied. A novel approach to estimation of this model is derived. That approach is based on the idea of using a smoothed likelihood-like functional as an objective functional in order to avoid ill-posedness of the original problem. Minimization of this functional is performed using an iterative Majorization-Minimization (MM) algorithm that estimates all of the unknown parts of the model. The algorithm possesses a descent property with respect to the objective functional. Moreover, we show that the algorithm converges even when the unknown density is not defined on a compact interval. Later, we also study properties of the minimizers of this functional viewed as estimators of the mixture model parameters. Their convergence to the true solution with respect to a bandwidth parameter is justified by reconsidering the problem in the framework of a Tikhonov-type functional. They also turn out to be large-sample consistent; this is justified using an empirical minimization approach. The third part of the thesis contains a series of simulation studies, a comparison with another method, and a real data example. All of them show the good performance of the proposed algorithm in recovering unknown components from data.
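For orientation, a generic EM-like scheme for the model f(x) = (1 - w) f0(x) + w g(x), with f0 known (taken here to be the standard normal) and g estimated by a weighted kernel density, is sketched below. This is not the smoothed-likelihood MM algorithm of the thesis; the choice of f0, the initial values, and the bandwidth rule are all illustrative.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

def two_component_semiparametric(x, n_iter=50, bw_method=None):
    """Alternate posterior-weight updates with a weighted KDE for the unknown
    component g in f(x) = (1 - w) * phi(x) + w * g(x), phi standard normal."""
    x = np.asarray(x, dtype=float)
    w = 0.5
    g = np.full_like(x, 1.0 / (x.max() - x.min()))   # flat initial guess for g at the data points
    kde = None
    for _ in range(n_iter):
        # E-step-like update: posterior probability of the unknown component
        tau = w * g / (w * g + (1 - w) * norm.pdf(x))
        # M-step-like update: mixing weight, then a tau-weighted kernel density estimate of g
        w = tau.mean()
        kde = gaussian_kde(x, weights=tau, bw_method=bw_method)
        g = kde(x)
    return w, kde
```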
|
20 |
On Convergence Properties of the EM Algorithm for Gaussian Mixtures
Jordan, Michael; Xu, Lei, 21 April 1995 (has links)
"Expectation-Maximization'' (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix $P$, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of $P$ and provide new results analyzing the effect that $P$ has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of Gaussian mixture models.
|