31 |
Analysis of Four and Five-Way Data and Other Topics in Clustering
Tait, Peter A. January 2021
Clustering is the process of finding underlying group structure in data. As the scale of
data collection continues to grow, this “big data” phenomenon results in more complex data structures. These data structures are not always compatible with traditional
clustering methods, making their use problematic. This thesis presents methodology
for analyzing samples of four-way and higher data, examples of these more complex
data types. These data structures consist of samples of continuous data arranged in
multidimensional arrays. A large emphasis is placed on clustering this data using
mixture models that leverage tensor-variate distributions to model the data. Parameter estimation for all these methods is based on the expectation-maximization
algorithm. Both simulated and real data are used for illustration. / Thesis / Doctor of Science (PhD)
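As a rough illustration of the expectation-maximization idea mentioned in the abstract above, the following sketch fits a two-component Gaussian mixture to scalar data. It is a deliberate simplification: the thesis works with tensor-variate distributions, whereas the function `em_two_gaussians`, the toy `DATA`, and the starting means here are assumptions chosen only to show the E-step and M-step.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, means, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM from given starting means."""
    weights = [0.5, 0.5]
    variances = [1.0, 1.0]
    means = list(means)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted updates of weights, means, variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a degenerate component
    # Hard cluster labels from the final responsibilities
    labels = [max(range(2), key=lambda k: r[k]) for r in resp]
    return weights, means, variances, labels


# Toy data (illustrative only): two well-separated groups
DATA = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.0, 2.1, 1.9, 2.2, 1.8]
WEIGHTS, MEANS, VARIANCES, LABELS = em_two_gaussians(DATA, means=[-1.0, 1.0])
```

The same alternation between responsibilities and weighted parameter updates carries over to richer component distributions; only the density and the M-step formulas change.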
|
32 |
Out-of-distribution Recognition and Classification of Time-Series Pulsed Radar Signals / Out-of-distribution Igenkänning och Klassificering av Pulserade Radar Signaler
Hedvall, Paul January 2022
This thesis investigates out-of-distribution recognition for time-series data of pulsed radar signals. The classifier is a naive Bayesian classifier based on Gaussian mixture models and Dirichlet process mixture models. In the mixture models, we model the distribution of three pulse features in the time series, namely radio-frequency in the pulse, duration of the pulse, and pulse repetition interval, which is the time between pulses. We found that simple thresholds on the likelihood can effectively determine if samples are out-of-distribution or belong to one of the classes trained on. In addition, we present a simple method that can be used for deinterleaving/pulse classification and show that it can robustly classify 100 interleaved signals and simultaneously determine if pulses are out-of-distribution. / This thesis investigates how a machine-learning model can be adapted to recognize when pulsed radar signals do not belong to the distribution the model was trained on, while also recognizing whether a signal belongs to a previously known class. The classification model used here is a naive Bayesian classifier that uses Gaussian mixture models and Dirichlet process mixture models. The model builds a distribution of the time-series data for pulsed radar signals, specifically for the frequency of each pulse, the pulse duration, and the time to the next pulse. By setting thresholds on the likelihood of each pulse, or on the likelihood of a sequence, we can recognize whether the data are unknown or belong to a previously known class. We also present a simple method for classifying individual pulses when several signals overlap, and show that the method can be used to robustly determine whether pulses are unknown.
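The likelihood-threshold idea described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: it uses a single scalar pulse feature, and the class names, mixture parameters in `CLASS_MODELS`, and the value of `LOG_LIK_THRESHOLD` are all hypothetical.

```python
import math


def gmm_log_likelihood(x, weights, means, stds):
    """Log density of scalar x under a 1-D Gaussian mixture."""
    terms = []
    for w, m, s in zip(weights, means, stds):
        log_pdf = -0.5 * math.log(2 * math.pi * s * s) - 0.5 * ((x - m) / s) ** 2
        terms.append(math.log(w) + log_pdf)
    # log-sum-exp keeps the computation stable far out in the tails
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_with_rejection(x, class_models, threshold):
    """Return the most likely class label, or None if x looks out-of-distribution."""
    scores = {label: gmm_log_likelihood(x, *params)
              for label, params in class_models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Hypothetical per-class mixtures over one pulse feature
# (weights, means, standard deviations); values are illustrative only.
CLASS_MODELS = {
    "emitter_a": ([0.5, 0.5], [1.0, 2.0], [0.1, 0.1]),
    "emitter_b": ([1.0], [5.0], [0.2]),
}
LOG_LIK_THRESHOLD = -10.0  # illustrative cut-off, not taken from the thesis
```

A sample is assigned to the class with the highest log-likelihood, and rejected as out-of-distribution when even that best score falls below the threshold. In practice the threshold would be tuned on held-out data.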
|
33 |
Longitudinal Clustering via Mixtures of Multivariate Power Exponential Distributions
Patel, Nidhi January 2016
A mixture model approach for clustering longitudinal data is introduced. The approach, which is based on mixtures of multivariate power exponential distributions, allows for varying tail-weight and peakedness in data. In the longitudinal setting, this corresponds to more or less concentration around the most central time course in a component. The models utilize a modified Cholesky decomposition of the component scale matrices and the associated maximum likelihood estimators are derived via a generalized expectation-maximization algorithm. / Thesis / Master of Science (MSc)
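For readers unfamiliar with the modified Cholesky decomposition mentioned in the abstract above, its standard form is sketched below. The notation ($\Sigma_g$, $T_g$, $D_g$, $\phi$) is an assumption for illustration and may differ from the thesis's own symbols.

```latex
% Modified Cholesky decomposition of the scale matrix of component g:
% T_g is unit lower triangular, D_g is diagonal with positive entries.
\[
  T_g \,\Sigma_g\, T_g^{\top} = D_g
  \quad\Longleftrightarrow\quad
  \Sigma_g^{-1} = T_g^{\top} D_g^{-1} T_g ,
\]
\[
  T_g =
  \begin{pmatrix}
    1 & 0 & \cdots & 0 \\
    -\phi^{(g)}_{21} & 1 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    -\phi^{(g)}_{p1} & -\phi^{(g)}_{p2} & \cdots & 1
  \end{pmatrix},
  \qquad
  D_g = \operatorname{diag}\!\bigl(d^{(g)}_{1}, \ldots, d^{(g)}_{p}\bigr).
\]
```

The sub-diagonal entries of $T_g$ can be read as generalized autoregressive coefficients linking each time point to earlier ones, and the entries of $D_g$ as the corresponding innovation variances, which is why this parameterization suits longitudinal data.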
|
34 |
On Fractionally-Supervised Classification: Weight Selection and Extension to the Multivariate t-Distribution
Gallaugher, Michael P.B. January 2017
Recent work on fractionally-supervised classification (FSC), an approach that allows classification to be carried out with a fractional amount of weight given to the unlabelled points, is extended in two important ways. First, and of fundamental importance, the question of how to choose the amount of weight given to the unlabelled points is addressed. Then, the FSC approach is extended to mixtures of multivariate t-distributions. The first extension is essential because it makes FSC more readily applicable to real problems. The second, although less fundamental, demonstrates the efficacy of FSC beyond Gaussian mixture models. / Thesis / Master of Science (MSc)
|
35 |
Analysis of Otolith Microchemistry Using Bayesian Hierarchical Mixture Models
Pflugeisen, Bethann Mangel 01 September 2010
No description available.
|
36 |
Topics in One-Way Supervised Biclustering Using Gaussian Mixture Models
Wong, Monica January 2017
Cluster analysis identifies homogeneous groups that are relevant within a population. In model-based clustering, group membership is estimated using a parametric finite mixture model, commonly the mathematically tractable Gaussian mixture model. One-way clustering methods can be restrictive in cases where there are suspected relationships between the variables in each component, leading to the idea of biclustering, which refers to clustering both observations and variables simultaneously. When the relationships between the variables are known, biclustering becomes one-way supervised. To this end, this thesis focuses on a novel one-way supervised biclustering family based on the Gaussian mixture model. In cases where biclustering may be overestimating the number of components in the data, a model averaging technique utilizing Occam's window is applied to produce better clustering results. Automatic outlier detection is introduced into the biclustering family using mixtures of contaminated Gaussian mixture models. Algorithms for model-fitting and parameter estimation are presented for the techniques described in this thesis, and simulation and real data studies are used to assess their performance. / Thesis / Doctor of Philosophy (PhD)
|
37 |
Dimensionality Reduction with Non-Gaussian Mixtures
Tang, Yang 11 1900
Broadly speaking, cluster analysis is the organization of a data set into meaningful groups, and mixture model-based clustering has recently received wide interest in statistics. Historically, the Gaussian mixture model has dominated the model-based clustering literature. When model-based clustering is performed on a large number of observed variables, it is well known that Gaussian mixture models can represent an over-parameterized solution. To this end, this thesis focuses on the development of novel non-Gaussian mixture models for high-dimensional continuous and categorical data. We developed a mixture of joint generalized hyperbolic models (JGHM), which exhibits different marginal amounts of tail-weight. Moreover, it takes into account the cluster-specific subspace and, therefore, limits the number of parameters to estimate. This is a novel approach, which is applicable to high, and potentially very high, dimensional spaces with arbitrary correlation between dimensions. Three different mixture models are developed using forms of the mixture of latent trait models to realize model-based clustering of high-dimensional binary data. A family of mixtures of latent trait models with common slope parameters is developed to reduce the number of parameters to be estimated. This approach facilitates a low-dimensional visual representation of the clusters. We further developed penalized latent trait models to handle ultra-high-dimensional binary data; these models also perform automatic variable selection. For all models and families of models developed in this thesis, the algorithms used for model-fitting and parameter estimation are presented. Real and simulated data sets are used to assess the clustering ability of the models. / Thesis / Doctor of Philosophy (PhD)
|
38 |
Discriminant Analysis for Longitudinal Data
Matira, Kevin January 2017
Various approaches for discriminant analysis of longitudinal data are investigated,
with some focus on model-based approaches. The latter are typically based on the
modified Cholesky decomposition of the covariance matrix in a Gaussian mixture;
however, non-Gaussian mixtures are also considered. Where applicable, the Bayesian
information criterion is used to select the number of components per class. The
various approaches are demonstrated on real and simulated data. / Thesis / Master of Science (MSc)
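For reference, the Bayesian information criterion mentioned in the abstract above is commonly written as below. The symbols ($G$, $\rho_G$, $n$, $\hat{\theta}_G$) are assumptions for illustration, and the abstract does not state which sign convention the thesis uses.

```latex
% BIC for a mixture with G components fitted to n observations.
% \ell(\hat{\theta}_G): maximized log-likelihood; \rho_G: number of free parameters.
\[
  \mathrm{BIC}(G) = -2\,\ell\bigl(\hat{\theta}_G\bigr) + \rho_G \log n ,
  \qquad
  \hat{G} = \arg\min_{G}\ \mathrm{BIC}(G).
\]
```

Some of the model-based clustering literature uses the negated form, $2\,\ell(\hat{\theta}_G) - \rho_G \log n$, in which case the number of components with the largest value is selected instead.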
|
39 |
Robust Bayesian Anomaly Detection Methods for Large Scale Sensor Systems
Merkes, Sierra Nicole 12 September 2022
Sensor systems, such as modern wind tunnels, require continual monitoring to validate their quality, as corrupted data will increase both experimental downtime and budget and lead to inconclusive scientific and engineering results. One approach to validating sensor quality is to monitor the distribution of individual sensor measurements. In general settings, however, we do not know how correct measurements should be distributed for each sensor system. Instead of monitoring sensors individually, our approach relies on monitoring the co-variation of the entire network of sensor measurements, both within and across sensor systems. That is, by monitoring how sensors behave relative to each other, we can detect anomalies expeditiously. Previous monitoring methodologies, such as those based on Principal Component Analysis, can be heavily influenced by extremely outlying sensor anomalies. We propose two Bayesian mixture model approaches that utilize heavy-tailed Cauchy assumptions. First, we propose a Robust Bayesian Regression, which utilizes a scale-mixture model to induce a Cauchy regression. Second, we extend elements of the Robust Bayesian Regression methodology using additive mixture models that decompose the anomalous and non-anomalous sensor readings into two parametric compartments. Specifically, we use a non-local, heavy-tailed Cauchy component for isolating the anomalous sensor readings, which we refer to as the Modified Cauchy Net. / Doctor of Philosophy / Sensor systems, such as modern wind tunnels, require continual monitoring to validate their quality, as corrupted data will increase both experimental downtime and budget and lead to inconclusive scientific and engineering results. One approach to validating sensor quality is to monitor the distribution of individual sensor measurements. In general settings, however, we do not know how correct measurements should be distributed for each sensor system.
Instead of monitoring sensors individually, our approach relies on monitoring the co-variation of the entire network of sensor measurements, both within and across sensor systems. That is, by monitoring how sensors behave relative to each other, we can detect anomalies expeditiously. We propose two Bayesian monitoring approaches, called the Robust Bayesian Regression and the Modified Cauchy Net, which provide flexible, tunable models for detecting anomalous sensors even when the historical data contain anomalous observations.
|
40 |
Attrition in Studies of Cognitive Aging / Bortfall i studier av kognitivt åldrande
Josefsson, Maria January 2013
Longitudinal studies of cognition are preferred to cross-sectional studies, since they offer a direct assessment of age-related cognitive change (within-person change). Statistical methods for analyzing age-related change are widely available. There are, however, a number of challenges accompanying such analyses, including cohort differences, ceiling and floor effects, and attrition. These difficulties challenge the analyst and put stringent requirements on the statistical method being used. The objective of Paper I is to develop a classification method to study discrepancies in age-related cognitive change. The method needs to take into account the complex issues accompanying studies of cognitive aging, and specifically to address issues related to attrition. In a second step, we aim to identify predictors explaining stability or decline in cognitive performance in relation to demographic, life-style, health-related, and genetic factors. In Paper II, which is a continuation of Paper I, we investigate brain characteristics, structural and functional, that differ between successfully aging elderly and elderly with an average cognitive performance over 15-20 years. In Paper III we develop a Bayesian model to estimate the causal effect of living arrangement (living alone versus living with someone) on cognitive decline. The model must balance confounding variables between the two living arrangement groups as well as account for non-ignorable attrition. This is achieved by combining propensity score matching with a pattern mixture model for longitudinal data. In Paper IV, the objective is to adapt and implement available imputation methods for longitudinal fMRI data, where some subjects are lost to follow-up. We apply these missing data methods to a real dataset, and evaluate them in a simulation study.
|