1 |
On the Dirichlet Prior and Bayesian Regularization
Steck, Harald; Jaakkola, Tommi S. (01 September 2002)
A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a Dirichlet prior over the model parameters affects the learned model structure in a domain with discrete variables. Surprisingly, a weak prior in the sense of smaller equivalent sample size leads to a strong regularization of the model structure (sparse graph) given a sufficiently large data set. In particular, the empty graph is obtained in the limit of a vanishing strength of prior belief. This is diametrically opposite to what one may expect in this limit, namely the complete graph from an (unregularized) maximum likelihood estimate. Since the prior affects the parameters as expected, the prior strength balances a "trade-off" between regularizing the parameters or the structure of the model. We demonstrate the benefits of optimizing this trade-off in the sense of predictive accuracy.
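The effect described in this abstract can be reproduced on a toy example. The sketch below is only an illustration, not the authors' code: it assumes a BDeu-style score (a Dirichlet prior whose equivalent sample size `ess` is spread uniformly over all cells) and uses made-up counts for two strongly dependent binary variables X and Y. The function names and the data are hypothetical.

```python
from math import lgamma

def family_log_score(counts, ess):
    """Log marginal likelihood of one node under a BDeu-style Dirichlet prior.

    counts[j][k] = number of cases with parent configuration j and child state k.
    ess is the equivalent sample size, spread uniformly over all cells.
    """
    q = len(counts)          # number of parent configurations
    r = len(counts[0])       # number of child states
    a_j = ess / q            # prior mass per parent configuration
    a_jk = ess / (q * r)     # prior mass per cell
    score = 0.0
    for row in counts:
        score += lgamma(a_j) - lgamma(a_j + sum(row))
        for n in row:
            score += lgamma(a_jk + n) - lgamma(a_jk)
    return score

# Hypothetical counts: rows are states of X, columns are states of Y.
xy_counts = [[40, 10], [10, 40]]
# Y without parents: a single "parent configuration" holding the column totals.
y_marginal = [[sum(col) for col in zip(*xy_counts)]]

def log_bayes_factor_edge(ess):
    """log P(D | X -> Y) - log P(D | empty graph); the score of X cancels."""
    return family_log_score(xy_counts, ess) - family_log_score(y_marginal, ess)

for ess in (1.0, 1e-4, 1e-12):
    print(f"ess={ess:g}  log BF (edge vs. empty) = {log_bayes_factor_edge(ess):.2f}")
```

With these counts the edge is preferred for a moderate equivalent sample size, while the log Bayes factor turns negative as `ess` shrinks towards zero, so the empty graph wins despite the clear dependence in the data.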
|
2 |
Model-based clustering based on sparse finite Gaussian mixtures
Malsiner-Walli, Gertraud; Frühwirth-Schnatter, Sylvia; Grün, Bettina (January 2016)
In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model the sparse prior on the weights empties superfluous components during MCMC. A straightforward estimator for the true number of components is given by the most frequent number of non-empty components visited during MCMC sampling. Specifying a shrinkage prior, namely the normal gamma prior, on the component means leads to improved parameter estimates as well as identification of cluster-relevant variables. After estimating the mixture model using MCMC methods based on data augmentation and Gibbs sampling, an identified model is obtained by relabeling the MCMC output in the point process representation of the draws. This is performed using K-centroids cluster analysis based on the Mahalanobis distance. We evaluate our proposed strategy in a simulation setup with artificial data and by applying it to benchmark data sets. (authors' abstract)
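The estimator for the number of components mentioned in this abstract (the most frequent number of non-empty components visited during sampling) is simple to compute once the allocation draws are available. The following is a minimal sketch under the assumption that each MCMC iteration is stored as a sequence of component labels, one per observation; the function name and the example draws are hypothetical and do not come from the paper.

```python
from collections import Counter

def estimate_num_components(allocation_draws):
    """Return the most frequent number of non-empty components across draws.

    allocation_draws: iterable of sequences; each sequence holds the component
    label assigned to every observation in one MCMC iteration.
    """
    nonempty_counts = Counter(len(set(draw)) for draw in allocation_draws)
    return nonempty_counts.most_common(1)[0][0]

# Hypothetical allocations from five iterations of a sampler with K = 4 components.
draws = [
    [0, 0, 1, 1, 2, 2],   # 3 non-empty components
    [0, 0, 1, 1, 1, 1],   # 2
    [3, 3, 1, 1, 1, 1],   # 2
    [0, 0, 2, 2, 2, 2],   # 2
    [0, 1, 2, 2, 3, 3],   # 4
]
print(estimate_num_components(draws))  # prints 2
```

Counting distinct labels per draw is unaffected by label switching, which is why this estimate can be formed before the relabeling step.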
|
3 |
From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering
Frühwirth-Schnatter, Sylvia; Malsiner-Walli, Gertraud (January 2019)
In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner-Walli et al. (Stat Comput 26:303-324, 2016) is that of sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixtures is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration.
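How the prior on the weights makes the number of clusters random and typically smaller than K can be checked by simulation. The sketch below assumes a symmetric Dirichlet prior with parameter `e0` on the K weights; integrating the weights out gives a Polya urn scheme for the allocations, which is what the code samples. The parameter name `e0`, the sample sizes, and the function names are illustrative assumptions, not taken from the paper.

```python
import random

def sample_num_clusters(n_obs, n_components, e0, rng):
    """Draw the number of non-empty components a priori.

    Weights follow a symmetric Dirichlet(e0, ..., e0); with the weights
    integrated out, the next observation joins component k with probability
    proportional to e0 + (current size of component k).
    """
    sizes = [0] * n_components
    for _ in range(n_obs):
        weights = [e0 + s for s in sizes]
        k = rng.choices(range(n_components), weights=weights)[0]
        sizes[k] += 1
    return sum(1 for s in sizes if s > 0)

def mean_num_clusters(e0, n_obs=100, n_components=10, n_sims=2000, seed=0):
    """Monte Carlo estimate of the prior mean number of non-empty components."""
    rng = random.Random(seed)
    draws = [sample_num_clusters(n_obs, n_components, e0, rng) for _ in range(n_sims)]
    return sum(draws) / n_sims

for e0 in (0.01, 1.0, 4.0):
    print(f"e0={e0}: prior mean number of clusters = {mean_num_clusters(e0):.2f}")
```

A small `e0` concentrates the prior on few non-empty components, whereas a large `e0` fills nearly all K components, which is the sense in which the weight prior induces sparsity.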
|
4 |
Identifying mixtures of mixtures using Bayesian estimation
Malsiner-Walli, Gertraud; Frühwirth-Schnatter, Sylvia; Grün, Bettina (January 2017)
The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the cluster structure aimed at. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets.
|
5 |
Verallgemeinerte Maximum-Likelihood-Methoden und der selbstinformative Grenzwert [Generalised maximum likelihood methods and the selfinformative limit]
Johannes, Jan (16 December 2002)
We assume that we observe a random variable X with unknown probability distribution P. One major goal of mathematical statistics is the estimation of a derived parameter theta(P) based on an observation X=x. Under the assumption that P belongs to a dominated family of probability distributions, we can apply the maximum likelihood principle (MLP). Alternatively, the Bayes approach can be used to estimate the parameter. Under some regularity conditions it turns out that the maximum likelihood estimate (MLE) is the limit of a sequence of Bayes estimates (BEs). Note that BEs can even be defined in situations where no dominating measure exists. This allows us to derive an extension of the MLP using the Bayes approach. Moreover, two versions of a generalised MLE (gMLE) are presented, which were introduced by Kiefer and Wolfowitz and by Gill, respectively. Based on these known results, we define a selfinformative limit and a selfinformative posterior carrier. In the special case of a model with a dominated distribution family, we state sufficient conditions under which the set of MLEs is a selfinformative posterior carrier or, in the case of a unique MLE, a selfinformative limit. The result for the selfinformative posterior carrier is then extended to a more general model without dominated distributions. In particular we show that the set of gMLEs of Kiefer and Wolfowitz is a selfinformative posterior carrier.
Furthermore, we determine the selfinformative limit and the selfinformative posterior carrier, respectively, in a model with a nonidentifiable parameter. The focus of this thesis is a multivariate semiparametric linear model. First, however, we show that in a purely nonparametric model with a Dirichlet process as prior, the selfinformative limit exists and coincides with the gMLE of Kiefer and Wolfowitz as well as with that of Gill. We then investigate the multivariate semiparametric linear model and determine both versions of the gMLE and the selfinformative limit, where the prior for the latter is given by a Dirichlet process and a normal-Wishart distribution. In general the resulting estimates differ. Finally, we consider the special case of a semiparametric location model, in which the three estimates coincide again.
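The general idea that a maximum-likelihood-type estimate can arise as a limit of Bayes estimates can be illustrated with the standard conjugacy result for the Dirichlet process. This is a textbook identity, offered only as background; it is not the thesis's definition of the selfinformative limit. The symbols $\alpha$ (total prior mass), $P_0$ (base measure), and $\hat P_n$ (empirical distribution) are notation assumed here.

```latex
P \sim \mathrm{DP}(\alpha, P_0), \qquad
X_1, \dots, X_n \mid P \overset{\text{iid}}{\sim} P
\quad\Longrightarrow\quad
\mathbb{E}\left[ P \mid x_1, \dots, x_n \right]
  = \frac{\alpha}{\alpha + n}\, P_0 + \frac{n}{\alpha + n}\, \hat P_n,
\qquad
\hat P_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}.
```

As $\alpha \to 0$ the posterior mean tends to $\hat P_n$, the empirical distribution, which is the nonparametric maximum likelihood estimate in the sense of Kiefer and Wolfowitz.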
|
6 |
Inference for Gamma Frailty Models based on One-shot Device Data
Yu, Chenxi (January 2024)
A device that is accompanied by an irreversible chemical reaction or physical destruction and could no longer function properly after performing its intended function is referred to as a one-shot device. One-shot device test data differ from typical data obtained by measuring lifetimes in standard life-tests. Due to the very nature of one-shot devices, actual lifetimes of one-shot devices under test cannot be observed, and they are either left- or right-censored. In addition, a one-shot device often has multiple components that could cause the failure of the device. The components are coupled together in the manufacturing process or assembly, resulting in the failure modes possessing latent heterogeneity and dependence. Frailty models enable us to describe the influence of common, but unobservable covariates, on the hazard function as a random effect in a model and also provide an easily understandable interpretation.
In this thesis, we develop some inferential results for one-shot device testing data with a gamma frailty model. We first develop an efficient expectation-maximization (EM) algorithm for determining the maximum likelihood estimates of the model parameters of a gamma frailty model with exponential lifetime distributions for components based on one-shot device test data with multiple failure modes, wherein the data are obtained from a constant-stress accelerated life-test. The maximum likelihood estimate of the mean lifetime of $k$-out-of-$M$ structured one-shot devices under normal operating conditions is also presented. In addition, the asymptotic variance–covariance matrix of the maximum likelihood estimates is derived, which is then used to construct asymptotic confidence intervals for the model parameters. The performance of the proposed inferential methods is evaluated through Monte Carlo simulations and then illustrated with a numerical example. A gamma frailty model with Weibull baseline hazards is considered next for fitting one-shot device testing data. The Weibull baseline hazards enable us to analyze time-varying failure rates more accurately, allowing for a deeper understanding of the dynamic nature of the system's reliability. We develop an EM algorithm for estimating the model parameters utilizing the complete likelihood function. A detailed simulation study compares the performance of the Weibull baseline hazard model with that of the exponential baseline hazard model. The introduction of shape parameters in the component's lifetime distribution within the Weibull baseline hazard model offers enhanced flexibility in model fitting. Finally, Bayesian inference is developed for the gamma frailty model with exponential baseline hazard for one-shot device testing data.
We introduce a Bayesian estimation procedure using the Markov chain Monte Carlo (MCMC) technique for estimating the model parameters as well as for developing credible intervals for those parameters. The performance of the proposed method is evaluated in a simulation study. Model comparison between the independence model and the frailty model is made using a Bayesian model selection criterion. / Thesis / Candidate in Philosophy
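The censoring structure of one-shot device data described in this abstract can be made concrete with a deliberately simplified sketch. It assumes a single failure mode, an exponential lifetime with a common rate, one inspection time, and no frailty term or stress covariates, so it is far simpler than the models developed in the thesis. All function names and numbers are hypothetical.

```python
from math import exp, log

def one_shot_log_likelihood(rate, inspection_time, n_failed, n_survived):
    """Log-likelihood of one-shot data under an exponential lifetime.

    A failed device is left-censored (lifetime <= inspection_time); a working
    device is right-censored (lifetime > inspection_time).
    """
    p_fail = 1.0 - exp(-rate * inspection_time)
    return n_failed * log(p_fail) + n_survived * (-rate * inspection_time)

def one_shot_mle_rate(inspection_time, n_failed, n_survived):
    """Closed-form MLE of the rate for a single inspection time."""
    n = n_failed + n_survived
    if not 0 < n_failed < n:
        raise ValueError("a finite positive MLE needs some, but not all, devices to fail")
    # Match the model failure probability to the observed failure fraction.
    return -log(1.0 - n_failed / n) / inspection_time

# Hypothetical test: 100 devices inspected at time 10, of which 30 had failed.
rate_hat = one_shot_mle_rate(10.0, 30, 70)
print(f"estimated rate = {rate_hat:.5f}, mean lifetime = {1.0 / rate_hat:.2f}")
```

With several inspection times, stress levels, or failure modes no closed form is available, which is where iterative schemes such as the EM algorithm mentioned in the abstract come in.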
|