1 |
Modeling Distributions of Test Scores with Mixtures of Beta Distributions
Feng, Jingyu, 08 November 2005 (has links) (PDF)
Test score distributions are used to make important instructional decisions about students. The test scores usually do not follow a normal distribution. In some cases, the scores appear to follow a bimodal distribution that can be modeled with a mixture of beta distributions. This bimodality may be due to different levels of students' ability. The purpose of this study was to develop and apply statistical techniques for fitting beta mixtures and detecting bimodality in test score distributions. Maximum likelihood and Bayesian methods were used to estimate the five parameters of the beta mixture distribution for scores on four quizzes in a cell biology class at Brigham Young University. The mixing proportion was examined to draw conclusions about bimodality. We were successful in fitting the beta mixture to the data, but the methods were only partially successful in detecting bimodality.
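A minimal R sketch of the kind of five-parameter fit described above, assuming direct maximization of the mixture likelihood with optim(); the simulated scores and starting values are illustrative, not the quiz data from the thesis.

```r
# Hedged sketch: fit a two-component beta mixture (five parameters) by
# maximum likelihood; reparameterize so the optimizer works unconstrained.
negloglik <- function(par, x) {
  p  <- plogis(par[1])                  # mixing proportion in (0, 1)
  a1 <- exp(par[2]); b1 <- exp(par[3])  # component 1 shapes, > 0
  a2 <- exp(par[4]); b2 <- exp(par[5])  # component 2 shapes, > 0
  -sum(log(p * dbeta(x, a1, b1) + (1 - p) * dbeta(x, a2, b2)))
}
set.seed(1)
x <- c(rbeta(60, 2, 8), rbeta(40, 9, 2))  # simulated bimodal "scores" in (0, 1)
fit <- optim(rep(0, 5), negloglik, x = x, method = "BFGS")
plogis(fit$par[1])                        # estimated mixing proportion
```

An estimated mixing proportion well away from 0 and 1 is the signal of bimodality the study examines.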
|
2 |
Optimal Subsampling of Finite Mixture Distribution
Neupane, Binod Prasad, 05 1900 (has links)
A mixture distribution is a compounding of statistical distributions, which arises when sampling from heterogeneous populations with a different probability density function in each component. A finite mixture has a finite number of components. In the past decade the extent and the potential of the applications of finite mixture models have widened considerably.

The objective of this project is to add some functionality to the package 'mixdist', developed by Du and Macdonald (Du 2002) and Gao (2004) in the R environment (R Development Core Team 2004), for estimating the parameters of a finite mixture distribution from data grouped in bins together with conditional data. Mixed data together with conditional data provide better estimates of the parameters than mixed data alone. Our main objective is to obtain the optimal sample size for each bin of the mixed data from which to obtain conditional data, given approximate values of the parameters and the distributional form of the mixture. We have also replaced the dependence of the function mix on the optimizer nlm with the optimizer optim, so that limits can be placed on the parameters.

Our purpose is to provide easily available tools for modeling fish growth using mixture distributions, although the methods have a number of applications in other areas as well. / Thesis / Master of Science (MSc)
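The nlm-to-optim change can be illustrated in isolation; the toy objective below merely stands in for the grouped-data mixture likelihood that mix() actually minimizes.

```r
# Hedged illustration of the optimizer swap described above: optim() with
# method "L-BFGS-B" accepts box constraints (limits) on the parameters,
# whereas nlm() performs unconstrained minimization only.
f <- function(par) sum((par - c(2, 5))^2)     # toy stand-in objective
nlm(f, p = c(0, 0))$estimate                  # unconstrained: reaches (2, 5)
optim(c(0, 0), f, method = "L-BFGS-B",
      lower = c(0, 0), upper = c(1, 10))$par  # constrained: stops at (1, 5)
```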
|
3 |
Density estimates of monarch butterflies overwintering in central Mexico
Thogmartin, Wayne E., Diffendorfer, Jay E., López-Hoffman, Laura, Oberhauser, Karen, Pleasants, John, Semmens, Brice X., Semmens, Darius, Taylor, Orley R., Wiederholt, Ruscena, 26 April 2017 (has links)
Given the rapid population decline and recent petition for listing of the monarch butterfly (Danaus plexippus L.) under the Endangered Species Act, an accurate estimate of the Eastern, migratory population size is needed. Because of the difficulty in counting individual monarchs, the number of hectares occupied by monarchs in the overwintering area is commonly used as a proxy for population size, which is then multiplied by the density of individuals per hectare to estimate population size. There is, however, considerable variation in published estimates of overwintering density, ranging from 6.9 to 60.9 million ha⁻¹. We develop a probability distribution for overwinter density of monarch butterflies from six published density estimates. The mean density among the mixture of the six published estimates was approximately 27.9 million butterflies ha⁻¹ (95% CI [2.4, 80.7] million ha⁻¹); the mixture distribution is approximately log-normal, and as such is better represented by the median (21.1 million butterflies ha⁻¹). Based upon assumptions regarding the number of milkweed stems needed to support monarchs, the amount of milkweed (Asclepias spp.) lost (0.86 billion stems) in the northern US, plus the amount of milkweed remaining (1.34 billion stems), we estimate >1.8 billion stems are needed to return monarchs to an average population size of 6 ha. Considerable uncertainty exists in this required amount of milkweed because of the considerable uncertainty in overwinter density estimates. Nevertheless, the estimate is on the same order as other published estimates. The studies included in our synthesis differ substantially by year, location, method, and measures of precision. A better understanding of the factors influencing overwintering density across space and time would be valuable for increasing the precision of conservation recommendations.
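A sketch of the pooling idea, assuming each published estimate enters an equally weighted mixture of normal components; the six means and standard errors below are placeholders, not the values used in the paper.

```r
# Hedged sketch: summarise a mixture of study-level density estimates by
# simulation, reporting the pooled mean, median and a 95% interval.
means <- c(6.9, 14, 21, 29, 44, 60.9)  # million butterflies per ha (illustrative)
ses   <- rep(6, 6)                     # illustrative standard errors
draws <- rnorm(6e5, mean = rep(means, each = 1e5),
               sd = rep(ses, each = 1e5))
c(mean(draws), median(draws))          # mean exceeds median for a right-skewed mixture
quantile(draws, c(0.025, 0.975))       # pooled 95% interval
```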
|
4 |
Aplikace zobecněného lineárního modelu na směsi pravděpodobnostních rozdělení / Application of generalized linear model for mixture distributions
Pokorný, Pavel, January 2009 (has links)
This thesis focuses on the use of mixtures of probability distributions in the generalized linear model. The theoretical part is divided into two chapters. In the first chapter, the generalized linear model (GLM) is defined as an alternative to the classical linear regression model. The second chapter describes mixtures of probability distributions and the estimation of their parameters; at its end, the preceding theory is combined into the finite mixture generalized linear model. The third and final part is practical and presents concrete examples of these models.
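As one concrete rendering of such a model, the sketch below fits a two-component mixture of Poisson regressions with the CRAN package flexmix; the package and the simulated data are illustrative assumptions, not tools used in the thesis.

```r
# Hedged sketch of a finite mixture of GLMs (two Poisson regressions).
library(flexmix)  # assumed available from CRAN
set.seed(1)
n <- 400
x <- runif(n)
grp <- rbinom(n, 1, 0.5)                       # latent component label
y <- rpois(n, exp(ifelse(grp == 1, 1 + x, 2.5 - x)))
fit <- flexmix(y ~ x, k = 2, model = FLXMRglm(family = "poisson"))
summary(fit)      # mixing proportions and component sizes
parameters(fit)   # component-wise regression coefficients
```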
|
5 |
Contributions to Distributed Detection and Estimation over Sensor Networks
Whipps, Gene Thomas, January 2017 (has links)
No description available.
|
6 |
Three essays on agricultural and catastrophic risk management
Chen, Shu-Ling, 07 June 2007 (has links)
No description available.
|
7 |
Lois a priori non-informatives et la modélisation par mélange / Non-informative priors and modelization by mixtures
Kamary, Kaniav, 15 March 2016 (links)
One of the major applications of statistics is the validation and comparison of probabilistic models in the light of data. This branch of statistics has developed since its formalization at the end of the 19th century by pioneers like Gosset, Pearson and Fisher. In the special case of the Bayesian approach, the solution to model comparison is the Bayes factor, a ratio of marginal likelihoods, whatever the models under evaluation. This solution is obtained by mathematical reasoning based on a loss function. Despite the frequent use of the Bayes factor, and of its equivalent, the posterior probability of models, by the Bayesian community, it is problematic in some settings. First, this rule is highly dependent on the prior modeling, even with large datasets, and since the selection of a prior density plays a vital role in Bayesian statistics, one of the difficulties with the traditional handling of Bayesian tests is a discontinuity in the use of improper priors, which are not justified in most testing situations. The first part of this thesis provides a general review of non-informative priors and their features, and demonstrates the overall stability of the resulting posterior distributions by reassessing the examples of [Seaman III 2012].

Second, and independently, Bayes factors are difficult to calculate except in the simplest cases (conjugate distributions). A branch of computational statistics has therefore emerged to resolve this problem, with solutions borrowed from statistical physics, such as the path sampling method of [Gelman 1998], and from signal processing. The existing solutions are not, however, universal, and a reassessment of these methods, followed by the development of alternative methods, constitutes another part of the thesis. We therefore consider a novel paradigm for Bayesian testing of hypotheses and Bayesian model comparison. The idea is to define an alternative to the traditional construction of posterior probabilities that a given hypothesis is true, or that the data originate from a specific model, by considering the models under comparison as components of a mixture model. Replacing the original testing problem with an estimation version that focuses on the probability weight of a given model within the mixture, we analyze the sensitivity of the resulting posterior distribution of the weights to various prior modelings of the weights, and stress that a major appeal of this perspective is that generic improper priors are acceptable, while not putting convergence in jeopardy. MCMC methods such as the Metropolis-Hastings algorithm and the Gibbs sampler are used. From a computational viewpoint, another feature of this easily implemented alternative to the classical Bayesian solution is that the speeds of convergence of the posterior mean of the weight and of the corresponding posterior probability are quite similar.

In the last part of the thesis we construct a reference Bayesian analysis of mixtures of Gaussian distributions by creating a new parameterization centered on the mean and variance of those models. This enables us to develop a genuine non-informative prior for Gaussian mixtures with an arbitrary number of components. We demonstrate that the posterior distribution associated with this prior is almost surely proper, and provide MCMC implementations that exhibit the expected component exchangeability. The analyses are based on MCMC methods such as the Metropolis-within-Gibbs algorithm, adaptive MCMC, and the parallel tempering algorithm. This part of the thesis is followed by a description of the R package Ultimixt, which implements a generic reference Bayesian analysis of unidimensional mixtures of Gaussian distributions obtained by a location-scale parameterization of the model. The package can be used to produce a Bayesian analysis of Gaussian mixtures with an arbitrary number of components, with no need to specify the prior distribution.
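A minimal sketch of the mixture-estimation view of testing, assuming two fully specified competing densities and a Beta(0.5, 0.5) prior on the weight; a random-walk Metropolis sampler targets the posterior of the weight.

```r
# Hedged sketch: embed models f1 and f2 in a mixture a*f1 + (1-a)*f2 and
# sample the posterior of the weight a, rather than computing a Bayes factor.
set.seed(1)
x <- rnorm(100, mean = 0.3)
f1 <- function(x) dnorm(x, 0)    # model 1: N(0, 1)
f2 <- function(x) dnorm(x, 0.5)  # model 2: N(0.5, 1)
logpost <- function(a) {
  if (a <= 0 || a >= 1) return(-Inf)
  sum(log(a * f1(x) + (1 - a) * f2(x))) + dbeta(a, 0.5, 0.5, log = TRUE)
}
a <- 0.5; draws <- numeric(5000)
for (t in seq_along(draws)) {
  prop <- a + rnorm(1, 0, 0.1)                     # random-walk proposal
  if (log(runif(1)) < logpost(prop) - logpost(a)) a <- prop
  draws[t] <- a
}
mean(draws[-(1:1000)])  # posterior mean weight of model 1, after burn-in
```

A posterior weight concentrating near 1 (or 0) plays the role that a decisive Bayes factor would in the classical solution.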
|
8 |
Optimisation stochastique avec contraintes en probabilités et applications / Chance constrained problem and its applications
Peng, Shen, 17 June 2019 (links)
Chance constrained optimization is a natural and widely used approach to obtaining profitable and reliable decisions under uncertainty, and the theory and applications of chance constrained problems remain an active topic; some important issues still require non-trivial effort to solve. In view of this, we systematically investigate chance constrained problems from the following perspectives.

As the basis for chance constrained problems, we first review the main research results on chance constraints from three perspectives: convexity of chance constraints, reformulations and approximations for chance constraints, and distributionally robust chance constraints. For stochastic geometric programs, we consider a joint rectangular geometric chance constrained program. Under elliptical distribution and pairwise independence assumptions for the stochastic parameters, we derive a reformulation of the joint rectangular geometric chance constrained program. As the reformulation is not convex, we propose new convex approximations based on a variable transformation together with piecewise linear approximation methods. Our numerical results show that these approximations are asymptotically tight.

When the probability distributions are not known in advance, or the reformulation of the chance constraints is hard to obtain, bounds on chance constraints can be very useful. We therefore develop four upper bounds for individual and joint chance constraints with independent matrix rows. Based on the one-sided Chebyshev, Chernoff, Bernstein and Hoeffding inequalities, we propose deterministic approximations for chance constraints, and derive various sufficient conditions under which these approximations are convex and tractable. To further reduce computational complexity, we reformulate the approximations as tractable convex optimization problems based on piecewise linear and tangent approximations. Finally, numerical experiments on randomly generated data are discussed in order to identify the tightest deterministic approximations.

In some complex systems, the distribution of the random parameters is only partially known. To deal with such uncertainty in the distribution and the sample data, we propose a data-driven uncertainty set based on mixture distributions, constructed from the perspective of simultaneously estimating higher-order moments. With this uncertainty set, we derive a reformulation of the data-driven robust chance constrained problem. As the reformulation is not a convex program, we propose new, tight convex approximations based on the piecewise linear approximation method under certain conditions. For the general case, we propose a DC approximation to derive an upper bound and a relaxed convex approximation to derive a lower bound on the optimal value of the original problem, and we establish the theoretical foundations of these approximations. Simulation experiments show that the proposed approximations are practical and efficient.

Finally, we consider a stochastic n-player non-cooperative game. When the strategy set of each player contains a set of stochastic linear constraints, we model them as a joint chance constraint, assuming that the row vectors of the matrix defining the stochastic constraints are pairwise independent. We then formulate the chance constraints under normal distributions, elliptical distributions, and a distributionally robust viewpoint, respectively. Under certain conditions, we show the existence of a Nash equilibrium for the stochastic game.
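A small numeric illustration of the first family of bounds, assuming only the mean and covariance of the random vector are known; the one-sided Chebyshev (Cantelli) inequality gives a distribution-free safe threshold, compared here with the tighter Gaussian one.

```r
# Hedged sketch: P(xi' x > b) <= eps holds, via Cantelli, whenever
# b >= mu' x + sqrt((1 - eps) / eps) * sd(xi' x), with E[xi] = mu, Cov = Sigma.
eps   <- 0.05
mu    <- c(1, 2)
Sigma <- diag(c(0.5, 0.8))
x     <- c(1, 1)
m <- sum(mu * x)
s <- sqrt(drop(t(x) %*% Sigma %*% x))
c(chebyshev = m + sqrt((1 - eps) / eps) * s,  # distribution-free threshold
  gaussian  = m + qnorm(1 - eps) * s)         # tighter threshold if xi is normal
```

The gap between the two thresholds (a factor of roughly sqrt(19) versus 1.645 on the standard-deviation term at eps = 0.05) is exactly the conservativeness the tighter Chernoff, Bernstein and Hoeffding bounds aim to reduce.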
|
9 |
From OLS to Multilevel Multidimensional Mixture IRT: A Model Refinement Approach to Investigating Patterns of Relationships in PISA 2012 Data
Gurkan, Gulsah, January 2021 (links)
Thesis advisor: Henry I. Braun / Secondary analyses of international large-scale assessments (ILSA) commonly characterize relationships between variables of interest using correlations. However, the accuracy of correlation estimates is impaired by artefacts such as measurement error and clustering. Despite advancements in methodology, conventional correlation estimates, or statistical models that do not address these problems, are still commonly used when analyzing ILSA data. This dissertation examines the impact of both the clustered nature of the data and heterogeneous measurement error on the correlations reported between background data and proficiency scales across countries participating in ILSA. In this regard, the operating characteristics of competing modeling techniques are explored through applications to data from PISA 2012. Specifically, the estimates of correlations between math self-efficacy and math achievement across countries are the principal focus of this study. A step-wise model refinement approach is used, sequentially employing four different statistical techniques. After each step, the changes in the within-country correlation estimates are examined in relation to (i) the heterogeneity of distributions, (ii) the amount of measurement error, (iii) the degree of clustering, and (iv) country-level math performance. The results show that correlation estimates gathered from two-dimensional IRT models are more similar across countries than conventional and multilevel linear modeling estimates. The strength of the relationship between math proficiency and math self-efficacy is moderated by country mean math proficiency, and this was found to be consistent across all four models even when measurement error and clustering were taken into account. Multilevel multidimensional mixture IRT modeling results support the hypothesis that low-performing groups within countries have a lower correlation between math self-efficacy and math proficiency. A weaker association between math self-efficacy and math proficiency in lower-achieving groups is consistently seen across countries. A multilevel mixture IRT modeling approach sheds light on how this pattern emerges from greater randomness in the responses of lower-performing groups. The findings from this study demonstrate that advanced modeling techniques not only are more appropriate given the characteristics of the data, but also provide greater insight into the patterns of relationships across countries. / Thesis (PhD) — Boston College, 2021. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement and Evaluation.
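The measurement-error issue the dissertation raises can be made concrete with the classical attenuation formula; the latent correlation and reliabilities below are hypothetical.

```r
# Hedged illustration: with score reliabilities rel.x and rel.y, the observed
# correlation is attenuated to rho * sqrt(rel.x * rel.y) relative to the
# latent correlation rho (classical test theory result).
rho   <- 0.60                          # latent correlation (illustrative)
rel.x <- 0.80; rel.y <- 0.85           # hypothetical reliabilities
rho.obs <- rho * sqrt(rel.x * rel.y)   # about 0.49: noticeably attenuated
rho.obs / sqrt(rel.x * rel.y)          # disattenuation recovers 0.60
```

Because reliability varies across countries, such attenuation is heterogeneous, which is one reason the IRT-based estimates in the study behave more consistently than the conventional ones.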
|
10 |
Inference for Birnbaum-Saunders, Laplace and Some Related Distributions under Censored Data
Zhu, Xiaojun, 06 May 2015 (links)
The Birnbaum-Saunders (BS) distribution is a positively skewed distribution and a popular model for analyzing lifetime data. In this thesis, we first develop an improved method of estimation for the BS distribution and the corresponding inference. Compared to the maximum likelihood estimators (MLEs) and the modified moment estimators (MMEs), the proposed method results in estimators with smaller bias but the same mean squared errors (MSEs). Next, the existence and uniqueness of the MLEs of the parameters of the BS distribution are discussed based on Type-I, Type-II and hybrid censored samples. In the case of the five-parameter bivariate Birnbaum-Saunders (BVBS) distribution, we use the distributional relationship between the bivariate normal and BVBS distributions to propose a simple and efficient method of estimation based on Type-II censored samples. Regression analysis is commonly used in the analysis of life-test data when some covariates are involved. For this reason, we consider the regression problem based on the BS and BVBS distributions and develop the associated inferential methods.
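For orientation, the BS law has a convenient normal-based stochastic representation, sketched below for simulation purposes only; the shape and scale values are arbitrary.

```r
# Hedged sketch: if Z ~ N(0, 1), then
# T = beta * (alpha*Z/2 + sqrt((alpha*Z/2)^2 + 1))^2 follows BS(alpha, beta).
rbs <- function(n, alpha, beta) {
  z <- rnorm(n)
  beta * (alpha * z / 2 + sqrt((alpha * z / 2)^2 + 1))^2
}
set.seed(42)
t.sample <- rbs(10000, alpha = 0.5, beta = 2)
c(mean(t.sample), median(t.sample))  # mean > median: positive skew
```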
One may generalize the BS distribution by using a Laplace kernel in place of the normal kernel; the result, referred to as the Laplace BS (LBS) distribution, is one of the generalized Birnbaum-Saunders (GBS) distributions. Since the LBS distribution has a close relationship with the Laplace distribution, it is necessary to first carry out a detailed study of inference for the Laplace distribution before studying the LBS distribution. Several inferential results have been developed in the literature for the Laplace distribution based on complete samples; however, research on Type-II censored samples is somewhat scarce, and in fact there is no work on Type-I censoring. For this reason, we first consider the MLEs of the location and scale parameters of the Laplace distribution based on Type-II and Type-I censored samples. In the case of Type-II censoring, we derive the exact joint and marginal moment generating functions (MGFs) of the MLEs. Using these expressions, we derive the exact conditional marginal and joint density functions of the MLEs and utilize them to develop exact confidence intervals (CIs) for some life parameters of interest. In the case of Type-I censoring, we first derive explicit expressions for the MLEs of the parameters, then derive the exact conditional joint and marginal MGFs and use them to obtain the exact conditional marginal and joint density functions of the MLEs. These densities are used in turn to develop marginal and joint CIs for some quantities of interest.
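For contrast with the censored cases treated above, the complete-sample MLEs of the Laplace parameters are closed-form, as the sketch below recalls; the data are arbitrary.

```r
# Hedged sketch: for a complete sample from Laplace(mu, b), the MLE of mu is
# the sample median and the MLE of b is the mean absolute deviation about it.
x <- c(-1.2, 0.4, 0.1, 2.3, -0.7, 1.5, 0.0)
mu.hat <- median(x)
b.hat  <- mean(abs(x - mu.hat))
c(mu.hat, b.hat)
```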
Finally, we consider the LBS distribution and formally show the different kinds of shapes of the probability density function (PDF) and the hazard function. We then derive the MLEs of the parameters and prove that they always exist and are unique. Next, we propose the MMEs, which can be used as initial values in the numerical computation of the MLEs. We also discuss the interval estimation of parameters. / Thesis / Doctor of Science (PhD)
|