341 |
Essays on Statistical Decision Theory and Econometrics / De Albuquerque Furtado, Bruno, January 2023
This dissertation studies statistical decision making in various guises. I start by providing a general decision theoretic model of statistical behavior, and then analyze two particular instances which fit in that framework.
Chapter 1 studies statistical decision theory (SDT), a class of models pioneered by Abraham Wald to analyze how agents use data when making decisions under uncertainty. Despite its prominence in information economics and econometrics, SDT has not been given formal choice-theoretic or behavioral foundations. This chapter axiomatizes preferences over decision rules and experiments for a broad class of SDT models. The axioms show how certain seemingly natural decision rules are incompatible with this broad class of SDT models. Using those representation results, I then develop a methodology to translate axioms from classical decision theory, à la Anscombe and Aumann (1963), to the SDT framework. The usefulness of this toolkit is then illustrated by translating various classical axioms, which serve to refine my baseline framework into more specific statistical decision theoretic models, some of which are novel to SDT. I also discuss foundations for SDT under other kinds of choice data.
Chapter 2 studies statistical identifiability of finite mixture models. If a model is not identifiable, multiple combinations of its parameters can lead to the same observed distribution of the data, which greatly complicates, if not invalidates, causal inference based on the model. High-dimensional latent parameter models, which include finite mixtures, are widely used in economics, but are only guaranteed to be identifiable under specific conditions. Since these conditions are usually stated in terms of the hidden parameters of the model, they are seldom testable using noisy data. This chapter provides a condition which, when imposed on the directly observable mixture distribution, guarantees that a finite mixture model is non-parametrically identifiable. Since the condition relates to an observable quantity, it can be used to devise a statistical test of identification for the model. Thus I propose a Bayesian test of whether the model is close to being identified, which the econometrician may apply before estimating the parameters of the model. I also show that, when the model is identifiable, approximate non-negative matrix factorization provides a consistent, likelihood-free estimator of mixture weights.
Chapter 3 studies the robustness of pricing strategies when a firm is uncertain about the distribution of consumers' willingness-to-pay. When the firm has access to data to estimate this distribution, a simple strategy is to implement the mechanism that is optimal for the estimated distribution. We find that such an empirically optimal mechanism boasts strong profit and regret guarantees. Moreover, we provide a toolkit to evaluate the robustness properties of different mechanisms, showing how to consistently estimate and conduct valid inference on the profit generated by any one mechanism, which enables one to evaluate and compare their probabilistic revenue guarantees.
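The idea of an empirically optimal mechanism can be illustrated with a minimal sketch, assuming the simplest case of a single posted price for one good. The function names and the sample data below are hypothetical and are not taken from the dissertation, whose mechanisms may be more general.

```python
def empirical_revenue(price, valuations):
    """Average revenue per consumer at a posted price: the price times the
    fraction of sampled valuations at or above it."""
    buyers = sum(1 for v in valuations if v >= price)
    return price * buyers / len(valuations)


def empirically_optimal_price(valuations):
    """Return the posted price that maximizes empirical revenue.

    Empirical revenue is always maximized at one of the sampled valuations,
    so it is enough to search over the distinct sample values.
    """
    candidates = sorted(set(valuations))
    return max(candidates, key=lambda p: empirical_revenue(p, valuations))


# Hypothetical willingness-to-pay sample (illustration only).
sample = [1.0, 2.0, 2.0, 3.0, 10.0]
best = empirically_optimal_price(sample)
print(best, empirical_revenue(best, sample))
```

In this toy sample a single high valuation pulls the chosen price far above what most consumers would pay, which is the kind of sensitivity to the estimated distribution that robustness guarantees are meant to address.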
|
342 |
The effect of alternate information structures on probability revisions / Dickhaut, John Wilson, January 1970
No description available.
|
343 |
Some aspects of dimensionality and sample size problems in statistical pattern recognition / Jain, Anil Kumar, January 1973
No description available.
|
344 |
Bayesian analysis of particle tracking data using hierarchical models for characterization and design / Dhatt-Gauthier, Kiran, January 2022
This dissertation explores the intersection between the fields of colloid science and statistical inference where the stochastic trajectories of colloidal particles are captured by video microscopy, reconstructed using particle tracking algorithms, and analyzed using physics-based models and probabilistic programming techniques. Although these two fields may initially seem disparate, the dynamics of micro- and nano-sized particles dispersed in liquids at room temperature are inherently stochastic due to Brownian motion.
Further, both the particles under observation and their environment are heterogeneous, leading to variability between particles as well. We use Bayesian data analysis to infer the uncertain parameters of physics-based models that describe the observed trajectories, explicitly modeling the hierarchical structure of the noise under a set of varying experimental conditions.
We set the stage in Chapter 1 by introducing Robert Brown's curious observation of incessantly diffusing pollen grains and Albert Einstein's statistical physics model that describes their motion. We analyze Jean Baptiste Perrin's data from Les Atomes using a probabilistic model to infer the uncertain diffusivities of the colloids. We show how the Bayesian paradigm allows us to assign and update our credences before and after observing these data, and to quantify the information gained by the observation.
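A minimal sketch of this kind of inference, assuming one-dimensional Brownian displacements over a fixed time step dt, so that each displacement is normal with mean zero and variance 2·D·dt, and an inverse-gamma prior on the diffusivity D (a conjugate choice made here for convenience). The model, the prior parameters, and the synthetic data are assumptions for illustration; they are not the dissertation's model or Perrin's measurements.

```python
import math
import random


def posterior_diffusivity(displacements, dt, prior_shape, prior_scale):
    """Update an inverse-gamma prior on D given displacements ~ N(0, 2*D*dt).

    The likelihood is proportional to D**(-n/2) * exp(-sum(dx^2) / (4*D*dt)),
    so the posterior is inverse-gamma with the parameters returned here.
    """
    n = len(displacements)
    sum_sq = sum(dx * dx for dx in displacements)
    return prior_shape + n / 2, prior_scale + sum_sq / (4 * dt)


def inverse_gamma_mean(shape, scale):
    """Mean of an inverse-gamma distribution (defined only for shape > 1)."""
    if shape <= 1:
        raise ValueError("mean is undefined for shape <= 1")
    return scale / (shape - 1)


# Synthetic trajectory with a known diffusivity (illustration only).
rng = random.Random(0)
true_d, dt = 0.5, 0.1
steps = [rng.gauss(0.0, math.sqrt(2 * true_d * dt)) for _ in range(5000)]

shape, scale = posterior_diffusivity(steps, dt, prior_shape=2.0, prior_scale=1.0)
print(inverse_gamma_mean(shape, scale))
```

With many observed steps the posterior mean should sit close to the diffusivity used to generate the data, and the prior parameters matter less and less.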
In Chapter 2, we build on these concepts to provide insight on the phenomenon of enhanced enzyme diffusion, whereby enzymes are purported to diffuse faster in the presence of their substrate. We develop a hierarchical model of enzyme diffusion that describes the stochastic dynamics of individual enzymes drawn from a dispersed population. Using this model, we analyze single molecule imaging data of urease enzymes to infer their uncertain diffusivities for different substrate concentrations. Our analysis emphasizes the important role of model criticism for establishing self-consistency between experimental observations and model predictions; moreover, we caution against drawing strong conclusions when such consistency cannot be established.
In Chapter 3, we automate and optimize the data acquisition process, tuning a resonant acoustic cell using minimal experimental resources. By iterating a cycle of observation, inference, and design, we select the frequency of the applied signal and the frame rate of the data acquisition, garnering the same amount of information as a grid search approach with a fraction of the data.
Finally, in Chapter 4, we discuss the role of Bayesian inference and design to optimize functional goals and discuss selected examples of where black-box techniques may prove useful. We review the current state of the art for magnetically actuated colloids and pose the search for autonomous magnetic behaviors as a design problem, offering insight as we seek to augment and accelerate the capabilities of micron scale magnetically actuated colloids using modern computational techniques.
|
345 |
Bayesian collocation tempering and generalized profiling for estimation of parameters from differential equation models / Campbell, David Alexander, January 2007
No description available.
|
346 |
The problem of classifying members of a population into groups / Flora, Roger Everette, January 1965
A model is assumed in which individuals are to be classified into groups as to their "potential" with respect to a given characteristic. For example, one may wish to classify college applicants into groups with respect to their ability to succeed in college. Although actual values for the "potential," or underlying variable of classification, may be unobservable, it is assumed possible to divide the individuals into groups with respect to this characteristic. Division into groups may be accomplished either by fixing the boundaries of the underlying variable of classification or by fixing the proportion of the individuals which may belong to a given group.
For discriminating among the different groups, a set of measurements is obtained for each individual. In the example above, for instance, classification might be based on test scores achieved by the applicants on a set of tests administered to them.
Since the value of the underlying variable of classification is unobservable, we may assign, in place of this variable, a characteristic random variable to each individual. The characteristic variable will be the same for every member of a given group.
We then consider a choice of characteristic random variable and a linear combination of the observed measurements such that the correlation between the two is a maximum with respect to both the coefficients of the different measurements and the characteristic variable. If a significant correlation is found, one may then use as a discriminant for a randomly selected individual the linear combination obtained by using the coefficients found by the above procedure.
In order to facilitate a test of validity for the proposed discriminant function, the distribution of a suitable function of the above correlation coefficient is found under the null hypothesis of no correlation between the underlying variable of classification and the observed measurements. A test procedure based on the statistic for which the null distribution is found is then described.
Special consideration is given in the study to the case of only two classification groups with the proportion of individuals to belong to each group fixed. For this case, in addition to obtaining the null distribution, the distribution of the test statistic is also considered under the alternative hypothesis. Low-order moments of the test criterion are obtained, and the approximate power of the proposed test is found for specific cases of the model by fitting an appropriate density to the moments derived. The general behavior of the power function as the sample size increases, and as the population multiple correlation between the underlying variable of classification and the observed measurements increases, is also investigated.
Finally, the probability of misclassification, or "the problem of shrinkage" as it is often called, is considered. Possible approaches to the problem and some of the difficulties in investigating this problem are indicated. / Ph. D.
|
347 |
The Problem of classifying members of a population on a continuous scale / Barnett, Frederic Charles, January 1964
Having available a vector of measurements for each individual in a random sample from a multivariate population, we assume in addition that these individuals can be ranked on some criterion of interest. As an example of this situation, we may have measured certain physiological characteristics (blood pressure, amounts of certain chemical substances in the blood, etc.) in a random sample of schizophrenics. After a series of treatments (perhaps shock treatments, doses of a tranquillizer, etc.) these individuals might be ranked on the basis of favorable response to treatment. We shall in general be interested in predicting which individuals in a new group would respond most favorably. Thus, in the example, we should wish to know which individuals would most likely benefit from the series of treatments.
Some difficulties in applying the classical discriminant function analysis to problems of this type are noted.
We have chosen to use the multiple correlation coefficient of ranks with measured variates as a statistic in testing whether ranks are associated with measurements. We give to this coefficient the name "quasi-rank multiple correlation coefficient", and proceed to find its first four exact moments under the assumption that the underlying probability distribution is multivariate normal.
Two methods are used to approximate the power of tests based on the quasi-rank multiple correlation coefficient in the case of just one measured variate. The agreement for a sample size of twenty is quite good.
The asymptotic relative efficiency of the squared quasi-rank coefficient vis-a-vis the squared standard multiple correlation coefficient is 9/π², a result which does not depend on the number of measured variates.
If the null hypothesis that ranks are not associated with measurements is rejected, it is appropriate to use the measurements in some way to predict the ranks. The quasi-rank multiple correlation coefficient is, however, the maximized simple correlation of ranks with linear combinations of the measured variates. The maximizing linear combination of measured variates is taken as a discriminant function, and its values for subsequently chosen individuals are used to rank these individuals in order of merit.
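To give a sense of scale for the 9/π² figure, the short sketch below evaluates it numerically and, assuming the usual Pitman reading of asymptotic relative efficiency as a limiting ratio of required sample sizes, converts it into an approximate sample-size penalty. The abstract does not say which notion of efficiency is used, so that interpretation and the helper name are assumptions.

```python
import math

# Asymptotic relative efficiency quoted in the abstract.
are = 9 / math.pi ** 2


def matching_sample_size(n_standard, efficiency):
    """Approximate sample size the less efficient procedure needs in order to
    match the more efficient one at n_standard observations (Pitman reading,
    valid only as a large-sample approximation)."""
    return math.ceil(n_standard / efficiency)


print(round(are, 4))
print(matching_sample_size(100, are))
```

Under that reading, the quasi-rank statistic gives up less than a tenth of the efficiency of the standard coefficient in large samples.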
A demonstration study is included in which we employ a random sample of size twenty from a six-variate normal distribution of known structure (for which the population multiple correlation coefficient is .655). The null hypothesis of no association of ranks with measurements is rejected in a two-sided size .05 test. The discriminant function is obtained and is used to "predict" the true ranks of the twenty individuals in the sample. The predicted ranks represent the true ranks rather well, with no predicted rank more than four places from the true rank. For other populations in which the population multiple correlation coefficient is greater than .655 we should expect to obtain even better sets of predicted ranks.
In developing the moments of the quasi-rank multiple correlation coefficient it was necessary to obtain exact moments of a certain linear combination of quasi-ranges in a random sample from a normal population. Since this quasi-range statistic may be useful in other investigations, we include also its moment generating function and some derivatives of this moment generating function. / Ph. D.
|
348 |
Investigation of the rate of convergence in the two sample nonparametric empirical Bayes approach to an estimation problem / Wang, Alan Then-Kang, January 1965
In this thesis we consider the following. We choose the random variable θ, which has some fixed but unknown distribution with a finite second moment. We observe the value, x, of a preliminary random variable X, which has an unknown distribution conditional on θ. Using x and our past experience we are asked to estimate the value of θ with a squared error loss function. After we have made our decision we are given the value, y, of a detailed random variable Y, which has an unknown distribution conditional on θ. The random variables X and Y are assumed independent given a particular θ. Our past experience is made up of the values of preliminary and detailed random variables from previous decision problems which are independent of but similar to the present one.
With the risk defined in the usual way the Bayes decision function is the expected value of θ given that X = x. Since the distributions are unknown, the use of the two sample nonparametric empirical Bayes decision function is proposed. With the regret defined in the usual way it can be shown that the two sample nonparametric empirical Bayes decision function is asymptotically optimal, i.e. for a large number of past decision problems, the regret in using the two sample nonparametric empirical Bayes decision function tends to zero, and it is the main purpose of this thesis to verify this property by using a hypothetical numerical example. / Master of Science
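The claim that the Bayes decision function under squared error loss is the expected value of θ given X = x can be checked on a small discrete example. The sketch below assumes a known two-point prior and a Bernoulli likelihood, both hypothetical; the thesis instead treats these distributions as unknown and estimates the decision function from past observations, which this sketch does not attempt.

```python
def posterior(prior, likelihood, x):
    """Posterior over theta given x.

    prior: dict mapping theta -> P(theta)
    likelihood: function (x, theta) -> P(x | theta)
    """
    joint = {t: p * likelihood(x, t) for t, p in prior.items()}
    total = sum(joint.values())
    return {t: w / total for t, w in joint.items()}


def posterior_expected_loss(estimate, post):
    """Posterior expected squared error of a candidate estimate."""
    return sum(p * (t - estimate) ** 2 for t, p in post.items())


def bayes_estimate(post):
    """Posterior mean, the minimizer of posterior expected squared error."""
    return sum(t * p for t, p in post.items())


def bernoulli(x, theta):
    """P(X = x | theta) for a single Bernoulli trial."""
    return theta if x == 1 else 1 - theta


# Hypothetical two-point prior on theta (illustration only).
prior = {0.2: 0.5, 0.8: 0.5}
post = posterior(prior, bernoulli, x=1)
d_star = bayes_estimate(post)

# No candidate on a grid achieves a smaller posterior expected loss.
grid = [k / 100 for k in range(101)]
best_on_grid = min(grid, key=lambda d: posterior_expected_loss(d, post))
print(d_star, best_on_grid)
```

The regret of any other decision function is its excess risk over this posterior-mean rule, which is the quantity the thesis tracks as the number of past decision problems grows.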
|
349 |
A comparison of a supplementary sample non-parametric empirical Bayes estimator with the classical estimator in a quality control situation / Gabbert, James Tate, January 1968
The purpose of this study was to compare the effectiveness of the classical estimator with that of a supplementary sample non-parametric empirical Bayes estimator in detecting an out-of-control situation arising in statistical quality control work. The investigation was accomplished through Monte Carlo simulation on the IBM-7040/1401 system at the Virginia Polytechnic Institute Computing Center, Blacksburg, Virginia.
In most cases considered in this study, the sole criterion for accepting or rejecting the hypothesis that the industrial process is in control was the location of the estimate on the control chart for fraction defectives. If an estimate fell outside the 3σ control limits, that particular batch was said to have been produced by an out-of-control system. In other cases the concept of "runs" was included as an additional criterion for acceptance or rejection.
Also considered were various parameters, such as the mean in-control fraction defectives, the mean out-of-control fraction defectives, the first sample size, the standard deviation of the supplementary sample estimates, and the number of past experiences used in computing the empirical Bayes estimator.
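A control chart for fraction defectives conventionally places its limits three binomial standard deviations on either side of the in-control mean. The sketch below shows that rule under the assumption of a fixed batch size n and a known in-control fraction p_bar; the function names and numbers are illustrative and are not taken from the study.

```python
import math


def p_chart_limits(p_bar, n, width=3.0):
    """Lower and upper control limits for a fraction-defective chart.

    Uses the binomial standard deviation sqrt(p_bar * (1 - p_bar) / n) and
    clips the limits to the valid range [0, 1].
    """
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    lower = max(0.0, p_bar - width * sigma)
    upper = min(1.0, p_bar + width * sigma)
    return lower, upper


def out_of_control(estimate, limits):
    """Flag a batch whose estimated fraction defective falls outside the limits."""
    lower, upper = limits
    return estimate < lower or estimate > upper


# Hypothetical in-control fraction and batch size (illustration only).
limits = p_chart_limits(p_bar=0.05, n=100)
print(limits)
print(out_of_control(0.15, limits), out_of_control(0.08, limits))
```

Any estimator of the batch fraction defective, classical or empirical Bayes, can be passed to `out_of_control`; the comparison in the study concerns how quickly each one crosses the limits after the process shifts.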
The Monte Carlo studies showed that, for almost any set of parameter values, the empirical Bayes estimator is much more effective in detecting an out-of-control situation than is the classical estimator. The most notable advantage gained by using the empirical Bayes estimator is that long-range lack of detection is virtually impossible. / M.S.
|
350 |
Comparison of Bayes' and minimum variance unbiased estimators of reliability in the extreme value life testing model / Godbold, James Homer, January 1970
The purpose of this study is to consider two different types of estimators for reliability using the extreme value distribution as the life-testing model. First the unbiased minimum variance estimator is derived. Then the Bayes' estimators for the uniform, exponential, and inverted gamma prior distributions are obtained, and these results are extended to a whole class of exponential failure models. Each of the Bayes' estimators is compared with the unbiased minimum variance estimator in a Monte Carlo simulation where it is shown that the Bayes' estimator has smaller squared error loss in each case.
The problem of obtaining estimators with respect to an exponential type loss function is also considered. The difficulties in such an approach are demonstrated. / Master of Science
|