1 |
Semiparametric Estimation of Unimodal Distributions
Looper, Jason K. 20 August 2003
One often wishes to understand the probability distribution of stochastic data from experiments or computer simulations. Where no model is given, however, practitioners must resort to parametric or nonparametric methods to gain information about the underlying distribution. Others have first used a nonparametric estimator to understand the underlying shape of a set of data and then returned with a parametric method to locate the peaks. However, they are interested in estimating spectra, which may have multiple peaks, whereas in this work we are interested in approximating the peak position of a single-peak probability distribution.
One method of analyzing a distribution of data is to fit a curve to the data, or to smooth them. Polynomial regression and least-squares fitting are examples of smoothing methods. Depending on the degree of smoothing, an initial understanding of the underlying distribution can be obscured. Problems such as under- and oversmoothing must be addressed in order to determine the shape of the underlying distribution. Furthermore, smoothing of skewed data can give a biased estimate of the peak position.
We propose two new approaches for statistical mode estimation based on the assumption that the underlying distribution has only one peak. The first method imposes the global constraint of unimodality locally, by requiring negative curvature over some domain. The second method performs a search that assumes a position of the distribution's peak and requires positive slope to the left, and negative slope to the right. Each approach entails a constrained least-squares fit to the raw cumulative probability distribution.
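The two constrained fits described above can be written schematically as follows. This is only a sketch in assumed notation, not the thesis's own formulation: F_n denotes the raw (empirical) cumulative distribution at the sorted observations x_(i), F the fitted cumulative distribution with density f = F', D the domain on which curvature is constrained, and m the assumed peak position. The inequalities are written non-strictly for convenience.

```latex
% Method 1 (assumed notation): negative curvature of the density on a domain D
\min_{F} \sum_{i=1}^{n} \bigl( F(x_{(i)}) - F_n(x_{(i)}) \bigr)^2
\quad \text{subject to} \quad f''(x) \le 0 \ \text{for } x \in D .

% Method 2 (assumed notation): slope constraints on either side of an assumed peak m
\min_{F} \sum_{i=1}^{n} \bigl( F(x_{(i)}) - F_n(x_{(i)}) \bigr)^2
\quad \text{subject to} \quad
f'(x) \ge 0 \ \text{for } x < m , \qquad
f'(x) \le 0 \ \text{for } x > m .
```

In the second form, one plausible reading of the search is that the fit is repeated over candidate values of m and the best-fitting candidate is reported as the mode estimate; the abstract does not spell this out.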
We compare the relative efficiencies [12] of these two estimators in finding the peak location for artificially generated data from known families of distributions: Weibull, beta, and gamma. Within each family a parameter controls the skewness or kurtosis, quantifying the shapes of the distributions for comparison. We also compare our methods with other estimators such as the kernel-density estimator, adaptive histogram, and polynomial regression. By comparing the effectiveness of the estimators, we can determine which estimator best locates the peak position.
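As an illustration of one of the comparison estimators, a plain kernel-density mode estimate might look like the sketch below. It is a minimal example under stated assumptions, not the adaptation of kernel estimation evaluated in the thesis: the Gaussian kernel, the normal-reference rule-of-thumb bandwidth, the grid search, and the function names `rule_of_thumb_bandwidth`, `kde`, and `kde_mode` are all choices made here for illustration.

```python
import math


def rule_of_thumb_bandwidth(data):
    """Normal-reference bandwidth 1.06 * s * n^(-1/5) (an assumed choice)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    return 1.06 * math.sqrt(var) * n ** (-1 / 5)


def kde(x, data, h):
    """Gaussian kernel density estimate at the point x with bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)


def kde_mode(data, grid_size=201):
    """Estimate the mode as the grid point where the KDE is largest."""
    h = rule_of_thumb_bandwidth(data)
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return max(grid, key=lambda x: kde(x, data, h))


# Small symmetric sample clustered around 5.
sample = [3.0, 4.0, 4.5, 5.0, 5.0, 5.0, 5.5, 6.0, 7.0]
mode_estimate = kde_mode(sample)
```

The bandwidth controls the under- and oversmoothing trade-off mentioned in the abstract: a smaller value tracks the data more closely but can produce spurious peaks, while a larger value can shift the estimated peak of a skewed sample.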
We find that our estimators do not perform better than other known estimators. We also find that our estimators are biased. Overall, an adaptation of kernel estimation proved to be the most efficient.
The results for the work done in this thesis will be submitted, in a different form, for publication by D.A. Rabson and J.K. Looper.
|
2 |
Semiparametric estimation of unimodal distributions [electronic resource] / by Jason K. Looper.
Looper, Jason K. January 2003
Title from PDF of title page. / Document formatted into pages; contains 93 pages. / Thesis (M.S.)--University of South Florida, 2003. / Includes bibliographical references. / Text (Electronic thesis) in PDF format. / ABSTRACT: One often wishes to understand the probability distribution of stochastic data from experiments or computer simulations. Where no model is given, however, practitioners must resort to parametric or nonparametric methods to gain information about the underlying distribution. Others have first used a nonparametric estimator to understand the underlying shape of a set of data and then returned with a parametric method to locate the peaks. However, they are interested in estimating spectra, which may have multiple peaks, whereas in this work we are interested in approximating the peak position of a single-peak probability distribution. One method of analyzing a distribution of data is to fit a curve to the data, or to smooth them. Polynomial regression and least-squares fitting are examples of smoothing methods. Depending on the degree of smoothing, an initial understanding of the underlying distribution can be obscured. Problems such as under- and oversmoothing must be addressed in order to determine the shape of the underlying distribution. Furthermore, smoothing of skewed data can give a biased estimate of the peak position. We propose two new approaches for statistical mode estimation based on the assumption that the underlying distribution has only one peak. The first method imposes the global constraint of unimodality locally, by requiring negative curvature over some domain. The second method performs a search that assumes a position of the distribution's peak and requires positive slope to the left, and negative slope to the right. Each approach entails a constrained least-squares fit to the raw cumulative probability distribution. We compare the relative efficiencies [12] of these two estimators in finding the peak location for artificially generated data from known families of distributions: Weibull, beta, and gamma. Within each family a parameter controls the skewness or kurtosis, quantifying the shapes of the distributions for comparison. We also compare our methods with other estimators such as the kernel-density estimator, adaptive histogram, and polynomial regression. By comparing the effectiveness of the estimators, we can determine which estimator best locates the peak position. We find that our estimators do not perform better than other known estimators. We also find that our estimators are biased. Overall, an adaptation of kernel estimation proved to be the most efficient. The results for the work done in this thesis will be submitted, in a different form, for publication by D.A. Rabson and J.K. Looper. / System requirements: World Wide Web browser and PDF reader. / Mode of access: World Wide Web.
|
3 |
Validation and Inferential Methods for Distributional Form and Shape
Mayorov, Kirill. January 2017
This thesis investigates some problems related to the form and shape of statistical distributions with the main focus on goodness of fit and bump hunting. A bump is a distinctive characteristic of distributional shape. A search for bumps, or bump hunting, in a probability density function (PDF) has long been an important topic in statistical research. We introduce a new definition of a bump which relies on the notion of the curvature of a planar curve. We then propose a new method for bump hunting which is based on a kernel density estimator of the unknown PDF. The method gives not only the number of bumps but also the location of their centers and base points. In quantitative risk applications, the selection of distributions that properly capture upper tail behavior is essential for accurate modeling. We study tests of distributional form, or goodness-of-fit (GoF) tests, that assess simple hypotheses, i.e., when the parameters of the hypothesized distribution are completely specified. From theoretical and practical perspectives, we analyze the limiting properties of a family of weighted Cramér-von Mises GoF statistics W^2 with weight function psi(t) = 1/(1 - t)^beta (for beta <= 2) which focus on the upper tail. We demonstrate that W^2 has no limiting distribution. For this reason, we provide a normalization of W^2 that leads to a non-degenerate limiting distribution. Further, we study W^2 for composite hypotheses, i.e., when distributional parameters must be estimated from a sample at hand. When the hypothesized distribution is heavy-tailed, we examine the finite sample properties of W^2 under the Chen-Balakrishnan transformation that reduces the original GoF test (the direct test) to a test for normality (the indirect test). In particular, we compare the statistical level and power of the pairs of direct and indirect tests. We observe that decisions made by the direct and indirect tests agree well, and in many cases they become independent as sample size grows.
/ Thesis / Doctor of Philosophy (PhD)
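To make the weighted statistic in the abstract above concrete, the sketch below evaluates it numerically for a simple hypothesis. It assumes the usual weighted Cramér-von Mises form, n times the integral over [0, 1] of (F_n(t) - t)^2 psi(t), applied to a sample already transformed to the unit interval by the hypothesized distribution function; the abstract does not state this formula, and the function names `weighted_cvm` and `classic_cvm`, the midpoint rule, and the grid size are choices made here for illustration.

```python
import bisect


def weighted_cvm(u, beta=0.0, grid_size=20000):
    """Approximate n * integral_0^1 (F_n(t) - t)^2 / (1 - t)^beta dt.

    u    : sample on [0, 1] (assumed already transformed by the
           hypothesized distribution function)
    beta : weight exponent, restricted to beta <= 2 as in the abstract
    """
    if beta > 2:
        raise ValueError("this sketch only covers beta <= 2")
    u = sorted(u)
    n = len(u)
    dt = 1.0 / grid_size
    total = 0.0
    for k in range(grid_size):
        t = (k + 0.5) * dt                     # midpoint of the k-th cell
        ecdf = bisect.bisect_right(u, t) / n   # empirical CDF F_n(t)
        total += (ecdf - t) ** 2 / (1.0 - t) ** beta * dt
    return n * total


def classic_cvm(u):
    """Closed form of the unweighted statistic (the beta = 0 case)."""
    u = sorted(u)
    n = len(u)
    return 1.0 / (12 * n) + sum(
        (x - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(u, start=1)
    )


sample = [0.1, 0.4, 0.5, 0.9]
w2_unweighted = weighted_cvm(sample, beta=0.0)
w2_upper_tail = weighted_cvm(sample, beta=2.0)
```

With beta = 0 the weight is constant and the result can be checked against the closed form; larger beta puts more weight on discrepancies near t = 1, that is, in the upper tail.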
|