21 |
Testing covariance structure of spatial stream network data using Torgegram components and subsampling
Liu, Zhijiang (Van) 01 August 2019
Researchers analyzing data collected from a stream network need to determine the data's second-order dependence structure before identifying an appropriate model and predicting the values of the response variable at unsampled sites (via Kriging) or the average value of the variable over a stream interval (via Block Kriging). The Torgegram enables us to graphically detect the dependence structure of stream network data, but formal tests that resemble those in Euclidean space have not yet been developed on stream networks. The objective of this thesis is to construct nonparametric tests for pure tail-down and pure tail-up dependence on stream networks. These tests are based on the characteristics of some Torgegram components under specific types of dependence structure: the test for tail-down dependence relies on the fact that Type-0 and Type-1 subsemivariances at the same lag are equal when the covariance structure is tail-down, while the test for tail-up dependence takes advantage of a "flat" FUSD semivariogram under tail-up dependence.
Several test statistics are proposed to test for pure tail-down and tail-up dependence. The general form of these test statistics is a fraction whose numerator is the sample size multiplied by the squared difference of specific semivariances (or subsemivariances) and whose denominator is a consistent estimator of the variance of the square root of the numerator. The asymptotic behavior of the vectors of semivariances or subsemivariances is established on regular rooted binary trees with an increasing number of levels, so that these test statistics converge in distribution to a chi-squared random variable with one degree of freedom. In order to obtain consistent estimators of the variance-covariance matrices of the vectors of semivariances or subsemivariances, two methods are introduced in this thesis: the plug-in method, which computes estimators as functions of linear combinations of FCSD and FUDJ semivariances, and the subsampling method, which computes estimators based on the semivariances or subsemivariances from overlapping subsamples. Then, those test statistics for tail-down and tail-up dependence are extended to irregular stream networks, based on Torgegram components and subsampling estimators. Simulation studies on regular rooted binary trees and real stream networks (from two real datasets) demonstrate the good performance of these tests: the larger the sample size or the stronger the spatial dependence, the higher the power of the tests. However, the test for tail-down dependence requires Type-0 subsemivariances estimated from a sufficient number of pairs of sites on the same stream segments, which not every stream network dataset has.
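The general form of the test statistics described in this abstract can be written compactly. The symbols below ($n$ for the sample size, $\hat\gamma_a(h)$ and $\hat\gamma_b(h)$ for the two estimated semivariances or subsemivariances being compared, and $\hat\sigma^2$ for the variance estimator) are notation introduced here for illustration and may differ from the notation used in the thesis.

```latex
T_n \;=\; \frac{n\,\bigl(\hat\gamma_a(h) - \hat\gamma_b(h)\bigr)^{2}}{\hat\sigma^{2}},
\qquad
\hat\sigma^{2} \;\xrightarrow{\;p\;}\; \sigma^{2}
  = \lim_{n\to\infty} \operatorname{Var}\!\Bigl(\sqrt{n}\,\bigl(\hat\gamma_a(h) - \hat\gamma_b(h)\bigr)\Bigr),
\qquad
T_n \;\xrightarrow{\;d\;}\; \chi^{2}_{1}.
```

Here the convergence to $\chi^2_1$ is understood to hold under the dependence structure being tested, and $\hat\sigma^2$ could come from either the plug-in or the subsampling method mentioned above.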
|
22 |
Regularized skewness parameter estimation for multivariate skew normal and skew t distributions
Wang, Sheng 01 May 2019
The skewed normal (SN) distribution introduced by Azzalini has opened a new era for analyzing skewed data. The idea behind it is that it incorporates a new parameter regulating shape and skewness on the symmetric Gaussian distribution. This idea was soon extended to other symmetric distributions such as the Student's t distribution, resulting in the invention of the skew t (ST) distribution.
The multivariate versions of the two distributions, i.e. the multivariate skew normal (MSN) and multivariate skew t (MST) distributions, have received considerable attention because of their ability to fit skewed data, together with some other properties such as mathematical tractability. While many researchers focus on fitting the MSN and MST distributions to data, in this thesis we address another important aspect of statistical modeling using those two distributions, i.e. skewness selection and estimation. Skewness selection, as we discuss it here, means identifying which components of the skewness parameter in the MSN and MST distributions are zero.
In this thesis, we begin by reviewing some important properties of the two distributions and then we describe the obstacles that block us from doing skewness selection in the direct parameterizations of the two distributions. Then, to circumvent those obstacles, we introduce a new parameterization to use for skewness selection. The nice properties of this new parameterization are also summarized.
After introducing the new parameterization, we discuss our proposed methods to reach the goal of skewness selection. In particular, we consider adding appropriate penalties to the loss functions of the MSN and MST distributions, represented in the new parameterization of the two distributions. Technical details such as initial value selection and tuning parameter selection are also discussed. The asymptotic consistency and oracle property of some of our methods are established.
In the later part of the thesis, we include results from some simulation studies in order to assess the performance of our proposed methods. Also, we apply our methods to three data sets. Lastly, some drawbacks and potential future work are discussed.
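To make the role of the skewness parameter concrete, the following is a minimal sketch of the univariate skew normal density in Azzalini's form, 2 φ(z) Φ(αz). It is only an illustration: the thesis works with the multivariate MSN and MST distributions and a new parameterization that is not reproduced here, and the function names and the `loc`/`scale` arguments are assumptions made for this example.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, alpha, loc=0.0, scale=1.0):
    """Univariate skew normal density: (2 / scale) * phi(z) * Phi(alpha * z).

    alpha is the skewness (shape) parameter; alpha == 0 gives back the
    ordinary normal density. loc and scale are illustrative arguments.
    """
    z = (x - loc) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(alpha * z)
```

Setting `alpha` to zero recovers the symmetric Gaussian, which is why deciding which skewness components are zero amounts to deciding which coordinates need a skewed model at all.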
|
23 |
Penalized likelihood estimation of a fixed-effect and a mixed-effect transfer function model
Hansen, Elizabeth Ann 01 January 2006
Motivated by the need to estimate the main spawning period of North Sea cod, we develop a common transfer function model with a panel of contemporaneously correlated time series data. This model incorporates (i) the smoothness on the parameters by assuming that the second differences are small and (ii) the contemporaneous correlation by assuming that the errors have a general variance-covariance matrix. Penalized likelihood estimation of this model requires an iterative procedure that is developed in this work. We develop three methods for determining confidence bands: frequentist, Bayesian, and bootstrap (both nonparametric and parametric). A simulation study on the frequentist and Bayesian confidence bands motivated by the cod spawning data is conducted and the results of those simulations are compared. The model is then used on the cod spawning data, with all confidence bands computed. The results of this analysis are discussed. We then delve further into our model by discussing the theory behind this model. We prove a theorem that shows that the estimated regression parameter vector is a consistent estimate of the true regression parameter. We further prove that this estimated regression parameter vector has an asymptotic normal distribution. Both theorems are proved while assuming mild conditions.
We further develop our model by incorporating between-series variation in the transfer function, with the random effect assumed to have a normal distribution with a "smooth" mean vector. We implement the EM algorithm to do the penalized likelihood estimation. We consider five different specifications of the variance-covariance matrix of the random transfer function model, namely, a general variance-covariance matrix, a diagonal matrix, a multiple of the identity matrix, an autoregressive matrix of order one, and a multiplicative error specification. Since the computation of confidence bands would lead to numerical problems, we introduce a bootstrap approach for estimating the confidence bands. We consider both the nonparametric and parametric bootstrap approaches. We then apply this model to estimate the cod spawning period, while also looking into the different specifications of the variance-covariance matrix of the random effect, the two types of bootstrapped confidence bands, and model checking.
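The smoothness assumption in this abstract (small second differences of the transfer function parameters) is commonly expressed as a quadratic roughness penalty added to the negative log-likelihood. The sketch below assumes that form; the tuning parameter `lam`, the function names, and the exact shape of the penalty are illustrative assumptions rather than the formulation used in the thesis.

```python
def second_differences(beta):
    """Second differences beta[j+1] - 2*beta[j] + beta[j-1] of a coefficient sequence."""
    return [beta[j + 1] - 2.0 * beta[j] + beta[j - 1]
            for j in range(1, len(beta) - 1)]


def penalized_objective(neg_log_lik, beta, lam):
    """Negative log-likelihood plus lam times the sum of squared second differences.

    neg_log_lik is assumed to be already evaluated at beta; lam >= 0 is a
    hypothetical smoothing parameter controlling how strongly roughness is punished.
    """
    penalty = sum(d * d for d in second_differences(beta))
    return neg_log_lik + lam * penalty
```

A coefficient sequence that is exactly linear has zero second differences and so incurs no penalty, while a wiggly sequence is penalized in proportion to `lam`.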
|
24 |
Statistical analysis of non-linear diffusion process
Su, Fei 01 December 2011
In this paper, we study the problem of statistical inference for continuous-time diffusion processes and their higher-order analogues, and develop methods for modeling threshold diffusion processes in particular. The limiting properties of the resulting estimators are also discussed. We also propose likelihood ratio test statistics for testing a threshold diffusion process against its linear alternative. We begin in Chapter 1 with an introduction to continuous-time non-linear diffusion processes, where we summarize the literature on model estimation. The most natural extension from an affine to a non-linear model is a piecewise linear diffusion process with piecewise constant variance functions. It can also be considered a continuous-time threshold autoregressive (CTAR) model, the continuous-time analogue of the AR model for discrete-time time-series data. The order-one CTAR model is discussed in detail. The discussion is directed more toward the estimation techniques than the mathematical details. Existing inferential methods (estimation and testing) generally assume a known functional form of the (instantaneous) variance function. In practice, the functional form of the variance function is rarely known, so it is important to develop new methods for estimating a diffusion model that do not rely on knowledge of the functional form of the variance function. In Chapter 2, we propose the quasi-likelihood method to estimate the parameters indexing the mean function of a threshold diffusion model without prior knowledge of its instantaneous variance structure (the method also applies to other nonlinear diffusion models, which will be investigated further later). We also explore the limiting properties of the quasi-likelihood estimators. We focus on estimating the mean function, after which the functional form of the instantaneous variance function can be explored and subsequently estimated from quadratic variation considerations.
We show that, under mild regularity conditions, the quasi-likelihood estimators of the parameters in the linear mean function of each regime are consistent and asymptotically normal, whereas the threshold parameter estimator is super-consistent and converges weakly to a non-Gaussian continuous distribution. A notable feature is that the limiting distribution of the threshold parameter admits a closed-form probability density function, which enables the construction of its confidence interval; in contrast, for the discrete-time TAR models, the construction of the confidence interval for the threshold parameter has, so far, not been practically solved. A simulation study is provided to illustrate the asymptotic results. We also use the threshold model to estimate the term structure of a long time series of US interest rates. It is also of theoretical and practical interest whether the observed process indeed satisfies the threshold model. In Chapter 3, we propose a likelihood ratio test scheme to test for the existence of thresholds. It can test for non-linearity. It can be shown that, under the null hypothesis of no threshold, the test statistic converges to a central Gaussian process asymptotically. The test is also asymptotically powerful, and the distribution of the test statistic under the alternative hypothesis converges to a non-central Gaussian distribution. Further, the limiting distribution is the same as that of its discrete analogues for testing a TAR(1) model against an autoregressive model. Thus the upper percentage points of the asymptotic statistics for the discrete case are immediately applicable to our tests. Simulation studies are also conducted to show the empirical size and power of the tests. The application of our current method leads to further work, briefly discussed in Chapter 4.
For example, we would like to extend our estimation methods to higher order and higher dimensional cases, use more general underlying mean processes, and most importantly, we shall study how to price and predict value processes with nonlinear diffusion processes.
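For readers unfamiliar with the model class, a two-regime, order-one threshold diffusion with piecewise linear drift and piecewise constant variance, as described in this abstract, can be sketched as follows. The symbols $a_i$, $b_i$, $\sigma_i$, the threshold $r$, and the restriction to two regimes are notation and simplifications assumed here for illustration.

```latex
dX_t =
\begin{cases}
  (a_1 + b_1 X_t)\,dt + \sigma_1\,dW_t, & X_t \le r,\\[4pt]
  (a_2 + b_2 X_t)\,dt + \sigma_2\,dW_t, & X_t > r,
\end{cases}
```

where $W_t$ is a standard Brownian motion. Setting $a_1 = a_2$, $b_1 = b_2$ and $\sigma_1 = \sigma_2$ gives the linear (affine) alternative against which the threshold model is tested.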
|
25 |
Approximations to Continuous Dynamical Processes in Hierarchical Models
Cangelosi, Amanda 01 December 2008
Models for natural nonlinear processes, such as population dynamics, have been given much attention in applied mathematics. For example, species competition has been extensively modeled by differential equations. Often, the scientist has preferred to model the underlying dynamical processes (i.e., theoretical mechanisms) in continuous-time. It is of both scientific and mathematical interest to implement such models in a statistical framework to quantify uncertainty associated with the models in the presence of observations. That is, given discrete observations arising from the underlying continuous process, the unobserved process can be formally described while accounting for multiple sources of uncertainty (e.g., measurement error, model choice, and inherent stochasticity of process parameters). In addition to continuity, natural processes are often bounded; specifically, they tend to have non-negative support. Various techniques have been implemented to accommodate non-negative processes, but such techniques are often limited or overly compromising. This study offers an alternative to common differential modeling practices by using a bias-corrected truncated normal distribution to model the observations and latent process, both having bounded support. Parameters of an underlying continuous process are characterized in a Bayesian hierarchical context, utilizing a fourth-order Runge-Kutta approximation.
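The fourth-order Runge-Kutta approximation mentioned at the end of this abstract is a standard way to advance a continuous-time model over discrete steps. The sketch below applies it to logistic population growth; the choice of the logistic equation, the parameter names `r` and `K`, and the helper names are assumptions made for illustration and are not taken from the study.

```python
def rk4_step(f, t, x, h):
    """Advance x' = f(t, x) by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, x)
    k2 = f(t + h / 2.0, x + h * k1 / 2.0)
    k3 = f(t + h / 2.0, x + h * k2 / 2.0)
    k4 = f(t + h, x + h * k3)
    return x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(f, x0, t0, t1, steps):
    """Integrate from t0 to t1 using a fixed number of equal RK4 steps."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        x = rk4_step(f, t, x, h)
        t += h
    return x


def logistic_growth(r, K):
    """Right-hand side of dx/dt = r * x * (1 - x / K) (illustrative model choice)."""
    return lambda t, x: r * x * (1.0 - x / K)
```

For example, `integrate(logistic_growth(1.0, 10.0), 1.0, 0.0, 5.0, 1000)` approximates the population at time 5 starting from 1, and stays within the non-negative, bounded range the abstract emphasizes.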
|
26 |
Reliability Analysis of Oriented Strand Board’s Strength with a Simulation Study of the Median Censored Method for Estimating of Lower Percentile Strength
Wang, Yang 01 August 2007
Oriented Strand Board (OSB), an engineered wood product, has gained increased market acceptance as a construction material. Because of its growing market, the product’s manufacturing and performance have become the focus of much research. Internal Bond (IB) and Parallel and Perpendicular Elasticity Indices (EI) are important strength metrics of OSB and are analyzed in this thesis using statistical reliability methods.
The data for this thesis consist of 529 destructive tests of OSB panels, carried out from July 2005 to January 2006. These OSB panels came from a modern OSB manufacturer in the Southeastern United States, with the wood furnish being primarily Southern Pine (Pinus spp.). The 529 records are for 7/16” thickness OSB strength, which is rated for roof sheathing (i.e., 7/16” RS).
Descriptive statistics of IB and EI are summarized, including mean, median, standard deviation, interquartile range, skewness, etc. Visual tools such as histograms and box plots are utilized to identify outliers and improve the understanding of the data. Survival plots or Kaplan-Meier curves are important methods for conducting nonparametric analyses of life (or strength) reliability data and are used in this thesis to estimate the strength survival function of the IB and EI of OSB. Probability Plots and Information Criteria are used to determine the best underlying distribution or probability density function. The OSB data used in this thesis fit the lognormal distribution best for both IB and EI. One outlier is excluded for the IB data and six outliers are excluded for the EI data.
Estimation of lower percentiles is very important for quality assurance. In many reliability studies, there is great interest in estimating the lower percentiles of life or strength. In OSB, the lower percentiles of strength may result in catastrophic failures during installation of OSB panels. Catastrophic failure of 7/16” RS OSB, which is used primarily for residential construction of roofs, may result in severe injury or death of construction workers. The liability and risk to OSB manufacturers from severe injury or death to construction workers from an OSB panel failure during construction can result in extreme loss of market share and significant financial losses.
In reliability data, “multiple failure modes” are common. Simulated data from a mixture of two two-parameter Weibull distributions are produced to mimic the multiple failure modes of the reliability data. A forced median censored method is adopted to estimate lower percentiles of the simulated data. Results of the simulation indicate that the lower percentiles estimated with the median censored method are relatively close to the true parametric percentiles when compared to those estimated without the median censored method. I conclude that the median censoring method is a useful tool for improving estimation of the lower percentiles for OSB panel failure.
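The simulation setup described above can be sketched as follows: draw strengths from a mixture of two two-parameter Weibull distributions, then mark every observation above the sample median as right-censored at the median so that a subsequent fit concentrates on the lower tail. This is a sketch under stated assumptions: the mixing probability, the scale and shape values, and in particular the reading of "median censoring" as right-censoring at the sample median are assumptions made here, not details confirmed by the abstract, and the censored-likelihood fit itself is omitted.

```python
import math
import random


def simulate_mixture(n, p, scale1, shape1, scale2, shape2, rng):
    """Draw n values: with probability p from Weibull(scale1, shape1), else Weibull(scale2, shape2)."""
    sample = []
    for _ in range(n):
        if rng.random() < p:
            sample.append(rng.weibullvariate(scale1, shape1))
        else:
            sample.append(rng.weibullvariate(scale2, shape2))
    return sample


def empirical_percentile(sample, q):
    """Smallest order statistic with at least a fraction q of the data at or below it."""
    ordered = sorted(sample)
    k = max(0, min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1))
    return ordered[k]


def median_censor(sample):
    """Return (value, observed) pairs; values above the sample median are
    replaced by the median and flagged as censored (assumed interpretation)."""
    ordered = sorted(sample)
    median = ordered[len(ordered) // 2]
    return [(min(x, median), x <= median) for x in sample]
```

A censored-data maximum likelihood fit would then use the `observed` flag to decide whether each value contributes a density term or a survival term.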
|
27 |
An Applied Statistical Reliability Analysis of the Modulus of Elasticity and the Modulus of Rupture for Wood-Plastic Composites
Perhac, Diane Goodman 01 August 2007
Wood-plastic composites (WPC) are materials comprised of wood fiber within a thermoplastic matrix and are a growing and important source of alternative wood products in the forest products industry. WPC is gaining market share in the building industry because of durability/maintenance advantages of WPC over traditional wood products and because of the removal of chromated copper arsenate (CCA) pressure-treated wood from the market. The reliability methods outlined in this thesis can be used to improve the quality of WPC and lower manufacturing costs by reducing raw material inputs and minimizing WPC waste. Statistical methods are described for analyzing both tensile strength and bending measures of WPC. These key measures include stiffness (tangent modulus of elasticity: MOE) and flexural strength (modulus of rupture: MOR) results from both tensile strength and bending tests. As with any real data analysis, the possibility of outliers is assessed and addressed. With this data, different WPC subsets are evaluated with and without the presence of a coupling agent. Separate subsets without outliers are also reviewed. Descriptive statistics, histograms, probability plots, and survival curves from these test data are presented and interpreted. To provide a more objective assessment of appropriate parametric modeling, Akaike’s Information Criterion is used in conjunction with probability plotting. Selection of the best underlying distribution for the data is an important result that may be used to further explore and analyze the given data. In this thesis, these underlying distributional assumptions are utilized to better understand the product’s lower percentiles.
These lower percentiles provide practitioners with an evaluation of the product’s early failures along with providing information for specification limits, warranty, and cost analysis. Estimation of lower percentiles is sometimes difficult, since substantive data is often sparse in the lower tails. Bootstrap techniques provide important solutions for confidence interval assessments of these percentiles. Bootstrapping is a computer intensive resampling method that may be used for both parametric and nonparametric models. This thesis briefly describes several bootstrapping methods and applies these methods to appraise MOE and MOR test results on sampled WPC. The reliability and bootstrapping methods outlined in this thesis may directly benefit WPC manufacturers through a better evaluation of strength and stiffness measures, which can lead to process improvements with enhanced reliability, thereby creating greater manufacturer and customer satisfaction.
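As an illustration of how bootstrapping yields a confidence interval for a lower percentile, here is a minimal sketch of the nonparametric percentile bootstrap. The thesis describes several bootstrap methods; this sketch shows only one common variant, and the function names, the number of resamples, and the confidence level are assumptions chosen for the example.

```python
import math
import random


def percentile(sample, q):
    """Smallest order statistic with at least a fraction q of the data at or below it."""
    ordered = sorted(sample)
    k = max(0, min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1))
    return ordered[k]


def bootstrap_percentile_ci(sample, q, n_boot, level, rng):
    """Percentile-bootstrap confidence interval for the q-th percentile.

    Resamples the data with replacement n_boot times, recomputes the
    percentile each time, and returns the empirical (1 - level) / 2 and
    1 - (1 - level) / 2 quantiles of those estimates.
    """
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(percentile(resample, q))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(math.floor(tail * (n_boot - 1)))]
    upper = estimates[int(math.ceil((1.0 - tail) * (n_boot - 1)))]
    return lower, upper
```

A call such as `bootstrap_percentile_ci(strengths, 0.05, 2000, 0.95, random.Random(1))`, with `strengths` a hypothetical list of MOE or MOR measurements, would give an approximate 95% interval for the fifth percentile.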
|
28 |
Examining Regression Analysis Beyond the Mean of the Distribution using Quantile Regression: A Case Study of Modeling the Internal Bond of Medium Density Fiberboard using Multiple Linear Regression and Quantile Regression with an Example of Reliability Methods using R Software
Shaffer, Leslie Brooke 01 August 2007
The thesis examines the causality of the central tendency of the Internal Bond (IB) of Medium Density Fiberboard (MDF) with predictor variables from the MDF manufacturing process. Multiple linear regression (MLR) models are developed using a best model criterion for all possible subsets of IB for four MDF thickness products reported in inches, e.g., 0.750”, 0.625”, 0.6875”, and 0.500”. Quantile Regression (QR) models of the median IB are also developed.
The adjusted coefficients of determination (R²a) of the MLR models range from 72% with 53 degrees of freedom to 81% with 42 degrees of freedom. The Root Mean Square Errors (RMSE) range from 6.05 pounds per square inch (p.s.i.) to 6.23 p.s.i. A common independent variable for the 0.750” and 0.625” products is “Refiner Resin Scavenger %”. QR models for 0.750” and 0.625” have similar slopes for the median and average but different slopes for the 5th and 95th percentiles. “Face Humidity” is a common predictor for the 0.6875” and 0.500” products. QR models for 0.6875” and 0.500” indicate different slopes for the median and average with different slopes for the outer 5th and 95th percentiles.
The MLR and QR validation models for the 0.750”, 0.625” and 0.6875” product types have coefficients of determination for the validation data set (R²validation) ranging from 40% to 60% and RMSEP ranging from 26.5 p.s.i. to 27.85 p.s.i. The MLR validation model for the 0.500” product has an R²validation and RMSEP of 64% and 23.63 p.s.i., while the QR validation model has an R²validation and RMSEP of 66% and 19.18 p.s.i. The IB for 0.500” has a departure from normality that is reflected in the results of the validation models. The thesis results provide further evidence that QR is a more defensible method for modeling the central tendency of a response variable when the response variable departs from normality.
The use of QR provides MDF manufacturers with an opportunity to examine causality beyond the mean of the distribution. Examining the lower and upper percentiles of a distribution may provide significant insight for identifying process variables that influence IB failure or extreme IB strength.
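Quantile regression differs from least squares in the loss it minimizes: instead of squared error it uses the asymmetric "check" (pinball) loss, which is what lets it target the median or the 5th and 95th percentiles. The sketch below shows that loss and, for a constant-only model, that its minimizer is a sample quantile. The thesis itself uses R software; this Python sketch and its function names are illustrative assumptions, not the code used in the study.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau: tau * r if r >= 0, else (tau - 1) * r."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_pinball_loss(y, predictions, tau):
    """Sum of check losses over observed values y and model predictions."""
    return sum(pinball_loss(yi - pi, tau) for yi, pi in zip(y, predictions))


def best_constant(y, tau):
    """Best constant prediction under the check loss, searched over the data values.

    For a constant-only model a minimizer can always be found among the
    observations, and it is a sample tau-quantile.
    """
    return min(sorted(y),
               key=lambda c: total_pinball_loss(y, [c] * len(y), tau))
```

With `tau = 0.5` the loss is proportional to absolute error, so `best_constant([1, 2, 3, 4, 100], 0.5)` picks the median 3 and is unaffected by the extreme value, which is one reason median regression is attractive when the response departs from normality.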
|
29 |
Investigating students' understandings of probability : a study of a grade 7 classroom
Abu-Bakare, Veda 11 1900
This research study probes students’ understandings of various aspects of probability in a 3-week Probability unit in a Grade 7 classroom. Informing this study are the perspectives of constructivism and sociocultural theory which underpin the contemporary reform in mathematics education as codified in the NCTM standards and orient much of the teaching and learning of mathematics in today’s classrooms. Elements of culturally responsive pedagogy were also adopted within the research design.
The study was carried out in an urban school where I collaborated with the teacher and students as co-teacher and researcher. As the population of this school was predominantly Aboriginal, the lessons included discussion of the tradition and significance of Aboriginal games of chance and an activity based on one of these games. Data sources included the responses in the pre- and post-tests, fieldnotes of the lessons, and audiotapes of student interviews.
The key findings of the study are that the students had some understanding of formal probability theory with strongly-held persistent alternative thinking, some of which did not fit the informal conceptions of probability noted in the literature such as the outcome approach and the gambler’s fallacy. This has led to the proposal of a Personal Probability model in which the determination of a probability or a probability decision is a weighting of components such as experience, intuition and judgment, some normative thinking, and personal choice, beliefs and attitudes. Though the alternative understandings were explored in interviews and resolved to some degree, the study finds that the probability understandings of students in this study are still fragile and inconsistent. Students demonstrated marked interest in tasks that combined mathematics, culture and community.
This study presents evidence that the current prescribed learning outcomes in the elementary grades are too ambitious and best left to the higher grades. The difficulties in the teaching and learning of the subject induced by the nuances and challenges of the subject as well as the dearth of time that is needed for an adequate treatment further direct that instructional resources at this level be focused on deepening and strengthening the basic ideas.
|
30 |
Probabilistic models and reliability analysis of scour depth around bridge piers
Bolduc, Laura Christine 02 June 2009
Scour at a bridge pier is the formation of a hole around the pier due to the erosion of soil by flowing water; this hole in the soil reduces the carrying capacity of the foundation and the pier. Excessive scour can cause a bridge pier to fail without warning. Current predictions of the depth of the scour hole around a bridge pier are based on deterministic models. This paper considers two alternative deterministic models to predict scour depth. For each deterministic model, a corresponding probabilistic model is constructed using a Bayesian statistical approach and available field and experimental data. The developed probabilistic models account for the estimate bias in the deterministic models and for the model uncertainty. Parameters from both prediction models are compared to determine their accuracy. The developed probabilistic models are used to estimate the probability of exceedance of scour depth around bridge piers. The method is demonstrated on an example bridge pier. The values of the model parameters suggest that the maximum scour depth predicted by the deterministic HEC-18 Sand and HEC-18 Clay models tends to be conservative. Evidence is also found that the applicability of the HEC-18 Clay method is not limited to clay but can also be used for other soil types. The main advantage of the HEC-18 Clay method with respect to the HEC-18 Sand method is that it predicts the depth of scour as a function of time and can be used to estimate the final scour at the end of the design life of a structure. The paper addresses model uncertainties for given hydrologic variables. Hydrologic uncertainties have been presented in a separate paper.
|