Global ETD Search

1	A STUDY OF SHUFFLING CARDS AND STOPPING TIMES FOR RANDOMNESS Lin, Chia-Hui 19 July 2006 In this paper we analyze how many shuffles are necessary to get close to randomness for a deck of n cards. Aldous (1983) shows that approximately 8.55 shuffles (for n=52) are necessary when n is large. Bayer and Diaconis (1992) use the variation distance as a measure of randomness to analyze the most commonly used method of shuffling cards, and claim that 7 shuffles are enough when n=52. We provide another way to measure the distance from randomness for repeated shuffles. The proposed method consists of a goodness-of-fit test and a simple simulation. The simulation results lead to a conclusion similar to that of Bayer and Diaconis. randomness goodness of fit test shuffle times simulation
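The abstract above does not say which shuffle model or test statistic the thesis uses, so the following is only a minimal sketch of the general idea. It assumes the Gilbert-Shannon-Reeds riffle model and, as one illustrative choice, a Pearson chi-square statistic on the final position of the card that started on top. The function names `riffle_shuffle` and `chi_square_top_card` are hypothetical.

```python
import random


def riffle_shuffle(deck, rng):
    """Apply one riffle shuffle under the Gilbert-Shannon-Reeds model (assumed here)."""
    n = len(deck)
    # Cut the deck at a Binomial(n, 1/2) position.
    cut = sum(rng.random() < 0.5 for _ in range(n))
    left, right = deck[:cut], deck[cut:]
    out = []
    i = j = 0
    while i < len(left) or j < len(right):
        remaining_left = len(left) - i
        remaining_right = len(right) - j
        # Take the next card from a packet with probability proportional to its size.
        if rng.random() * (remaining_left + remaining_right) < remaining_left:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out


def chi_square_top_card(n_cards, n_shuffles, n_trials, seed=0):
    """Pearson chi-square statistic for the final position of the original top card.

    Under a perfectly random deck every position is equally likely, so a
    large statistic indicates that the deck is still far from random.
    """
    rng = random.Random(seed)
    counts = [0] * n_cards
    for _ in range(n_trials):
        deck = list(range(n_cards))
        for _ in range(n_shuffles):
            deck = riffle_shuffle(deck, rng)
        counts[deck.index(0)] += 1
    expected = n_trials / n_cards
    return sum((c - expected) ** 2 / expected for c in counts)
```

Comparing the statistic for an increasing number of shuffles against a chi-square reference with n_cards - 1 degrees of freedom gives a rough picture of when the deck stops looking non-random. A statistic based on a single card is weaker than the variation distance over all permutations, so it may suggest fewer shuffles than are truly needed.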
2	Goodness-of-Fit and Change-Point Tests for Functional Data Gabrys, Robertas 01 May 2010 A test for independence and identical distribution of functional observations is proposed in this thesis. To reduce dimension, curves are projected on the most important functional principal components. Then a test statistic based on lagged cross-covariances of the resulting vectors is constructed. We show that this dimension reduction step introduces asymptotically negligible terms, i.e. the projections behave asymptotically as iid vector-valued observations. A complete asymptotic theory based on correlations of random matrices, functional principal component expansions, and Hilbert space techniques is developed. The test statistic has a chi-square asymptotic null distribution. Two inferential tests for error correlation in the functional linear model are put forward. To construct them, finite dimensional residuals are computed in two different ways, and then their autocorrelations are suitably defined. From these autocorrelation matrices, two quadratic forms are constructed whose limiting distributions are chi-squared with known numbers of degrees of freedom (different for the two forms). A test for detecting a change point in the mean of functional observations is developed. The null distribution of the test statistic is asymptotically pivotal with a well-known asymptotic distribution. A comprehensive asymptotic theory for the estimation of a change-point in the mean function of functional observations is developed. The procedures developed in this thesis can be readily computed using the R package fda. All theoretical insights obtained in this thesis are confirmed by simulations and illustrated by real-life data examples. Correlation test Functional data analysis Goodness-of-fit test Statistics and Probability
3	Survival analysis in the presence of independent or dependent censoring Arachchige, Sakie Jaladha 13 December 2024 This dissertation has three parts. The first part proposes a two-stage estimation procedure for a copula-based model with semi-competing risks data, where the nonterminal event is subject to dependent censoring by the terminal event. Under a copula-based model, the marginal survival functions of individual event times are specified by semiparametric transformation models, and a parametric copula function specifies the between-event dependence. The parameters associated with the marginal of the terminal event are estimated first; the marginal parameters for the nonterminal event time and the copula parameter are then estimated by maximizing a pseudo-likelihood function based on the joint distribution of the bivariate event times. We derived the asymptotic properties of the proposed estimator and provided an analytic variance estimator for inference. We showed that our approach leads to consistent estimates with less computational cost and more robustness compared to the one-stage procedure developed by Chen (2012). In addition, our approach demonstrates better finite-sample performance than another existing two-stage estimation method proposed by Zhu et al. (2021). The second part develops a goodness-of-fit test for the copula specification under semiparametric copula models with semi-competing risks data. We constructed an information ratio (IR) statistic by comparing consistent estimates of the two information matrices, the sensitivity matrix and the variability matrix. The information matrices are derived from the log-likelihood function, which is a function of the marginal distribution of the terminal event time, the marginal distribution of the time to the first event, and the copula parameter. We established the asymptotic distribution of the IR statistic and examined the finite-sample performance of the IR test via a simulation study. The third part develops a class of models to characterize the effects of factors that vary with the age at baseline and the age at the event. This project is motivated by the Childhood Cancer Survivor Study. The age-specific effects of the covariates are estimated via an inverse probability weighted kernel smoothing method. We conducted simulation studies to evaluate the performance of the proposed estimator.
4	The Distribution of Cotton Fiber Length Belmasrour, Rachid 05 August 2010 By testing a fiber beard, certain cotton fiber length parameters can be obtained rapidly. This is the method used by the High Volume Instrument (HVI). This study aims to explore approaches to, and obtain inference on, the length distributions of HVI beard samples in order to develop new methods that can help us find the distribution of original fiber lengths and further improve HVI length measurements. First, mathematical functions were sought to describe three different types of length distributions related to the beard method as used in HVI: cotton fiber lengths of the original fiber population before being picked by the HVI Fibrosampler, fiber lengths picked by the HVI Fibrosampler, and the fiber beard's projecting portion that is actually scanned by HVI. Eight sets of cotton samples with a wide range of fiber lengths are selected and tested on the Advanced Fiber Information System (AFIS). The measured single fiber length data are used for finding the underlying theoretical length distributions, and thus can be considered as the population distributions of the cotton samples. In addition, fiber length distributions by number and by weight are discussed separately. In both cases a mixture of two Weibull distributions shows a good fit to the fiber length data. To confirm the findings, Kolmogorov-Smirnov goodness-of-fit tests were conducted. Furthermore, various length parameters such as Mean Length (ML) and Upper Half Mean Length (UHML) are compared between the original distribution from the experimental data and the fitted distributions. The resulting fiber length distributions are discussed using Partial Least Squares (PLS) regression, where the distribution of the original fiber length is estimated from the distribution of the projected one. Fiber Beard Kolmogorov-Smirnov goodness-of-fit test Mixture of Weibull Distributions Partial Least Squares
5	A Novel Approach to the Analysis of Nonlinear Time Series with Applications to Financial Data Lee, Jun Bum 2012 May 1900 Spectral analysis is an important tool in time series analysis, and the spectral density plays a crucial role in it. However, one limitation of the spectral density is that, among the several dependence measures in time series data, it reflects only the covariance structure. To overcome this restriction, we define two spectral densities, the quantile spectral density and the association spectral density. The quantile spectral density can model the pairwise dependence structure and provide identification of nonlinear time series, and the association spectral density allows detecting periodicities on different parts of the domain of the time series. We propose estimators for the quantile spectral density and the association spectral density and derive their sampling properties, including asymptotic normality. Furthermore, we use the quantile spectral density to develop a goodness-of-fit test for time series and explain how this test can be used for comparing the sequential dependence structure of two time series. The asymptotic sampling properties of the test statistic are derived under the null and alternative hypotheses, and a bootstrap procedure is suggested to obtain a finite-sample approximation. The method is illustrated with simulations and some real data examples. Besides the exploration of the new spectral densities, we consider general quadratic forms of alpha-mixing time series and derive asymptotic normality of these forms under relatively weak assumptions. goodness-of-fit test nonlinear time series quantile spectral density association spectral density
6	Evaluating Variance of the Model Credibility Index Xiao, Yan 30 November 2007 The model credibility index is defined as the sample size at which the power of rejection equals 0.5. It applies the reasoning of goodness-of-fit testing and uses a one-number summary statistic as an assessment tool in a false model world. The estimation of the model credibility index involves a bootstrap resampling technique. To assess the consistency of the estimator of the model credibility index, we instead study the variance of the power achieved at a fixed sample size. An improved subsampling method is proposed to obtain an unbiased estimator of the variance of power. We present two examples to illustrate the mechanics of building the model credibility index and to estimate its error in model selection. One example is a two-way independence model with the Pearson chi-square test, and the other is a multi-dimensional logistic regression model with the likelihood ratio test. Consistency of the estimation Bootstrap resampling Model credibility index Goodness-of-fit test Mathematics
7	Power Comparison of Some Goodness-of-fit Tests Liu, Tianyi 06 July 2016 Several goodness-of-fit tests are in common use, such as the Kolmogorov-Smirnov test, the Cramer-von Mises test, and the Anderson-Darling test. In addition, a new goodness-of-fit test named the G test was proposed by Chen and Ye (2009). The purpose of this thesis is to compare the performance of some goodness-of-fit tests by comparing their power. A goodness-of-fit test is usually used to judge whether or not the underlying population distribution differs from a specific distribution. This research focuses on testing whether the underlying population distribution is an exponential distribution. SAS/IML is used to conduct the statistical simulation. Alternative distributions such as the triangle distribution and the V-shaped triangle distribution are used. Monte Carlo simulation leads to the conclusion that the Kolmogorov-Smirnov test performs better than the G test in many cases, while the G test performs well in some cases. Goodness-of-fit test Exponential distribution Power comparison Monte-Carlo simulation Statistical Methodology
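As a rough illustration of how a Monte Carlo power comparison of this kind can be set up, the sketch below estimates the power of the Kolmogorov-Smirnov test for an exponential null. It is written in Python rather than the SAS/IML used in the thesis, it assumes a fully specified null (rate 1) rather than an estimated rate, and it does not implement the G test. The function names and the parameter choices are hypothetical.

```python
import random


import math


def ks_statistic_exponential(sample, rate=1.0):
    """Kolmogorov-Smirnov distance between the sample and an Exponential(rate) CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = 1.0 - math.exp(-rate * x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def monte_carlo_critical_value(n, alpha, reps, rng):
    """Simulate the null distribution of the statistic and return its (1 - alpha) quantile."""
    stats = sorted(
        ks_statistic_exponential([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return stats[min(reps - 1, int((1 - alpha) * reps))]


def estimate_power(draw, n, crit, reps, rng):
    """Fraction of simulated samples, drawn with draw(rng), that reject the null."""
    rejections = sum(
        ks_statistic_exponential([draw(rng) for _ in range(n)]) > crit
        for _ in range(reps)
    )
    return rejections / reps
```

For example, passing `lambda r: r.triangular(0.0, 2.0, 1.0)` as `draw` estimates the power against one triangle-shaped alternative, while passing `lambda r: r.expovariate(1.0)` should return a value close to alpha. Running the same loop with a second test statistic would give the side-by-side comparison described in the abstract.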
8	Distributed Inference for Degenerate U-Statistics with Application to One and Two Sample Test Atta-Asiamah, Ernest January 2020 In many hypothesis testing problems, such as one-sample and two-sample test problems, the test statistics are degenerate U-statistics. One of the challenges in practice is the computation of U-statistics for a large sample size. Moreover, for degenerate U-statistics, the limiting distribution is a mixture of weighted chi-squares involving the eigenvalues of the kernel of the U-statistic. As a result, it is not straightforward to construct the rejection region based on this asymptotic distribution. In this research, we aim to reduce the computational complexity of degenerate U-statistics and propose an easy-to-calibrate test statistic by using the divide-and-conquer method. Specifically, we randomly partition the full n data points into k_n even disjoint groups, compute the U-statistic on each group, and combine the results by averaging to get a statistic T_n. We proved that the statistic T_n has the standard normal distribution as its limiting distribution. In this way, the running time is reduced from O(n^m) to O(n^m/k_n^m), where m is the order of the one-sample U-statistic. In addition, for a given significance level, it is easy to construct the rejection region. We apply our method to the goodness-of-fit test and the two-sample test. The simulation and real data analysis show that the proposed test can achieve high power and fast running time for both one-sample and two-sample tests. degenerate and non degenerate divide-and-conquer goodness-of-fit test hypothesis testing maximum mean discrepancy U-statistics
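The partition-and-average step described above can be sketched as follows. This is an illustration only: it uses an order-2 kernel, drops leftover points so that the groups are even, and omits the normalization that the thesis uses to obtain a standard normal limit for T_n, which the abstract does not specify. The function names are hypothetical.

```python
import itertools
import random


def u_statistic(data, kernel):
    """Order-2 U-statistic: the average of kernel(x, y) over all unordered pairs."""
    pairs = list(itertools.combinations(data, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def divide_and_conquer_u(data, kernel, k, rng):
    """Randomly split data into k even groups and average the per-group U-statistics."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    size = len(shuffled) // k  # leftover points are dropped to keep groups even
    if size < 2:
        raise ValueError("each group needs at least two points")
    groups = [shuffled[g * size:(g + 1) * size] for g in range(k)]
    return sum(u_statistic(group, kernel) for group in groups) / k
```

With the kernel `lambda x, y: (x - y) ** 2 / 2`, the full U-statistic equals the unbiased sample variance, which makes the sketch easy to check. For 200 points and k = 10, the full statistic evaluates 19,900 pairs while the partitioned version evaluates 1,900, and the per-group computations are independent of one another, so they can run in parallel.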
9	Statistical Inferences under a semiparametric finite mixture model Zhang, Shiju January 2005 (has links) No description available. Statistics biased sampling, EM algorithm, empirical likelihood, finite mixture model, goodness-of-fit test, partial likelihood
10	An Alternative Goodness-of-fit Test for Normality with Unknown Parameters Shi, Weiling 14 November 2014 Goodness-of-fit tests have been studied by many researchers. Among them, an alternative statistical test for uniformity was proposed by Chen and Ye (2009). The test was used by Xiong (2010) to test normality for the case in which both the location parameter and the scale parameter of the normal distribution are known. The purpose of the present thesis is to extend the result to the case in which the parameters are unknown. A table of critical values of the test statistic is obtained using Monte Carlo simulation. The performance of the proposed test is compared with that of the Shapiro-Wilk test and the Kolmogorov-Smirnov test. Monte Carlo simulation results show that the proposed test performs better than the Kolmogorov-Smirnov test in many cases. The Shapiro-Wilk test is still the most powerful test, although in some cases the test proposed in the present research performs better. Goodness-of-fit test for normality Monte Carlo Simulation G test Applied Statistics Statistical Models Statistical Theory Statistics and Probability