A New Method Of Resampling Testing Nonparametric Hypotheses: Balanced Randomization Tests. January 2014
Background: Resampling methods such as the Monte Carlo (MC) and bootstrap approach (BA) are very flexible tools for statistical inference. They are generally used in experiments with small sample sizes or where the assumptions of parametric tests are not met. They are also used in situations where expressions for the properties of complex estimators are statistically intractable. However, the MC and BA methods require relatively large random samples to estimate the parameters of the full permutation (FP), or exact, distribution.
Objective: The objective of this research study was to develop an efficient computational resampling method that compares two population parameters using a balanced and controlled sampling design. The application of the new method, the balanced randomization (BR) method, is discussed using microarray data, where sample sizes are generally small.
Methods: Multiple datasets were simulated from real data to compare the accuracy and efficiency of the three methods (BR, MC, and BA). Datasets, probability distributions, parameters, and sample sizes were varied in the simulation. The correlation between the exact p-value and the p-values generated by simulation provides a measure of accuracy and consistency for comparing the methods. Sensitivity, specificity, the power function, and false negative and false positive rates were also compared across methods using graphical and multivariate analyses.
Results and Discussions: The correlations between the exact p-value and the p-values estimated from simulation are higher for BR and MC (increasing somewhat with increasing sample size) and much lower for BA, most pronounced for skewed distributions (lognormal, exponential). Furthermore, the relative proportion of 95%/99% CIs containing the true p-value was 3%/1.3% for BR vs. MC (p<0.0001) and 20%/15% for BR vs. BA (p<0.0001). The sensitivity, specificity, and power function of the BR method were shown to have a slight advantage over those of MC and BA in most situations.
As an example, the BR method was applied to a microarray study to discuss significantly differentially expressed genes. / acase@tulane.edu
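The relationship between the full permutation (FP) distribution and its Monte Carlo approximation, which the abstract uses as the yardstick for accuracy, can be illustrated with a minimal Python sketch. It is not the BR method, whose design the abstract does not specify. The test statistic (absolute difference in means), the function names, and the default number of resamples are all assumptions made for illustration.

```python
import itertools
import random


def mean_diff(x, y):
    """Difference between the sample means of x and y."""
    return sum(x) / len(x) - sum(y) / len(y)


def exact_permutation_pvalue(x, y):
    """Two-sided p-value from the full permutation (FP) distribution.

    Enumerates every way of assigning len(x) of the pooled values to the
    first group, so it is only feasible for small samples.
    """
    pooled = list(x) + list(y)
    n = len(x)
    observed = abs(mean_diff(x, y))
    total = 0
    extreme = 0
    for idx in itertools.combinations(range(len(pooled)), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(gx, gy)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def monte_carlo_pvalue(x, y, n_resamples=2000, seed=0):
    """Monte Carlo estimate of the same p-value from random relabelings."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)
    observed = abs(mean_diff(x, y))
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if abs(mean_diff(pooled[:n], pooled[n:])) >= observed - 1e-12:
            extreme += 1
    # Counting the observed labeling itself keeps the estimate above zero.
    return (extreme + 1) / (n_resamples + 1)
```

For x = [1, 2, 3] and y = [4, 5, 6], only 2 of the 20 possible group assignments are as extreme as the observed one, so the exact p-value is 0.1; the Monte Carlo estimate approaches that value as the number of resamples grows.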
Contributions to the theory of unequal probability sampling. Lundquist, Anders. January 2009
This thesis consists of five papers related to the theory of unequal probability sampling from a finite population. Generally, it is assumed that we wish to make model-assisted inference, i.e. the inclusion probability for each unit in the population is prescribed before the sample is selected. The sample is then selected using some random mechanism, the sampling design. The thesis focuses mostly on three particular unequal probability sampling designs: the conditional Poisson (CP) design, the Sampford design, and the Pareto design. They have different advantages and drawbacks. The CP design is a maximum entropy design, but it is difficult to determine sampling parameters which yield prescribed inclusion probabilities. The Sampford design yields prescribed inclusion probabilities but may be hard to sample from. The Pareto design makes sample selection very easy, but it is very difficult to determine sampling parameters which yield prescribed inclusion probabilities. These three designs are compared probabilistically and found to be close to each other under certain conditions. In particular, the Sampford and Pareto designs are probabilistically close to each other. Some effort is devoted to analytically adjusting the CP and Pareto designs so that they yield inclusion probabilities close to the prescribed ones. The results of the adjustments are in general very good. Some iterative procedures are suggested to improve the results even further. Further, balanced unequal probability sampling is considered. In this kind of sampling, samples are given a positive probability of selection only if they satisfy some balancing conditions. The balancing conditions are given by information from auxiliary variables. Most of the attention is devoted to a slightly less general but practically important case. In this case, too, the inclusion probabilities are prescribed in advance, making the choice of sampling parameters important.
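To make the contrast between the designs concrete, here is a minimal Python sketch of the standard selection procedures for two of them: Pareto sampling, which ranks units by an odds ratio of a uniform draw and a target probability, and conditional Poisson sampling implemented by simple rejection. It is not taken from the thesis. The function names, the `max_tries` cap, and the use of the target probabilities directly as sampling parameters are assumptions, and under that last assumption the realized inclusion probabilities only approximate the targets, which is the difficulty the abstract describes.

```python
import random


def pareto_sample(lam, n, rng):
    """Pareto sampling of n units given parameters lam[i] in (0, 1).

    Each unit gets the ranking value Q_i = [U_i / (1 - U_i)] / [lam_i / (1 - lam_i)]
    with U_i uniform on [0, 1); the n units with the smallest Q_i are selected.
    """
    ranked = []
    for i, l in enumerate(lam):
        u = rng.random()
        q = (u / (1.0 - u)) / (l / (1.0 - l))
        ranked.append((q, i))
    ranked.sort()
    return sorted(i for _, i in ranked[:n])


def conditional_poisson_sample(p, n, rng, max_tries=100000):
    """Conditional Poisson sampling by rejection.

    Draws independent Bernoulli(p[i]) inclusion indicators and accepts
    the draw only when exactly n units are included.
    """
    for _ in range(max_tries):
        sample = [i for i, pi in enumerate(p) if rng.random() < pi]
        if len(sample) == n:
            return sample
    raise RuntimeError("no sample of the requested size within max_tries")
```

A call such as `pareto_sample([0.2, 0.4, 0.6, 0.8], 2, random.Random(1))` returns two distinct unit indices; repeating the draw many times gives empirical inclusion frequencies that can be compared with the targets.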
A complication which arises in the context of choosing sampling parameters is that certain probability distributions need to be calculated, and exact calculation turns out to be practically impossible except in very small cases. It is proposed that Markov Chain Monte Carlo (MCMC) methods be used for obtaining approximations to the relevant probability distributions, and also for sample selection. In general, MCMC methods for sample selection do not occur very frequently in the sampling literature today, making this a fairly novel idea.
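One generic way MCMC can be used for sample selection is sketched below in Python; it is an illustration under stated assumptions, not the procedure developed in the thesis. The sketch targets a fixed-size conditional Poisson design, where the probability of a sample is proportional to the product of the odds p_i / (1 - p_i) of its units, and uses a Metropolis chain that proposes swapping one selected unit for one unselected unit. The function name, the starting sample, and the chain length are hypothetical choices, and balancing conditions are not handled.

```python
import random


def mcmc_cp_sample(p, n, n_steps=1000, seed=0):
    """Approximate draw from a conditional Poisson design of size n.

    Requires 0 < n < len(p) and every p[i] in (0, 1). The target gives a
    sample s probability proportional to the product of w[i] over i in s,
    where w[i] = p[i] / (1 - p[i]).
    """
    rng = random.Random(seed)
    w = [pi / (1.0 - pi) for pi in p]
    inside = list(range(n))            # arbitrary starting sample
    outside = list(range(n, len(p)))
    for _ in range(n_steps):
        a = rng.randrange(len(inside))
        b = rng.randrange(len(outside))
        i, j = inside[a], outside[b]
        # The swap proposal is symmetric, so the Metropolis acceptance
        # probability is the ratio of target weights, capped at 1.
        if rng.random() < min(1.0, w[j] / w[i]):
            inside[a], outside[b] = j, i
    return sorted(inside)
```

Running the chain from many seeds and tabulating how often each unit appears gives a simulation-based approximation to the inclusion probabilities, which is the kind of quantity that is hard to compute exactly for larger populations.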