1 |
On Methods for Real Time Sampling and Distributions in Sampling
Meister, Kadri January 2004
This thesis is composed of six papers, all dealing with the issue of sampling from a finite population. We consider two different topics: real time sampling and distributions in sampling. The main focus is on Papers A–C, where a somewhat special sampling situation referred to as real time sampling is studied. Here a finite population passes or is passed by the sampler. No list of the population units is available, and for every unit the sampler must decide whether or not to sample it at the moment he/she meets the unit. We focus on the problem of finding suitable sampling methods for this situation, and some new methods are proposed. In all of them, we try to avoid sampling units close to each other too often, i.e. we sample with negative dependencies. Here the correlations between the inclusion indicators, called sampling correlations, play an important role. The new methods are evaluated using a simulation study and asymptotic calculations. We study the new methods mainly in comparison to standard Bernoulli sampling, with the sample mean as the estimator of the population mean. Assuming a stationary population model with decreasing autocorrelations, we find the form of the nearly optimal sampling correlations by asymptotic calculations, under some restrictions on the sampling correlations. We gain most in efficiency using methods that give negatively correlated indicator variables, such that the correlation sum is small and the sampling correlations are equal for units up to lag m apart and zero afterwards. Since the proposed methods are based on sequences of dependent Bernoulli variables, an important part of the study is devoted to the problem of how to generate such sequences. The correlation structure of these sequences is also studied. The remainder of the thesis consists of three diverse papers, Papers D–F, where distributional properties in survey sampling are considered.
In Paper D the concern is with unified statistical inference. Here both the model for the population and the sampling design are taken into account when considering the properties of an estimator. In this paper the framework of the sampling design as a multivariate distribution is used to outline two-phase sampling. In Paper E, we give probability functions for different sampling designs such as conditional Poisson, Sampford and Pareto designs. Methods to sample by using the probability function of a sampling design are discussed. Paper F focuses on the design-based distributional characteristics of the π-estimator and its variance estimator. We give formulae for the higher-order moments and cumulants of the π-estimator. Formulae for the design-based variance of the variance estimator, and for the covariance of the π-estimator and its variance estimator, are presented.
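For readers unfamiliar with the π-estimator mentioned above, the standard textbook definitions can be written as follows. The notation is an assumption for illustration and is not taken from the thesis: U is the finite population, s the sample, y_k the study variable, π_k the first-order inclusion probability of unit k, and π_kl the second-order inclusion probability of units k and l (with π_kk = π_k). The variance estimator shown requires π_kl > 0 for all pairs.

```latex
% pi-estimator (Horvitz-Thompson estimator) of the population total t = \sum_{k \in U} y_k
\hat{t}_{\pi} = \sum_{k \in s} \frac{y_k}{\pi_k}

% design-based variance of the pi-estimator
V(\hat{t}_{\pi}) = \sum_{k \in U} \sum_{l \in U}
  (\pi_{kl} - \pi_k \pi_l) \, \frac{y_k}{\pi_k} \, \frac{y_l}{\pi_l}

% the usual unbiased variance estimator, valid when all \pi_{kl} > 0
\hat{V}(\hat{t}_{\pi}) = \sum_{k \in s} \sum_{l \in s}
  \frac{\pi_{kl} - \pi_k \pi_l}{\pi_{kl}} \, \frac{y_k}{\pi_k} \, \frac{y_l}{\pi_l}
```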
|
2 |
On unequal probability sampling designs
Grafström, Anton January 2010
The main objective in sampling is to select a sample from a population in order to estimate some unknown population parameter, usually a total or a mean of some interesting variable. When the units in the population do not have the same probability of being included in a sample, it is called unequal probability sampling. The inclusion probabilities are usually chosen to be proportional to some auxiliary variable that is known for all units in the population. When unequal probability sampling is applicable, it generally gives much better estimates than sampling with equal probabilities. This thesis consists of six papers that treat unequal probability sampling from a finite population of units. A random sample is selected according to some specified random mechanism called the sampling design. For unequal probability sampling there exist many different sampling designs. The choice of sampling design is important since it determines the properties of the estimator that is used. The main focus of this thesis is on evaluating and comparing different designs. Often it is preferable to select samples of a fixed size and hence the focus is on such designs. It is also important that a design has a simple and efficient implementation in order to be used in practice by statisticians. Some effort has been made to improve the implementation of some designs. In Paper II, two new implementations are presented for the Sampford design. In general a sampling design should also have a high level of randomization. A measure of the level of randomization is entropy. In Paper IV, eight designs are compared with respect to their entropy. A design called adjusted conditional Poisson has maximum entropy, but it is shown that several other designs are very close in terms of entropy. A specific situation called real time sampling is treated in Paper III, where a new design called correlated Poisson sampling is evaluated. In real time sampling the units pass the sampler one by one. 
Since each unit only passes once, the sampler must directly decide for each unit whether or not it should be sampled. The correlated Poisson design is shown to have much better properties than traditional methods such as Poisson sampling and systematic sampling.
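To make the real time setting concrete, here is a minimal sketch of the baseline the abstract compares against: ordinary Poisson sampling, where each passing unit is accepted independently with its own inclusion probability. It is not the correlated Poisson design of Paper III. The function names `poisson_sample_stream` and `pi_estimate`, and the representation of a unit as a `(y, pi)` pair, are hypothetical choices made for this illustration.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Unit = Tuple[float, float]  # (study value y, inclusion probability pi); assumed layout


def poisson_sample_stream(units: Iterable[Unit], rng: random.Random) -> Iterator[Unit]:
    """Plain Poisson sampling in real time (baseline, NOT correlated Poisson).

    Each unit is seen exactly once and the decision is made on the spot:
    the unit is sampled with probability pi, independently of all others.
    """
    for y, pi in units:
        # rng.random() is uniform on [0, 1), so pi = 1 always samples
        # and pi = 0 never does.
        if rng.random() < pi:
            yield y, pi


def pi_estimate(sample: Iterable[Unit]) -> float:
    """pi-estimator of the population total: sum of y / pi over sampled units."""
    return sum(y / pi for y, pi in sample)


def run_once(units: List[Unit], seed: int) -> float:
    """Draw one Poisson sample from the stream and return the estimated total."""
    rng = random.Random(seed)
    return pi_estimate(poisson_sample_stream(units, rng))
```

Calling `run_once(units, seed)` for many seeds and averaging the results should come close to the true total of the `y` values, since the π-estimator is design-unbiased under Poisson sampling. The sample size is random under this design, which is one reason fixed-size and negatively correlated designs are of interest.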
|