1 |
Contributions to the theory of unequal probability samplingLundquist, Anders January 2009 (has links)
This thesis consists of five papers related to the theory of unequal probability sampling from a finite population. Generally, it is assumed that we wish to make modelassisted inference, i.e. the inclusion probability for each unit in the population is prescribed before the sample is selected. The sample is then selected using some random mechanism, the sampling design. Mostly, the thesis is focused on three particular unequal probability sampling designs, the conditional Poisson (CP-) design, the Sampford design, and the Pareto design. They have different advantages and drawbacks: The CP design is a maximum entropy design but it is difficult to determine sampling parameters which yield prescribed inclusion probabilities, the Sampford design yields prescribed inclusion probabilities but may be hard to sample from, and the Pareto design makes sample selection very easy but it is very difficult to determine sampling parameters which yield prescribed inclusion probabilities. These three designs are compared probabilistically, and found to be close to each other under certain conditions. In particular the Sampford and Pareto designs are probabilistically close to each other. Some effort is devoted to analytically adjusting the CP and Pareto designs so that they yield inclusion probabilities close to the prescribed ones. The result of the adjustments are in general very good. Some iterative procedures are suggested to improve the results even further. Further, balanced unequal probability sampling is considered. In this kind of sampling, samples are given a positive probability of selection only if they satisfy some balancing conditions. The balancing conditions are given by information from auxiliary variables. Most of the attention is devoted to a slightly less general but practically important case. Also in this case the inclusion probabilities are prescribed in advance, making the choice of sampling parameters important. A complication which arises in the context of choosing sampling parameters is that certain probability distributions need to be calculated, and exact calculation turns out to be practically impossible, except for very small cases. It is proposed that Markov Chain Monte Carlo (MCMC) methods are used for obtaining approximations to the relevant probability distributions, and also for sample selection. In general, MCMC methods for sample selection does not occur very frequently in the sampling literature today, making it a fairly novel idea.
|
2 |
Comparaison empirique des méthodes bootstrap dans un contexte d'échantillonnage en population finie.Dabdoubi, Oussama 08 1900 (has links)
No description available.
|
3 |
On unequal probability sampling designsGrafström, Anton January 2010 (has links)
The main objective in sampling is to select a sample from a population in order to estimate some unknown population parameter, usually a total or a mean of some interesting variable. When the units in the population do not have the same probability of being included in a sample, it is called unequal probability sampling. The inclusion probabilities are usually chosen to be proportional to some auxiliary variable that is known for all units in the population. When unequal probability sampling is applicable, it generally gives much better estimates than sampling with equal probabilities. This thesis consists of six papers that treat unequal probability sampling from a finite population of units. A random sample is selected according to some specified random mechanism called the sampling design. For unequal probability sampling there exist many different sampling designs. The choice of sampling design is important since it determines the properties of the estimator that is used. The main focus of this thesis is on evaluating and comparing different designs. Often it is preferable to select samples of a fixed size and hence the focus is on such designs. It is also important that a design has a simple and efficient implementation in order to be used in practice by statisticians. Some effort has been made to improve the implementation of some designs. In Paper II, two new implementations are presented for the Sampford design. In general a sampling design should also have a high level of randomization. A measure of the level of randomization is entropy. In Paper IV, eight designs are compared with respect to their entropy. A design called adjusted conditional Poisson has maximum entropy, but it is shown that several other designs are very close in terms of entropy. A specific situation called real time sampling is treated in Paper III, where a new design called correlated Poisson sampling is evaluated. In real time sampling the units pass the sampler one by one. Since each unit only passes once, the sampler must directly decide for each unit whether or not it should be sampled. The correlated Poisson design is shown to have much better properties than traditional methods such as Poisson sampling and systematic sampling.
|
4 |
Estimation de synchrones de consommation électrique par sondage et prise en compte d'information auxiliaire / Estimate the mean electricity consumption curve by survey and take auxiliary information into accountLardin, Pauline 26 November 2012 (has links)
Dans cette thèse, nous nous intéressons à l'estimation de la synchrone de consommation électrique (courbe moyenne). Etant donné que les variables étudiées sont fonctionnelles et que les capacités de stockage sont limitées et les coûts de transmission élevés, nous nous sommes intéressés à des méthodes d'estimation par sondage, alternatives intéressantes aux techniques de compression du signal. Nous étendons au cadre fonctionnel des méthodes d'estimation qui prennent en compte l'information auxiliaire disponible afin d'améliorer la précision de l'estimateur de Horvitz-Thompson de la courbe moyenne de consommation électrique. La première méthode fait intervenir l'information auxiliaire au niveau de l'estimation, la courbe moyenne est estimée à l'aide d'un estimateur basé sur un modèle de régression fonctionnelle. La deuxième l'utilise au niveau du plan de sondage, nous utilisons un plan à probabilités inégales à forte entropie puis l'estimateur de Horvitz-Thompson fonctionnel. Une estimation de la fonction de covariance est donnée par l'extension au cadre fonctionnel de l'approximation de la covariance donnée par Hájek. Nous justifions de manière rigoureuse leur utilisation par une étude asymptotique. Pour chacune de ces méthodes, nous donnons, sous de faibles hypothèses sur les probabilités d'inclusion et sur la régularité des trajectoires, les propriétés de convergence de l'estimateur de la courbe moyenne ainsi que de sa fonction de covariance. Nous établissons également un théorème central limite fonctionnel. Afin de contrôler la qualité de nos estimateurs, nous comparons deux méthodes de construction de bande de confiance sur un jeu de données de courbes de charge réelles. La première repose sur la simulation de processus gaussiens. Une justification asymptotique de cette méthode sera donnée pour chacun des estimateurs proposés. La deuxième utilise des techniques de bootstrap qui ont été adaptées afin de tenir compte du caractère fonctionnel des données / In this thesis, we are interested in estimating the mean electricity consumption curve. Since the study variable is functional and storage capacities are limited or transmission cost are high survey sampling techniques are interesting alternatives to signal compression techniques. We extend, in this functional framework, estimation methods that take into account available auxiliary information and that can improve the accuracy of the Horvitz-Thompson estimator of the mean trajectory. The first approach uses the auxiliary information at the estimation stage, the mean curve is estimated using model-assisted estimators with functional linear regression models. The second method involves the auxiliary information at the sampling stage, considering πps (unequal probability) sampling designs and the functional Horvitz-Thompson estimator. Under conditions on the entropy of the sampling design the covariance function of the Horvitz-Thompson estimator can be estimated with the Hájek approximation extended to the functional framework. For each method, we show, under weak hypotheses on the sampling design and the regularity of the trajectories, some asymptotic properties of the estimator of the mean curve and of its covariance function. We also establish a functional central limit theorem.Next, we compare two methods that can be used to build confidence bands. The first one is based on simulations of Gaussian processes and is assessed rigorously. The second one uses bootstrap techniques in a finite population framework which have been adapted to take into account the functional nature of the data
|
Page generated in 0.0766 seconds