Global ETD Search

1	Integrating remotely sensed data into forest resource inventories / The impact of model and variable selection on estimates of precision Mundhenk, Philip Henrich 26 May 2014 The past twenty years have shown that integrating airborne laser technology (Light Detection and Ranging; LiDAR) into the assessment of forest resources can help increase the precision of estimates. To make this possible, field data must be combined with LiDAR data. Various modeling techniques offer ways to describe this relationship statistically. While the choice of method usually has only a minor influence on point estimates, it yields different estimates of precision. This study examined the influence of different modeling techniques and variable selection on the precision of estimates. The focus is on LiDAR applications in forest inventories. The variable selection methods considered were the Akaike information criterion (AIC), the corrected Akaike information criterion (AICc), and the Bayesian (or Schwarz) information criterion. Variables were also selected on the basis of the condition number and the variance inflation factor. Further methods considered in this study include ridge regression, the least absolute shrinkage and selection operator (lasso), and the random forest algorithm. The stepwise variable selection methods were examined under both model-assisted and model-based inference. The remaining methods were examined only under model-assisted inference. An extensive simulation study determined how the type of modeling method and the type of variable selection affect the precision of estimates of population parameters (above-ground biomass in megagrams per hectare). 
Five different populations were used for this purpose. Three artificial populations were simulated; two others were based on forest inventory data collected in Canada and Norway. Canonical vine copulas were used to generate synthetic populations from these forest inventory data. Simple random samples were drawn repeatedly from the populations, and for each sample the mean and the precision of the mean estimate were estimated. While only one variance estimator was examined for the model-based approach, three different estimators were examined for the model-assisted approach. The results of the simulation study showed that naively applying stepwise variable selection methods generally leads to overestimation of precision in LiDAR-assisted forest inventories. The biased estimation of precision mattered mainly for small samples (n = 40 and n = 50). For larger samples (n = 400), the overestimation of precision was negligible. Ridge regression, the lasso, and the random forest algorithm performed well in terms of coverage rates and empirical standard errors. From these results it can be concluded that the latter methods should be considered in future LiDAR-assisted forest inventories. 634 Light detection and ranging (LiDAR) Generalized regression estimator Model uncertainty Design-based inference Model-based inference Forestry (PPN621305413)
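The repeated-sampling design summarized in entry 1 can be illustrated with a minimal sketch. It does not reproduce the model-assisted or model-based estimators of the thesis: it assumes a plain sample mean under simple random sampling without replacement, a normal-approximation interval (z = 1.96), and a made-up synthetic population. The function names are hypothetical. It only shows how an empirical coverage rate is obtained from repeated samples.

```python
import math
import random
import statistics


def srs_mean_and_se(population, n, rng):
    """Draw one simple random sample without replacement and return
    the sample mean and its estimated standard error."""
    sample = rng.sample(population, n)
    big_n = len(population)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)  # sample variance, n - 1 denominator
    # Finite population correction (1 - n/N) for sampling without replacement.
    se = math.sqrt((1 - n / big_n) * s2 / n)
    return mean, se


def coverage_rate(population, n, reps, rng, z=1.96):
    """Share of repeated samples whose interval mean +/- z * se
    contains the true population mean."""
    true_mean = statistics.fmean(population)
    hits = 0
    for _ in range(reps):
        mean, se = srs_mean_and_se(population, n, rng)
        if abs(mean - true_mean) <= z * se:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(2014)
    # Assumed synthetic population; the values are illustrative only.
    population = [rng.gauss(100.0, 30.0) for _ in range(2000)]
    for n in (40, 50, 400):
        print(n, coverage_rate(population, n, 1000, rng))
```

A coverage rate clearly below the nominal level would indicate that the standard errors are too small, which is the sense in which the abstract speaks of overestimated precision.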
2	Inférence basée sur le plan pour l'estimation de petits domaines / Design-based inference for small area estimation Randrianasolo, Toky 18 November 2013 The strong demand for results at a fine geographic level, particularly from national surveys, has highlighted the fragility of estimates for small areas. This thesis addresses the issue with specific methods based on the sampling design. These rely on constructing new weights for each statistical unit. The first method optimizes the reweighting of the survey subsample that falls within an area. The second constructs weights that depend on both the statistical units and the areas. It splits the sampling weights of the overall estimator while satisfying two constraints: 1/ the sum of the estimates over any partition into areas equals the overall estimate; 2/ the weighting system for a given area satisfies calibration properties on the auxiliary variables known for that area. The resulting split estimator behaves almost like the well-known BLUP (best linear unbiased predictor). The third method rewrites the BLUP, although it depends on a model, as a homogeneous linear estimator under a design-based approach. New modified BLUP estimators are obtained. Their precision, estimated by simulation with an application to real data, is quite close to that of the standard BLUP. The methods developed in this thesis are then applied to the estimation of local mobility indicators from the 2007-2008 French National Travel Survey (Enquête Nationale sur les Transports et les Déplacements). When an area's sample size is small, the estimates obtained with the first method lose precision, whereas precision remains satisfactory for the other two methods. 
Survey sampling Small area estimation Design-based inference Weights
3	Estimation multi-robuste efficace en présence de données influentes [Efficient multiply robust estimation in the presence of influential data] Michal, Victoire 08 1900 No description available. Robustness Multiply robust imputation Conditional bias Design-based inference Influential units Item nonresponse
4	On Methods for Real Time Sampling and Distributions in Sampling Meister, Kadri January 2004 This thesis is composed of six papers, all dealing with sampling from a finite population. We consider two topics: real time sampling and distributions in sampling. The main focus is on Papers A–C, which study a somewhat special sampling situation referred to as real time sampling. Here a finite population passes, or is passed by, the sampler. No list of the population units is available, and the sampler must decide whether or not to sample each unit upon meeting it. We focus on finding suitable sampling methods for this situation, and some new methods are proposed. Throughout, we try to avoid sampling units that are close to each other too often, i.e. we sample with negative dependencies. Here the correlations between the inclusion indicators, called sampling correlations, play an important role. The new methods are evaluated by means of a simulation study and asymptotic calculations. We study the new methods mainly in comparison with standard Bernoulli sampling, with the sample mean as the estimator of the population mean. Assuming a stationary population model with decreasing autocorrelations, we have found the form of the nearly optimal sampling correlations by asymptotic calculations, under some restrictions on the sampling correlations. The largest efficiency gains come from methods that give negatively correlated indicator variables, such that the correlation sum is small and the sampling correlations are equal for units up to lag m apart and zero afterwards. Since the proposed methods are based on sequences of dependent Bernoulli variables, an important part of the study is devoted to the problem of generating such sequences. The correlation structure of these sequences is also studied. 
The remainder of the thesis consists of three diverse papers, Papers D–F, which consider distributional properties in survey sampling. Paper D is concerned with unified statistical inference. Here both the model for the population and the sampling design are taken into account when considering the properties of an estimator. In this paper the framework of the sampling design as a multivariate distribution is used to outline two-phase sampling. In Paper E, we give probability functions for different sampling designs such as conditional Poisson, Sampford and Pareto designs. Methods to sample by using the probability function of a sampling design are discussed. Paper F focuses on the design-based distributional characteristics of the π-estimator and its variance estimator. We give formulae for the higher-order moments and cumulants of the π-estimator. Formulae for the design-based variance of the variance estimator, and for the covariance of the π-estimator and its variance estimator, are presented. Mathematical statistics Finite population sampling inferential issues real time sampling sequential sampling methods negative sampling correlations model-design-based inference
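The real time setting in entry 4, where each unit is accepted or rejected as it passes, and the π-estimator studied in Paper F can be illustrated with the standard Bernoulli sampling baseline. This is a minimal sketch under stated assumptions, not one of the new negatively correlated methods of the thesis: inclusion decisions are independent with a common probability p, so every inclusion probability π_k equals p, and the function names are hypothetical.

```python
import random


def bernoulli_sample(population, p, rng):
    """Visit units one at a time and include each one independently
    with probability p; no list of units is needed in advance."""
    return [y for y in population if rng.random() < p]


def pi_estimator_total(sample, p):
    """π-estimator of the population total: sum of y_k / π_k,
    with π_k = p under Bernoulli sampling."""
    return sum(y / p for y in sample)


def pi_variance_estimator(sample, p):
    """Variance estimator for the π-estimator when inclusions are
    independent: sum over the sample of (1 - p) / p**2 * y_k**2."""
    return sum((1 - p) / p**2 * y * y for y in sample)


if __name__ == "__main__":
    rng = random.Random(2004)
    population = list(range(1, 101))  # assumed toy population
    sample = bernoulli_sample(population, 0.3, rng)
    print(pi_estimator_total(sample, 0.3), pi_variance_estimator(sample, 0.3))
```

The variance formula relies on independent inclusion indicators; with the negatively correlated designs described in the abstract, the cross terms between units would no longer vanish.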
5	Finite population inference for population with a large number of zero-valued observations Nolet-Pigeon, Isabelle 08 1900 In business surveys, we are often interested in estimating population means or totals of variables which, by nature, often take a value of zero. In the presence of a large proportion of zero-valued observations, the customary estimators may be unstable. In this thesis, we study the properties of commonly used estimators for populations exhibiting a large proportion of zero-valued observations. In a model-based framework, we present predictors that are robust to the presence of influential units in this type of population. Finally, we perform simulation studies to evaluate the performance of several estimators and predictors in terms of bias and efficiency. Robustness Influential units Model-based inference Design-based inference Conditional bias
