31 |
On unequal probability sampling designs
Grafström, Anton, January 2010
The main objective in sampling is to select a sample from a population in order to estimate some unknown population parameter, usually a total or a mean of some interesting variable. When the units in the population do not have the same probability of being included in a sample, it is called unequal probability sampling. The inclusion probabilities are usually chosen to be proportional to some auxiliary variable that is known for all units in the population. When unequal probability sampling is applicable, it generally gives much better estimates than sampling with equal probabilities. This thesis consists of six papers that treat unequal probability sampling from a finite population of units. A random sample is selected according to some specified random mechanism called the sampling design. For unequal probability sampling there exist many different sampling designs. The choice of sampling design is important since it determines the properties of the estimator that is used. The main focus of this thesis is on evaluating and comparing different designs. Often it is preferable to select samples of a fixed size and hence the focus is on such designs. It is also important that a design has a simple and efficient implementation in order to be used in practice by statisticians. Some effort has been made to improve the implementation of some designs. In Paper II, two new implementations are presented for the Sampford design. In general a sampling design should also have a high level of randomization. A measure of the level of randomization is entropy. In Paper IV, eight designs are compared with respect to their entropy. A design called adjusted conditional Poisson has maximum entropy, but it is shown that several other designs are very close in terms of entropy. A specific situation called real time sampling is treated in Paper III, where a new design called correlated Poisson sampling is evaluated. In real time sampling the units pass the sampler one by one. 
Since each unit only passes once, the sampler must directly decide for each unit whether or not it should be sampled. The correlated Poisson design is shown to have much better properties than traditional methods such as Poisson sampling and systematic sampling.
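The real time setting described above can be sketched with Poisson sampling, one of the traditional methods the abstract names as a baseline. This is a minimal illustration and not the correlated Poisson design from Paper III: inclusion probabilities are taken proportional to an auxiliary size variable, each unit is accepted or rejected independently as it passes, and the population total is estimated with the Horvitz-Thompson estimator. The function names and the example data are assumptions made for illustration.

```python
import random


def inclusion_probabilities(sizes, n):
    """Inclusion probabilities proportional to size, with expected sample size n.

    Assumption: n * max(sizes) <= sum(sizes), so no probability exceeds 1.
    """
    total = sum(sizes)
    probs = [n * s / total for s in sizes]
    if max(probs) > 1:
        raise ValueError("some inclusion probability exceeds 1; reduce n")
    return probs


def poisson_sample_stream(values, probs, rng):
    """Single pass over the units: each one is decided on immediately.

    Unit i is included independently with probability probs[i], so the
    realized sample size is random (only its expectation is fixed).
    """
    sample = []
    for i, (y, p) in enumerate(zip(values, probs)):
        if rng.random() < p:
            sample.append((i, y))
    return sample


def horvitz_thompson_total(sample, probs):
    """Estimate the population total by weighting each sampled value by 1/p."""
    return sum(y / probs[i] for i, y in sample)


# Hypothetical population: auxiliary sizes and study-variable values.
sizes = [1, 2, 3, 4]
values = [3, 5, 8, 20]
probs = inclusion_probabilities(sizes, n=2)
rng = random.Random(0)
estimate = horvitz_thompson_total(poisson_sample_stream(values, probs, rng), probs)
```

Because the decisions are independent, the sample size varies from run to run, which is one reason fixed-size designs such as those compared in the thesis are often preferred.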
|
32 |
Efficient Sequential Sampling for Neural Network-based Surrogate Modeling
Pavankumar Channabasa Koratikere (15353788), 27 April 2023
<p>Gaussian Process Regression (GPR) is a widely used surrogate model in efficient global optimization (EGO) because it provides uncertainty estimates for its predictions. The cost of creating a GPR model for large data sets is high. Neural network (NN) models, on the other hand, scale better than GPR as the number of samples increases. Unfortunately, uncertainty estimates for NN predictions are not readily available. In this work, a scalable algorithm is developed for EGO using NN-based prediction and uncertainty (EGONN). Initially, two different NNs are created using two different data sets. The first NN models the output based on the input values in the first data set, while the second NN models the prediction error of the first NN using the second data set. The next infill point is added to the first data set based on criteria such as expected improvement or prediction uncertainty. EGONN is demonstrated on the optimization of the Forrester function and a constrained Branin function and is compared with EGO. In both cases, the convergence criterion is the maximum number of infill points. The algorithm is able to reach the optimum point within the given budget. EGONN is extended to handle constraints explicitly and is used for aerodynamic shape optimization of the RAE 2822 airfoil in transonic viscous flow at a free-stream Mach number of 0.734 and a Reynolds number of 6.5 million. The results obtained from EGONN are compared with the results from gradient-based optimization (GBO) using adjoints. The optimum shape obtained from EGONN is comparable to the shape obtained from GBO and is able to eliminate the shock. The drag coefficient is reduced from 200 drag counts to 114, which is close to the 110 drag counts obtained from GBO. EGONN is also extended to handle uncertainty quantification (uqEGONN) using prediction uncertainty as an infill method. 
The convergence criterion is based on the relative change of summary statistics, such as the mean and standard deviation, of an uncertain quantity. uqEGONN is tested on the Ishigami function with an initial sample size of 100, and the algorithm terminates after 70 infill points. The statistics obtained from uqEGONN (using only 170 function evaluations) are close to the values obtained from directly evaluating the function one million times. uqEGONN is then demonstrated by quantifying the uncertainty in airfoil performance due to geometric variations. The algorithm terminates within 100 computational fluid dynamics (CFD) analyses, and the statistics obtained from the algorithm are close to those obtained from 1000 direct CFD-based evaluations.</p>
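The infill step mentioned in the abstract can be illustrated with the standard expected improvement criterion for minimization, applied to the Forrester function in its commonly cited form. This is a hedged sketch and not the thesis implementation: `predict` is a hypothetical callable standing in for the two networks, returning a predicted mean and an uncertainty estimate, and treating the second network's error prediction as a standard deviation is an assumption made here.

```python
import math


def forrester(x):
    """Forrester test function on [0, 1], in its commonly cited form."""
    return (6 * x - 2) ** 2 * math.sin(12 * x - 4)


def expected_improvement(mu, sigma, f_best):
    """Expected improvement over f_best for minimization.

    mu and sigma are the surrogate's predicted mean and uncertainty at a point.
    With zero uncertainty the improvement is deterministic.
    """
    if sigma <= 0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (f_best - mu) * cdf + sigma * pdf


def next_infill(candidates, predict, f_best):
    """Pick the candidate with the largest expected improvement.

    predict(x) is a hypothetical surrogate returning (mu, sigma).
    """
    return max(candidates, key=lambda x: expected_improvement(*predict(x), f_best))


# Toy stand-in for a surrogate: exact mean, constant uncertainty (assumption).
def toy_predict(x):
    return forrester(x), 0.1


grid = [i / 100 for i in range(101)]
x_next = next_infill(grid, toy_predict, f_best=forrester(0.5))
```

In a full loop, the selected point would be evaluated, appended to the first data set, and both networks retrained until the infill budget is spent.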
|