1 |
Sequential Procedures for the "Selection" Problems in Discrete Simulation Optimization
Wenyu Wang (7491243) 17 October 2019
Simulation optimization problems are nonlinear optimization problems whose objective function can be evaluated through stochastic simulations. We study two significant discrete simulation optimization problems in this thesis: Ranking and Selection (R&S) and Factor Screening (FS). Both R&S and FS are "selection" problems defined on a finite set of candidate systems or factors. They differ mainly in their objectives: the R&S problem is to find the "best" system(s) among all alternatives, whereas the FS problem is to select the important factors that are critical to the stochastic system.

In this thesis, we develop efficient sequential procedures for these two problems. For the R&S problem, we propose fully sequential procedures for selecting the "best" systems with a guaranteed probability of correct selection (PCS). The main features of the proposed methods are: (1) a Bonferroni-free model: these procedures overcome the conservativeness of the Bonferroni correction and deliver the exact probabilistic guarantee without overshooting; (2) asymptotic optimality: these procedures achieve the lower bound on the average sample size asymptotically; (3) an indifference-zone-flexible formulation: these procedures bridge the gap between the indifference-zone formulation and the indifference-zone-free formulation, so that the indifference-zone parameter is not required but can be helpful if provided. We establish the validity and asymptotic efficiency of the proposed procedures and conduct numerical studies to investigate their performance under multiple configurations.

We also consider the multi-objective R&S (MOR&S) problem. To the best of our knowledge, the proposed procedures are the first frequentist approach for MOR&S. These procedures identify the Pareto front with a guaranteed PCS. In particular, these procedures are fully sequential and use test statistics built upon the Generalized Sequential Probability Ratio Test (GSPRT). The main features are: (1) an objective-dimension-free model: the performance of these procedures does not deteriorate as the number of objectives increases, and they achieve the same efficiency as the KN family of procedures for the single-objective ranking and selection problem; (2) an indifference-zone-flexible formulation: the new methods eliminate the need for an indifference-zone parameter while making use of indifference-zone information if provided. A numerical evaluation demonstrates the validity and efficiency of the new procedures.

For the FS problem, our objective is to identify important factors for simulation experiments with a controlled family-wise error rate. We assume a multi-objective first-order linear model in which the responses follow a multivariate normal distribution. We offer three fully sequential procedures: the Sum Intersection Procedure (SUMIP), the Sort Intersection Procedure (SORTIP), and the Mixed Intersection Procedure (MIP). SUMIP uses the Bonferroni correction to adjust for multiple comparisons; SORTIP uses the Holm procedure to overcome the conservativeness of the Bonferroni method; and MIP combines SUMIP and SORTIP to work efficiently in a parallel computing environment. Numerical studies are provided to demonstrate validity and efficiency, and a case study is presented.
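
The contrast between the Bonferroni correction and the Holm procedure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook fixed-sample versions that operate on a list of p-values; it is not the thesis's sequential SUMIP or SORTIP procedures, and the function names and example p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m (single-step Bonferroni)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger p-value is retained as well
    return reject


# Illustrative p-values (made up for this sketch).
p_values = [0.001, 0.013, 0.02, 0.04]
print(bonferroni_reject(p_values))  # [True, False, False, False]
print(holm_reject(p_values))        # [True, True, True, True]
```

Both rules control the family-wise error rate at level alpha, but Holm relaxes the threshold after each rejection, which is why it rejects more hypotheses on the same input.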
|
2 |
Bayesian Semiparametric Models for Heterogeneous Cross-platform Differential Gene Expression
Dhavala, Soma Sekhar 2010 December 1900
We are concerned with testing for differential expression and consider three different
aspects of such testing procedures. First, we develop an exact ANOVA type
model for discrete gene expression data, produced by technologies such as Massively
Parallel Signature Sequencing (MPSS), Serial Analysis of Gene Expression (SAGE)
or other next generation sequencing technologies. We adopt two Bayesian hierarchical
models—one parametric and the other semiparametric with a Dirichlet process
prior that has the ability to borrow strength across related signatures, where a signature
is a specific arrangement of the nucleotides. We utilize the discreteness of the
Dirichlet process prior to cluster signatures that exhibit similar differential expression
profiles. Tests for differential expression are carried out using non-parametric
approaches, while controlling the false discovery rate. Next, we consider ways to
combine expression data from different studies, possibly produced by different technologies
resulting in mixed type responses, such as Microarrays and MPSS. Depending
on the technology, the expression data can be continuous or discrete and can have different
technology dependent noise characteristics. Adding to the difficulty, genes can
have an arbitrary correlation structure both within and across studies. Performing
several hypothesis tests for differential expression could also lead to false discoveries.
We propose to address all the above challenges using a Hierarchical Dirichlet process
with a spike-and-slab base prior on the random effects, while smoothing splines model the unknown link functions that map different technology dependent manifestations
to latent processes upon which inference is based. Finally, we propose an algorithm
for controlling different error measures in Bayesian multiple testing under generic
loss functions, including the widely used uniform loss function. We do not make
any specific assumptions about the underlying probability model but require that
indicator variables for the individual hypotheses are available as a component of the
inference. Given this information, we recast multiple hypothesis testing as a combinatorial
optimization problem, in particular the 0-1 knapsack problem, which
can be solved efficiently using a variety of algorithms, both approximate and exact in
nature.
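
One way to picture the knapsack recasting is the sketch below. The abstract does not state the exact formulation, so everything here is an assumption made for illustration: each hypothesis has a posterior probability of being a true null (the mean of its indicator variable), rejecting it earns a gain, and the sum of null probabilities over the rejected set (the expected number of false discoveries) must stay within a budget. The function name, the `resolution` discretization, and the example numbers are hypothetical.

```python
import math


def knapsack_rejections(null_probs, gains, budget, resolution=1000):
    """Choose hypotheses to reject by exact 0-1 knapsack dynamic programming.

    Assumed formulation (not taken from the thesis):
      weight_i = posterior probability that hypothesis i is a true null
      gain_i   = reward for rejecting hypothesis i
      budget   = bound on the expected number of false discoveries
    Weights are rounded up to integer units so the bound is never exceeded.
    """
    weights = [math.ceil(q * resolution - 1e-9) for q in null_probs]
    cap = math.floor(budget * resolution + 1e-9)
    n = len(weights)

    # table[i][c] = best total gain using the first i hypotheses with capacity c
    table = [[0] * (cap + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, g = weights[i - 1], gains[i - 1]
        for c in range(cap + 1):
            table[i][c] = table[i - 1][c]
            if w <= c and table[i - 1][c - w] + g > table[i][c]:
                table[i][c] = table[i - 1][c - w] + g

    # Walk back through the table to recover the rejected set.
    chosen, c = [], cap
    for i in range(n, 0, -1):
        if table[i][c] != table[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    return sorted(chosen)


# Illustrative posterior null probabilities (made up for this sketch).
null_probs = [0.01, 0.02, 0.30, 0.05, 0.60]
gains = [1, 1, 1, 1, 1]  # equal gain per rejection
print(knapsack_rejections(null_probs, gains, budget=0.10))  # [0, 1, 3]
```

With equal gains, sorting by null probability and rejecting greedily would give the same answer; the dynamic program matters when gains differ across hypotheses.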
|
3 |
The performance of multiple hypothesis testing procedures in the presence of dependence
Clarke, Sandra Jane January 2010
Hypothesis testing is foundational to the discipline of statistics. Procedures exist which control for individual Type I error rates and more global or family-wise error rates for a series of hypothesis tests. However, the ability of scientists to produce very large data sets with increasing ease has led to a rapid rise in the number of statistical tests performed, often with small sample sizes. This is seen particularly in the area of biotechnology and the analysis of microarray data. This thesis considers this high-dimensional context with particular focus on the effects of dependence on existing multiple hypothesis testing procedures.

While dependence is often ignored, many existing techniques are used to deal with it, but these are typically highly conservative or require difficult estimation of large correlation matrices. This thesis demonstrates that, in this high-dimensional context, when the distribution of the test statistics is light-tailed, dependence is not as much of a concern as in the classical contexts. This is demonstrated using a moving average model. One important implication is that, when the light-tailed condition is satisfied, procedures designed for independent test statistics can be used confidently on dependent test statistics.

This is not the case, however, for heavy-tailed distributions, where we expect an asymptotic Poisson cluster process of false discoveries. In these cases, we estimate the parameters of this process along with the tail-weight from the observed exceedances and attempt to adjust procedures. We consider both conservative error rates such as the family-wise error rate and more popular methods such as the false discovery rate. We are able to demonstrate that, in the context of DNA microarrays, it is rare to find heavy-tailed distributions because most test statistics are averages.
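
As an illustration of the kind of false discovery rate procedure designed for independent test statistics, here is a minimal sketch of the standard Benjamini-Hochberg step-up rule. The abstract does not say which procedures the thesis evaluates, so this choice, the function name, and the example p-values are assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule: find the largest k such that the
    k-th smallest p-value is <= k * q / m, then reject the k smallest."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes its threshold
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


# Illustrative p-values (made up for this sketch).
p_values = [0.03, 0.2, 0.001, 0.035]
print(benjamini_hochberg(p_values))  # [True, False, True, True]
```

In this example 0.03 misses its own threshold of 0.025 but is still rejected, because the larger p-value 0.035 passes its threshold of 0.0375; that is what distinguishes a step-up rule from a step-down rule such as Holm's.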
|
4 |
Sdílení investičních nápadu: Rola štěstí a dovednosti / Sharing investment ideas: Role of luck and skill
Turlík, Tomáš January 2021
In an environment where a large group of analysts is willing to share investment ideas publicly, it is challenging to find those who have great skill and whose recommendations generate abnormal returns. We explore one such famous group, Value Investors Club, consisting of 1223 analysts between the years 2000 and 2019. We separate the analysts into multiple groups, each representing their inherent abilities. The commonly used method of single hypothesis testing cannot be used because we test many analysts at once, so multiple hypothesis testing methods need to be employed. Using these methods, we are able to detect the subgroup of analysts who have abnormal returns from the Fama-French 4-factor portfolio. However, different methods lead to different groups of analysts deemed to be skilled. An overall portfolio consisting of all analysts generates large abnormal returns, which diminish as the holding period increases. Furthermore, analyses from analysts estimated to be skilled are used to form portfolios. We find that there are methods that have significantly larger abnormal returns compared to the overall portfolio; however, the methods are not consistent at producing such portfolios.

Keywords: multiple hypothesis testing, luck and skill, investment ideas
|
5 |
Sensitivity to Distributional Assumptions in Estimation of the ODP Thresholding Function
Bunn, Wendy Jill 06 July 2007
Recent technological advances in fields like medicine and genomics have produced high-dimensional data sets and a challenge to correctly interpret experimental results. The Optimal Discovery Procedure (ODP) (Storey 2005) builds on the framework of Neyman-Pearson hypothesis testing to optimally test thousands of hypotheses simultaneously. The method relies on the assumption of normally distributed data; however, many applications of this method will violate this assumption. This thesis investigates the sensitivity of this method to detection of significant but nonnormal data. Overall, estimation of the ODP with the method described in this thesis is satisfactory, except when the nonnormal alternative distribution has high variance and expectation only one standard deviation away from the null distribution.
|
6 |
Towards a Human Genomic Coevolution Network
Savel, Daniel M. 04 June 2018
No description available.
|
7 |
A Monte Carlo Study of Several Alpha-Adjustment Procedures Used in Testing Multiple Hypotheses in Factorial Anova
An, Qian 20 July 2010
No description available.
|
8 |
Multiple Hypothesis Testing Approach to Pedestrian Inertial Navigation with Non-recursive Bayesian Map-matching
Koroglu, Muhammed Taha 22 September 2020
No description available.
|