Sequential experimentation, especially for factorial treatment structures, becomes important when one or more of the following conditions exist: observations become available quickly, observations are costly to obtain, experimental results need to be evaluated quickly, adjustments in the experimental set-up may be desirable, or a quick screening of the importance of various factors is needed. The designs discussed in this study are suitable for these situations. Two approaches to sequential factorial experimentation are considered: one-run-at-a-time (ORAT) plans and one-block-at-a-time (OBAT) plans. For 2ⁿ experiments, saturated non-orthogonal 2ᵥⁿ fractions to be carried out as ORAT plans are reported. In such ORAT plans, only one factor level is changed between any two successive runs; such plans are useful and economical when it is costly to change more than one factor level at a time. The estimable effects and the alias structure after each run are provided. Formulas for the estimates of main effects and two-factor interactions are derived; these formulas can be used for assessing the significance of the estimates. For 3ᵐ and 2ⁿ3ᵐ experiments, Webb's (1965) saturated non-orthogonal expansible-contractible <0, 1, 2> - 2ᵥⁿ designs have been generalized, and new saturated non-orthogonal expansible-contractible 3ᵥᵐ and 2ⁿ3ᵥᵐ designs are reported. Based on these 2ᵥⁿ, 3ᵥᵐ and 2ⁿ3ᵥᵐ designs, we report new OBAT 2ᵥⁿ, 3ᵥᵐ and 2ⁿ3ᵥᵐ plans which eventually lead to the estimation of all main effects and all two-factor interactions. The OBAT 2ⁿ, 3ᵐ and 2ⁿ3ᵐ plans are constructed according to two strategies: Strategy I OBAT plans are carried out in blocks of very small sizes (2 and 3), and factor effects are estimated one at a time, whereas Strategy II OBAT plans involve larger block sizes, where the factors are assumed to fall into disjoint sets and each block investigates the effects of the factors of a particular set. Strategy I OBAT plans are appropriate when severe time trends in the response may be present. Formulas for estimates of main effects and two-factor interactions at the various stages of Strategy I OBAT 2ⁿ, 3ᵐ and 2ⁿ3ᵐ plans are reported. / Ph. D.
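The defining ORAT constraint above — exactly one factor level changed between successive runs — is the same property that characterizes a reflected binary Gray code, so a full-factorial run order with this property can be sketched as follows (a generic illustration, not one of the specific saturated fractions reported in the study):

```python
def gray_code_runs(n):
    """Reflected binary Gray code: 2**n runs in which exactly one
    factor level (bit) changes between any two successive runs."""
    runs = []
    for i in range(2 ** n):
        g = i ^ (i >> 1)  # i-th reflected Gray codeword
        # low (0) / high (1) level of each of the n factors
        runs.append(tuple((g >> j) & 1 for j in range(n)))
    return runs

# A 2^3 ORAT-style run order: each run differs from the previous in one factor
for run in gray_code_runs(3):
    print(run)
```

Any such ordering lets the experimenter reset only a single factor between runs, which is the economy the ORAT plans exploit.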
Shaparenko, Raymond Allen
A comparison is made between Wald's Sequential Probability Ratio (SPR) sampling plan, several generalized attributes acceptance sampling plans, and a curtailed single sampling plan. The plans are evaluated with a cost function that combines the Average Sample Number (ASN) and the variance of an estimator for the proportion of defective items in a lot. Using numerical calculation of this cost function, the curtailed single sampling plan and a generalized attributes acceptance sampling plan are shown to be better than Wald's SPR plan in a number of instances for representative operating characteristics. Strictly in terms of ASN, however, Wald's SPR plan is shown to be better. A computer program is devised which gives a good approximation of the variance of the estimator used for Wald's SPR plan. / M.S.
Multilingual literature in a Swedish classroom : A sequential analysis regarding code-switching in This Is How You Lose Her by Junot Diaz
Mohamad, Aso January 2020 (has links)
This essay explores sociolinguistic implications in the novel This Is How You Lose Her by Junot Diaz. I investigate Diaz's literary work in terms of the usage of code-switching, applying an adaptation of conversation analysis and a theoretical framework provided by Brown and Levinson (1999) which suggests that code-switching can be used to achieve interactional goals with other speakers. I also argue for widespread support of allowing multilingualism to be a more significant part of learning in the Swedish classroom. The conclusions drawn from this study show how politeness and code-switching can be applied in literary form, and that different switches are used in different speech acts depending on which face is being threatened. I have also presented examples of how teachers can use Diaz's novel to conduct a literary or linguistic project using multilingual literature to raise awareness of sociolinguistics and language variation, in alignment with The Swedish National Agency for Education (2011) directives.
01 May 1972
The application of a sequential test, the sequential probability ratio test, to the tolerances of noxious weed seeds is studied. It is proved that the sequential test can give a power curve similar to that of the current fixed-sample test if the test parameters are properly chosen. The average sample size required by a sequential test is, in general, smaller than that of the existing test; in some cases, however, it requires a relatively larger sample than the current test. As a solution to this problem a method of truncation is considered, and a kind of mixed procedure is suggested. This procedure gives an almost identical power curve to the standard one with great savings in sample size, and its sample size is always less than that of the current test procedure.
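The sequential probability ratio test referred to above can be sketched for a Bernoulli proportion such as the fraction of noxious seeds in a sample. The boundaries below use Wald's standard approximations; the hypothesized rates and error levels in the usage example are illustrative, not the seed tolerances studied here:

```python
import math

def sprt_decision(observations, p0, p1, alpha, beta):
    """Wald's SPRT for a Bernoulli proportion:
    H0: p = p0 vs H1: p = p1 (with p1 > p0).
    Returns 'accept H0', 'accept H1', or 'continue'."""
    # Wald's approximate decision boundaries on the log-likelihood ratio
    a = math.log(beta / (1 - alpha))        # lower boundary -> accept H0
    b = math.log((1 - beta) / alpha)        # upper boundary -> accept H1
    llr = 0.0                               # cumulative log-likelihood ratio
    for x in observations:                  # x = 1 if the seed is noxious
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr <= a:
            return "accept H0"
        if llr >= b:
            return "accept H1"
    return "continue"

# Illustrative use: tolerance 1% vs 5%, both error rates 5%
print(sprt_decision([0] * 200, 0.01, 0.05, 0.05, 0.05))  # long clean run
```

Sampling stops as soon as either boundary is crossed, which is the source of the average-sample-size savings; truncating the loop at a maximum sample size gives the mixed procedure the abstract describes.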
Kent, James Richard
Using a variance-stabilizing transformation of the non-central χ² distribution and Wald's sequential probability ratio test, procedures have been developed for group-wise sequential analysis of categorical data. These procedures enable (i) a simple hypothesis to be used for the alternative hypothesis instead of the composite hypothesis commonly used in goodness-of-fit tests, contingency tables, and Mood's non-parametric generalization of the one-way analysis of variance, (ii) calculation of a power function, and (iii) calculation of the greatest expected ASN's and the non-centrality parameter requiring this sample size, in addition to the ASN's when the null or alternative hypothesis is true. Application of these procedures to the three types of analysis given in (i) gives the right decisions with sample sizes near the calculated ASN's. The ASN's when the expected number of groups equals one compare favorably with those obtained by Jackson (1959) using Bhate's conjecture and those obtained empirically by Appleby (1960). In general, the sequential approach will require smaller sample sizes than fixed sampling if the non-centrality parameter is equal to or less than the group size and the group size is large enough to meet minimum expectation requirements. / M.S.
Sequential Inference and Nonparametric Goodness-of-Fit Tests for Certain Types of Skewed Distributions
Opperman, Logan J. 07 August 2019 (has links)
No description available.
Anderson, Michael P.
Doctor of Philosophy / Department of Statistics / Suzanne Dubnicka / DNA barcodes are short strands of nucleotide bases taken from the cytochrome c oxidase subunit 1 (COI) of the mitochondrial DNA (mtDNA). A single barcode may have the form C C G G C A T A G T A G G C A C T G . . . and typically ranges in length from 255 to around 700 nucleotide bases. Unlike nuclear DNA (nDNA), mtDNA remains largely unchanged as it is passed from mother to offspring. It has been proposed that these barcodes may be used as a method of differentiating between biological species (Hebert, Ratnasingham, and deWaard 2003). While this proposal is sharply debated among some taxonomists (Will and Rubinoff 2004), it has gained momentum and attention from biologists. One issue at the heart of the controversy is the use of genetic distance measures as a tool for species differentiation. Current methods of species classification utilize these distance measures that are heavily dependent on both evolutionary model assumptions as well as a clearly defined "gap" between intra- and interspecies variation (Meyer and Paulay 2005). We point out the limitations of such distance measures and propose a character-based method of species classification which utilizes an application of Bayes' rule to overcome these deficiencies. The proposed method is shown to provide accurate species-level classification. The proposed methods also provide answers to important questions not addressable with current methods.
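The character-based application of Bayes' rule can be illustrated with a toy position-wise classifier. The species names and barcodes below are hypothetical, and the position-independence (naive Bayes) assumption is a simplification that the dissertation's actual method need not share:

```python
import math
from collections import Counter

def train(barcodes_by_species, alphabet="ACGT", pseudo=1.0):
    """Per-species, per-position nucleotide frequencies with a
    pseudocount (Laplace smoothing) so unseen bases keep nonzero mass."""
    models = {}
    for species, seqs in barcodes_by_species.items():
        length = len(seqs[0])
        models[species] = [
            {b: (Counter(s[i] for s in seqs)[b] + pseudo)
                / (len(seqs) + pseudo * len(alphabet))
             for b in alphabet}
            for i in range(length)
        ]
    return models

def classify(barcode, models, priors):
    """Posterior-maximizing species under Bayes' rule, treating
    positions as (naively) independent characters."""
    best, best_lp = None, float("-inf")
    for species, pos_probs in models.items():
        lp = math.log(priors[species])          # log prior
        for i, base in enumerate(barcode):
            lp += math.log(pos_probs[i][base])  # log likelihood per position
        if lp > best_lp:
            best, best_lp = species, lp
    return best
```

Because each position is scored as a diagnostic character rather than folded into a single genetic distance, no evolutionary distance model or intra/interspecies "gap" is required, which is the motivation the abstract gives for a character-based approach.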
Ramdas, Aaditya Kumar
01 July 2015
This thesis makes fundamental computational and statistical advances in testing and estimation, making critical progress in theory and application of classical statistical methods like classification, regression and hypothesis testing, and understanding the relationships between them. Our work connects multiple fields in often counter-intuitive and surprising ways, leading to new theory, new algorithms, and new insights, and ultimately to a cross-fertilization of varied fields like optimization, statistics and machine learning. The first of three thrusts has to do with active learning, a form of sequential learning from feedback-driven queries that often has a provable statistical advantage over passive learning. We unify concepts from two seemingly different areas—active learning and stochastic first-order optimization. We use this unified view to develop new lower bounds for stochastic optimization using tools from active learning and new algorithms for active learning using ideas from optimization. We also study the effect of feature noise, or errors-in-variables, on the ability to actively learn. The second thrust deals with the development and analysis of new convex optimization algorithms for classification and regression problems. We provide geometrical and convex analytical insights into the role of the margin in margin-based classification, and develop new greedy primal-dual algorithms for non-linear classification. We also develop a unified proof for convergence rates of randomized algorithms for the ordinary least squares and ridge regression problems in a variety of settings, with the purpose of investigating which algorithm should be utilized in different settings. Lastly, we develop fast state-of-the-art numerically stable algorithms for an important univariate regression problem called trend filtering, with a wide variety of practical extensions.
The last thrust involves a series of practical and theoretical advances in nonparametric hypothesis testing. We show that a smoothed Wasserstein distance allows us to connect many vast families of univariate and multivariate two sample tests. We clearly demonstrate the decreasing power of the families of kernel-based and distance-based two-sample tests and independence tests with increasing dimensionality, challenging existing folklore that they work well in high dimensions. Surprisingly, we show that these tests are automatically adaptive to simple alternatives and achieve the same power as other direct tests for detecting mean differences. We discover a computation-statistics tradeoff, where computationally more expensive two-sample tests have a provable statistical advantage over cheaper tests. We also demonstrate the practical advantage of using Stein shrinkage for kernel independence testing at small sample sizes. Lastly, we develop a novel algorithmic scheme for performing sequential multivariate nonparametric hypothesis testing using the martingale law of the iterated logarithm to near-optimally control both type-1 and type-2 errors. One perspective connecting everything in this thesis involves the closely related and fundamental problems of linear regression and classification. Every contribution in this thesis, from active learning to optimization algorithms, to the role of the margin, to nonparametric testing, fits in this picture. An underlying theme that repeats itself in this thesis is the computational and/or statistical advantages of sequential schemes with feedback. This arises in our work through comparing active with passive learning, through iterative algorithms for solving linear systems instead of direct matrix inversions, and through comparing the power of sequential and batch hypothesis tests.
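As one concrete member of the kernel-based family of two-sample tests discussed above, a minimal sketch of the unbiased MMD statistic with a Gaussian kernel and a permutation p-value might look like this (the bandwidth and permutation count are arbitrary illustrative choices, not settings from the thesis):

```python
import numpy as np

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of squared Maximum Mean Discrepancy between
    samples X (n x d) and Y (m x d) under a Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    n, m = len(X), len(Y)
    # drop diagonal terms of the within-sample kernels for unbiasedness
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())

def permutation_pvalue(X, Y, n_perm=200, sigma=1.0, seed=0):
    """Permutation p-value for H0: X and Y share one distribution."""
    rng = np.random.default_rng(seed)
    obs = mmd2_unbiased(X, Y, sigma)
    Z = np.vstack([X, Y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(Z))
        Xp, Yp = Z[perm[:len(X)]], Z[perm[len(X):]]
        count += mmd2_unbiased(Xp, Yp, sigma) >= obs
    return (count + 1) / (n_perm + 1)
```

The permutation step is the computationally expensive part, which is where the computation-statistics tradeoff the abstract mentions becomes visible in practice.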
Ogilvie, William Fraser
The space of compile-time transformations and/or run-time options which can improve the performance of a given code is usually so large as to be virtually impossible to search in any practical time-frame. Thus, heuristics are leveraged which can suggest good but not necessarily best configurations. Unfortunately, since such heuristics are tightly coupled to processor architecture, performance is not portable; heuristics must be tuned, traditionally manually, for each device in turn. This is extremely laborious, and the result is often outdated heuristics and less effective optimisation. Ideally, to keep up with changes in hardware and run-time environments, a fast and automated method to generate heuristics is needed. Recent works have shown that machine learning can be used to produce mathematical models or rules in their place, which is automated but not necessarily fast. This thesis proposes the use of active machine learning, sequential analysis, and active feature acquisition to accelerate the training process in an automatic way, thereby tackling this timely and substantive issue. First, a demonstration of the efficiency of active learning over the previously standard supervised machine learning technique is presented in the form of an ensemble algorithm. This algorithm learns a model capable of predicting the best processing device in a heterogeneous system to use per workload size, per kernel. Active machine learning is a methodology which is sensitive to the cost of training; specifically, it is able to reduce the time taken to construct a model by predicting how much is expected to be learnt from each new training instance and then only choosing to learn from those most profitable examples. The exemplar heuristic is constructed on average 4x faster than a baseline approach, whilst maintaining comparable quality.
Next, a combination of active learning and sequential analysis is presented which reduces both the number of samples per training example as well as the number of training examples overall. This allows for the creation of models based on noisy information, sacrificing accuracy per training instance for speed, without having a significant effect on the quality of the final product. In particular, the runtime of high-performance compute kernels is predicted from code transformations one may want to apply using a heuristic which was generated up to 26x faster than with active learning alone. Finally, preliminary work demonstrates that an automated system can be created which optimises both the number of training examples as well as which features to select during training to further substantially accelerate learning, in cases where each feature value that is revealed comes at some cost.
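The provable advantage of feedback-driven querying over passive labelling can be seen in the textbook case of learning a one-dimensional threshold classifier: active queries chosen by binary search need only O(log n) labels where passive learning labels all n points. This is a generic illustration of the principle, not the heuristic-construction system described above:

```python
def active_learn_threshold(points, oracle):
    """Learn a 1D threshold classifier (label = x >= t) by binary
    search over the sorted pool: O(log n) oracle queries, not O(n)."""
    pts = sorted(points)
    lo, hi = 0, len(pts)       # first positively-labelled index is in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if oracle(pts[mid]):   # positive label: boundary is at or below mid
            hi = mid
        else:
            lo = mid + 1
    # pts[lo] is the smallest point labelled positive (if any)
    boundary = pts[lo] if lo < len(pts) else float("inf")
    return boundary, queries
```

For a pool of 100 points, 7 queries suffice, whereas a passive learner paying for 100 labels learns nothing extra; the same economics drive the 4x and 26x training speed-ups reported above.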
Chow, Edward Yik
Thesis (Sc.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1981. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING. / Includes bibliographical references. / by Edward Yik Chow. / Sc.D.