Global ETD Search

81	Sharpening the Boundaries of the Sequential Probability Ratio Test Krantz, Elizabeth 01 May 2012 In this thesis, we present an introduction to Wald’s Sequential Probability Ratio Test (SPRT) for binary outcomes. Previous researchers have investigated ways to modify the stopping boundaries so as to reduce the expected sample size for the test. In this research, we investigate ways to further improve these boundaries. For a given maximum allowable sample size, we develop a method intended to generate all possible sets of boundaries. We then find the one set of boundaries that minimizes the maximum expected sample size while still preserving the nominal error rates. Once boundaries satisfying these criteria have been created, we present the results of simulation studies conducted on these boundaries as a means for analyzing both the expected number of observations and the amount of variability in the sample size required to make a decision in the test. sequential testing hypothesis testing Statistics and Probability
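For orientation, the classical test that entry 81 starts from can be sketched as follows. This is a minimal Python sketch of Wald's SPRT for Bernoulli outcomes using Wald's approximate boundaries log((1 - beta)/alpha) and log(beta/(1 - alpha)). The function name `wald_sprt` and its parameters are illustrative assumptions, and the sharpened boundaries developed in the thesis are not reproduced here.

```python
import math


def wald_sprt(observations, p0, p1, alpha=0.05, beta=0.05):
    """Classical Wald SPRT for 0/1 data: H0: p = p0 versus H1: p = p1 (p1 > p0).

    Returns (decision, n), where n is the number of observations consumed.
    The boundaries are Wald's approximations, not the thesis's sharpened ones.
    """
    upper = math.log((1 - beta) / alpha)   # crossing above: accept H1
    lower = math.log(beta / (1 - alpha))   # crossing below: accept H0
    step_success = math.log(p1 / p0)             # log-likelihood ratio for x = 1
    step_failure = math.log((1 - p1) / (1 - p0))  # log-likelihood ratio for x = 0

    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        llr += step_success if x else step_failure
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue", n  # data ran out before either boundary was crossed


if __name__ == "__main__":
    print(wald_sprt([1, 1, 1, 1, 1], p0=0.2, p1=0.5))
```

The sample size at which the test stops is random, which is why the abstract speaks of the expected number of observations and its variability rather than a fixed n.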
82	COMPARISON OF TWO SAMPLES BY A NONPARAMETRIC LIKELIHOOD-RATIO TEST Barton, William H. 01 January 2010 In this dissertation we present a novel computational method, as well as its software implementation, to compare two samples by a nonparametric likelihood-ratio test. The basis of the comparison is a mean-type hypothesis. The software is written in the R language [4]. The two samples are assumed to be independent. Their distributions, which are assumed to be unknown, may be discrete or continuous. The samples may be uncensored, right-censored, left-censored, or doubly-censored. Two software programs are offered. The first program covers the case of a single mean-type hypothesis. The second program covers the case of multiple mean-type hypotheses. For the first program, an approximate p-value for the single hypothesis is calculated, based on the premise that -2 log(likelihood ratio) is asymptotically distributed as χ²(1). For the second program, an approximate p-value for the p hypotheses is calculated, based on the premise that -2 log(likelihood ratio) is asymptotically distributed as χ²(p). In addition we present a proof relating to use of a hazard-type hypothesis as the basis of comparison. We show that -2 log(likelihood ratio) is asymptotically distributed as χ²(1) for this hypothesis. The R programs we have developed can be downloaded free of charge from the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org, package name emplik2. The R language itself is also available free of charge at the same site. Likelihood Ratio Nonparametric Hypothesis Hazard Statistics and Probability
83	On concomitants of order statistics Wang, Ke, January 2008 Thesis (Ph. D.)--Ohio State University, 2008. / Title from first page of PDF file. Includes bibliographical references (p. 115-120).
84	Inference on correlation from incomplete bivariate samples He, Qinying. January 2007 Thesis (Ph. D.)--Ohio State University, 2007. / Title from first page of PDF file. Includes bibliographical references (p. 133-137).
85	Statistical Models for Predicting College Success Nunez, Yelen 13 November 2013 Colleges base their admission decisions on a number of factors to determine which applicants have the potential to succeed. This study utilized data for students who graduated from Florida International University between 2006 and 2012. Two models were developed (one using SAT as the principal explanatory variable and the other using ACT as the principal explanatory variable) to predict college success, measured using the student’s college grade point average at graduation. Some of the other factors that were used to make these predictions were high school performance, socioeconomic status, major, gender, and ethnicity. The model using ACT had a higher R^2 but the model using SAT had a lower mean square error. African Americans had a significantly lower college grade point average than graduates of other ethnicities. Females had a significantly higher college grade point average than males. Statistical Models academic success Education Statistics and Probability
86	Penalized methods in genome-wide association studies Liu, Jin 01 July 2011 Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as the LASSO do not take into account the possible linkage disequilibrium between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates linkage disequilibrium (LD) information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. A coordinate descent algorithm is derived to implement the proposed method. This algorithm is efficient and stable in dealing with a large number of SNPs. A multi-split method is used to calculate the p-values of the selected SNPs for assessing their significance. We refer to the proposed penalty function as the smoothed MCP (SMCP) and the proposed approach as the SMCP method. Performance of the proposed SMCP method and its comparison with a LASSO approach are evaluated through simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using data from a GWAS on rheumatoid arthritis. Based on the idea of SMCP, we propose a new penalized method for group variable selection in GWAS with respect to the correlation between adjacent groups. The proposed method uses the group LASSO for encouraging group sparsity and a quadratic difference for adjacent group smoothing. We call it smoothed group LASSO, or SGL for short. Canonical correlations between two adjacent groups of SNPs are used as the weights in the quadratic difference penalty. Principal components are used to reduce dimensionality locally within groups.
We derive a group coordinate descent algorithm for computing the solution path of the SGL. Simulation studies are used to evaluate the finite sample performance of the SGL and group LASSO. We also demonstrate its applicability on rheumatoid arthritis data. GWAS Linkage disequilibrium Penalized regression Statistics and Probability
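As background for entry 86, the standard minimax concave penalty and the closed-form update it gives in a single coordinate-descent step can be sketched in Python as below. This is only the plain MCP for a standardized predictor. The LD smoothing term of SMCP and the group penalty of SGL are not specified in the abstract and are not reproduced. The function names and the parameters `lam` and `gamma` are illustrative assumptions.

```python
def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty: quadratic near zero, constant beyond gamma * lam."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def soft_threshold(z, lam):
    """LASSO soft-thresholding operator."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_update(z, lam, gamma):
    """Minimizer of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma) over b.

    Assumes a standardized predictor and gamma > 1, so the problem is convex
    in the single coordinate b.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for a unique coordinate update")
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large effects are left unshrunk, unlike the LASSO


if __name__ == "__main__":
    for z in (0.5, 2.0, 5.0):
        print(z, mcp_update(z, lam=1.0, gamma=3.0))
```

The update sets small coefficients to zero like the LASSO but leaves large ones unshrunk, which is the property that makes MCP attractive for marker selection.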
87	Robust Computational Tools for Multiple Testing with Genetic Association Studies Welbourn, William L., Jr. 01 May 2012 Resolving the interplay of the genetic components of a complex disease is a challenging endeavor. Over the past several years, genome-wide association studies (GWAS) have emerged as a popular approach to locating common genetic variation within the human genome associated with disease risk. Assessing genetic-phenotype associations across hundreds of thousands of genetic markers with the GWAS approach introduces a potentially high number of false positive signals and requires statistical correction for multiple hypothesis testing. Permutation tests are considered the gold standard for multiple testing correction in GWAS, because they simultaneously provide unbiased Type I error control and high power. However, they demand heavy computational effort, especially with the large-scale data sets of modern GWAS. In recent years, the computational problem has been circumvented by using approximations to permutation tests, but several studies have posed sampling conditions under which these approximations appear to be biased. We have developed an optimized parallel algorithm for the permutation testing approach to multiple testing correction in GWAS, whose implementation largely removes the computational problem. When applied to GWAS data, our algorithm yields rapid, precise, and powerful multiplicity adjustment, many orders of magnitude faster than existing GWAS statistical software. Although GWAS have identified many potentially important genetic associations which will advance our understanding of human disease, the common variants with modest effects on disease risk discovered through this approach likely account for a small proportion of the heritability in complex disease.
On the other hand, interactions between genetic and environmental factors could account for a substantial proportion of the heritability in a complex disease and are overlooked within the GWAS approach. We have developed an efficient and easily implemented tool for genetic association studies, whose aim is identifying genes involved in a gene-environment interaction. Our approach is amenable to a wide range of association studies and assorted densities in sampled genetic marker panels, and incorporates resampling for multiple testing correction. Within the context of a case-control study design we demonstrate by way of simulation that our proposed method offers greater statistical power to detect gene-environment interaction, when compared to several competing approaches to assess this type of interaction. computation multiple testing genetic association Statistics and Probability
88	A Comparison of Estimation Procedures for the Beta Distribution Yan, Huey 01 May 1991 The beta distribution may be used as a stochastic model for continuous proportions in many situations in applied statistics. This thesis was concerned with estimation of the parameters of the beta distribution in three different situations. Three different estimation procedures (the method of moments, maximum likelihood, and a hybrid of these two methods, which we call the one-step improvement) were compared by computer simulation, for beta data and beta data contaminated by zeros and ones. We also evaluated maximum likelihood estimation in the context of censored data, and Newton's method as a numerical procedure for solving the likelihood equations for censored beta data. Comparison estimation beta distribution Statistics and Probability
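Of the three procedures named in entry 88, the method of moments has a closed form and can be sketched briefly. The Python sketch below matches the sample mean and sample variance to the beta mean and variance. The function name `beta_method_of_moments` is an illustrative assumption, and the thesis's one-step improvement and censored-data likelihood are not reproduced.

```python
from statistics import mean, variance


def beta_method_of_moments(data):
    """Method-of-moments estimates (alpha, beta) for Beta(alpha, beta) data.

    Solves mean = a / (a + b) and var = a * b / ((a + b)**2 * (a + b + 1))
    for a and b, using the sample mean and the (n - 1) sample variance.
    """
    m = mean(data)
    v = variance(data)
    if not 0.0 < m < 1.0 or v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("sample moments are not compatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0  # equals the estimate of alpha + beta
    return m * common, (1.0 - m) * common


if __name__ == "__main__":
    print(beta_method_of_moments([0.2, 0.4, 0.6, 0.8]))
```

Because these estimates need only two sample moments, they are a natural starting value for an iterative likelihood-based procedure such as Newton's method.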
89	Statistical Analysis of Linear Analog Circuits Using Gaussian Message Passing in Factor Graphs Phadnis, Miti 01 December 2009 This thesis introduces a novel application of factor graphs to the domain of analog circuits. It proposes a technique of leveraging factor graphs for performing statistical yield analysis of analog circuits that is much faster than the standard Monte Carlo/Simulation Program With Integrated Circuit Emphasis (SPICE) simulation techniques. We have designed a tool chain to model an analog circuit and its corresponding factor graph and then use a Gaussian message passing approach along the edges of the graph for yield calculation. The tool is also capable of estimating unknown parameters of the circuit given known output statistics through backward message propagation in the factor graph. The tool builds upon the concept of domain-specific modeling leveraged for modeling and interpreting different kinds of analog circuits. The Generic Modeling Environment (GME) is used to design the modeling environment for analog circuits. It is a configurable tool set that supports creation of domain-specific design environments for different applications. This research has developed a generalized methodology that could be applied towards design automation of different kinds of analog circuits, both linear and nonlinear. The tool has been successfully used to model linear amplifier circuits and a nonlinear Metal Oxide Semiconductor Field Effect Transistor (MOSFET) circuit. The results obtained by Monte Carlo simulations performed on these circuits are used as a reference in the project to compare against the tool's results. The tool is tested for its efficiency in terms of time and accuracy against the standard results. Factor Graphs GME Statistical Analysis Statistics and Probability
90	Implementing the Use of Personal Activity Data in an Introductory Statistics Course Christensen, Lacy 01 August 2018 Integrating real data into a classroom is one of the recommendations in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) college report, which lays out guidelines for an introductory statistics course (Committee, GAISE College Report ASA Revision, 2016). In order to assess the effect of using real data in a classroom, the students received physical activity trackers to wear during an undergraduate introductory statistics course taught in the summer. This tracker, a Fitbit, enabled students to monitor and record their steps, calories, and active time throughout the class. Collecting personal activity data (PAD) creates a large database which students can then analyze and use to build statistical thinking. Since the students are intimately familiar with the data they gathered, they could focus on the patterns they saw in the data based on their own personal experiences. With this data, the students completed tasks that asked them to analyze their physical activity using methods including summary statistics and bivariate analysis. These projects encouraged students to think about problems that arise from data collection and analysis in real-life situations. We saw that using PAD helped the tasks become more personal, increased interest and engagement, and reinforced the material taught in class. data statistics education technology Statistics and Probability
