In this dissertation, we develop tools from non-parametric and semi-parametric statistics to perform estimation and inference. In the first chapter, we propose a new method called Non-Parametric Outlier Identification and Smoothing (NOIS), which robustly smooths stock prices, automatically detects outliers, and constructs pointwise confidence bands around the resulting curves. In real-world examples of high-frequency data, NOIS successfully detects erroneous prices as outliers and uncovers borderline cases for further study. NOIS can also highlight notable features and reveal new insights in inter-day chart patterns. In the second chapter, we focus on a method for non-parametric inference called empirical likelihood (EL). For a fixed parameter vector, computing EL is a convex optimization problem that is easily solved with Lagrange multipliers. In a composite empirical likelihood (CEL) test, where certain components of the parameter vector are free to vary, the optimization problem becomes non-convex and much more difficult. We propose a new algorithm for the CEL problem named the BI-Linear Algorithm for Composite EmPirical Likelihood (BICEP). We extend the BICEP framework by introducing a new method called Robust Empirical Likelihood (REL), which detects outliers and greatly improves inference relative to non-robust EL. The REL method is combined with CEL by the TRI-Linear Algorithm for Composite EmPirical Likelihood (TRICEP). We demonstrate the efficacy of the proposed methods on simulated and real-world datasets. In the final chapter, we present a novel semi-parametric method for variable selection with interesting biological applications. In bioinformatics datasets, the experimental units often have structured relationships that are non-linear and hierarchical. For example, in microbiome data the individual taxonomic units are connected to each other through a phylogenetic tree.
Conventional techniques for selecting relevant taxa either do not account for the pairwise dependencies between taxa or assume linear relationships. In this work, we propose a new framework for variable selection called Semi-Parametric Affinity Based Selection (SPAS), which has the flexibility to utilize structured and non-parametric relationships between variables. In synthetic data experiments, SPAS outperforms existing methods, and on real-world microbiome datasets it selects taxa according to their phylogenetic similarities. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2018. / April 19, 2018. / Bioinformatics, Empirical likelihood, Finance, Non-parametric, Outlier detection, Variable selection / Includes bibliographical references. / Yiyuan She, Professor Directing Dissertation; Giray Ökten, University Representative; Eric Chicken, Committee Member; Xufeng Niu, Committee Member; Minjing Tao, Committee Member.
Identifier | oai:union.ndltd.org:fsu.edu/oai:fsu.digital.flvc.org:fsu_653516 |
Contributors | Tran, Hoang Trong (author), She, Yiyuan (professor directing dissertation), Ökten, Giray (university representative), Chicken, Eric, 1963- (committee member), Niu, Xufeng, 1954- (committee member), Tao, Minjing (committee member), Florida State University (degree granting institution), College of Arts and Sciences (degree granting college), Department of Statistics (degree granting department) |
Publisher | Florida State University |
Source Sets | Florida State University |
Language | English |
Detected Language | English |
Type | Text, doctoral thesis |
Format | 1 online resource (125 pages), computer, application/pdf |