Compositional data, in which each measurement is a vector whose components are percentages of a whole, are abundant in many scientific disciplines. Consequently, there is a strong need for valid statistical procedures for this type of data. In this work, the basic theory of the compositional sample space is presented, and the currently available methods for regression on compositional data are evaluated through simulation studies and a case study on data from industrial applications. The main focus is to formulate linear regression in a way that is compatible with compositional data sets and to compare this approach with the alternative of applying standard multivariate regression methods to raw compositional data. For several data sets, the difference between 'naive' multivariate linear regression and compositional linear regression is found to be negligible, while for others (in particular where the dependence on the covariates is not strictly linear) the compositional regression methods are shown to perform better.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-138463 |
Date | January 2017 |
Creators | Långström, Christoffer |
Publisher | Umeå universitet, Institutionen för matematik och matematisk statistik |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |