This paper considers four approaches to handling ordinal predictors in linear regression and evaluates how they compare with respect to predictive accuracy. The two most common treatments, namely dummy coding and classic linear regression on assigned level scores, are compared with two improved methods: penalized smoothed coefficients and a generalized additive model with cubic splines. A simulation study is conducted to assess all four on the basis of predictive performance. Our results show that the dummy-based methods outperform the numeric methods at small sample sizes. As sample size increases, however, the differences between the methods diminish. Tendencies toward overfitting are identified among the dummy-based methods. We conclude that the choice of method ought to be not only context driven but also made in light of all characteristics.
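The two typical treatments named in the abstract can be illustrated with a minimal sketch. It is not the thesis's simulation design: the toy data, the function names (`fit_dummy`, `fit_scores`, `sse`), and the choice of integer scores 1 to 4 are all assumptions made for illustration. It relies on two standard facts: with an intercept and a single dummy-coded predictor, the least-squares fitted value for each level is that level's mean, and regression on level scores is ordinary simple linear regression.

```python
from statistics import mean

def fit_dummy(levels, y):
    """Dummy coding with an intercept: the fitted value for a level is its mean response."""
    groups = {}
    for lev, val in zip(levels, y):
        groups.setdefault(lev, []).append(val)
    return {lev: mean(vals) for lev, vals in groups.items()}

def fit_scores(levels, y):
    """Simple linear regression of y on the integer level scores (closed form)."""
    x_bar, y_bar = mean(levels), mean(y)
    sxx = sum((x - x_bar) ** 2 for x in levels)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(levels, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def sse(pred, y):
    """Sum of squared errors between predictions and observations."""
    return sum((p - v) ** 2 for p, v in zip(pred, y))

# Hypothetical toy data (not from the thesis): four ordered levels,
# with a response that rises steeply and then flattens.
levels = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.0, 1.2, 3.0, 3.2, 3.4, 3.6, 3.5, 3.7]

level_means = fit_dummy(levels, y)
intercept, slope = fit_scores(levels, y)

pred_dummy = [level_means[lev] for lev in levels]
pred_scores = [intercept + slope * lev for lev in levels]

sse_dummy = sse(pred_dummy, y)
sse_scores = sse(pred_scores, y)
print("in-sample SSE, dummy coding:", sse_dummy)
print("in-sample SSE, level scores:", sse_scores)
```

On the training data, dummy coding can never fit worse than the score model, because a straight line in the scores is a special case of one free mean per level. That in-sample advantage says nothing about out-of-sample accuracy, which is what the simulation study assesses.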
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-273958 |
Date | January 2016 |
Creators | Modin Larsson, Jim |
Publisher | Uppsala universitet, Statistiska institutionen |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |