In this study, a panel data set of flights made by employees at the Royal Institute of Technology (KTH) in Sweden is analyzed using generalized linear modeling approaches, with the aim of creating a model with high predictive capability for the quarterly CO2 emissions and the number of flights in a year not included in the model estimation. A zero-inflated Gamma regression model is fitted to the CO2 emission variable, and a zero-inflated negative binomial regression model is used for the number of flights. To build the models, cross-validation is performed with the observations from 2018 as the training set and the observations from the following year, 2019, as the test set. One at a time, the variable that most improves the prediction of the test set data (whether included in the count model or the zero-inflation model) is selected, until an additional variable turns out to be insignificant at the 5% significance level in the estimated model. In addition to the variables in the data, three lags of the dependent variables (CO2 emission and flights) were included, as well as transformed versions of the continuous variables and random intercepts for the categorical variables indicating quarter and KTH department. Neither model selected through the cross-validation process predicted the values for the following year particularly well, but a number of variables were found to have a statistically significant association with the respective dependent variable.
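The stepwise selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's code: the function `forward_select`, the callback `fit_and_score`, and the variable names `lag1`, `x1`, and `x2` are all hypothetical. The callback is assumed to fit the zero-inflated model on the training year with the given terms and to return the test-year prediction error together with the p-value of the most recently added term. The numbers in the toy callback are made up purely to exercise the loop.

```python
from typing import Callable, Iterable, List, Tuple

# A candidate term is (variable name, model part), where the part is
# "count" or "zero" (the zero-inflation component).
Candidate = Tuple[str, str]


def forward_select(
    candidates: Iterable[Candidate],
    fit_and_score: Callable[[List[Candidate]], Tuple[float, float]],
    alpha: float = 0.05,
) -> List[Candidate]:
    """Greedy forward selection driven by test-set prediction error.

    fit_and_score(terms) is a hypothetical callback returning
    (test_error, p_value_of_last_term) for a model with those terms.
    """
    selected: List[Candidate] = []
    remaining = list(candidates)
    while remaining:
        # Try each remaining term on top of the current model.
        scored = [(fit_and_score(selected + [c]), c) for c in remaining]
        # Pick the addition with the lowest test-set error.
        (error, p_value), best = min(scored, key=lambda item: item[0][0])
        # Stop once the best addition is not significant at level alpha.
        if p_value >= alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy stand-in for model fitting; all values are invented for illustration.
GAIN = {("lag1", "count"): 3.0, ("x1", "zero"): 1.0, ("x2", "count"): 0.1}
P_VALUE = {("lag1", "count"): 0.001, ("x1", "zero"): 0.03, ("x2", "count"): 0.40}


def toy_fit_and_score(terms: List[Candidate]) -> Tuple[float, float]:
    error = 10.0 - sum(GAIN[t] for t in terms)
    return error, P_VALUE[terms[-1]]


chosen = forward_select(GAIN.keys(), toy_fit_and_score)
print(chosen)
```

In practice the callback would wrap an actual zero-inflated Gamma or zero-inflated negative binomial fit; keeping it as a parameter leaves the selection loop independent of the modeling library.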
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-532067 |
Date | January 2024 |
Creators | Artman, Arvid |
Publisher | Uppsala universitet, Statistiska institutionen |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |