The aim of the study was to evaluate different types of logistic regression in order to find the optimal model for predicting the gender of hospitalized COVID-19 patients. The models were based on COVID-19 case series data from Pakistan and used a set of 18 explanatory variables, of which patient age and BMI were numerical and the rest were categorical, describing symptoms and previous health issues. Five models were compared: a logistic regression using all variables, a logistic regression with stepwise variable selection that used 4 explanatory variables, a logistic Ridge regression model, a logistic Lasso regression model and a logistic Elastic Net regression model. Based on several goodness-of-fit metrics and on predictive power as measured by the area under the ROC curve, the Elastic Net model that used only the Lasso penalty performed best and correctly predicted 82.5% of the test cases.
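
The relationship between the Ridge, Lasso and Elastic Net models named in the abstract can be made concrete with the penalized objective below. This is only a sketch in the widely used glmnet-style parameterization, which is an assumption here, since the abstract does not state the exact form used in the thesis. The symbols (n observations, binary response y_i, covariate vector x_i, intercept \beta_0, coefficients \beta, penalty strength \lambda and mixing parameter \alpha) are introduced purely for illustration.

```latex
% Assumed glmnet-style parameterization; not taken from the thesis itself.
\min_{\beta_0,\,\beta}\;
-\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr)\Bigr]
+ \lambda\Bigl[\alpha\,\lVert\beta\rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr]
```

Under this convention, \alpha = 0 gives the Ridge penalty and \alpha = 1 gives the Lasso penalty, so an Elastic Net that uses only the Lasso penalty would correspond to \alpha = 1.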
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kau-85642 |
Date | January 2021 |
Creators | Schwarz, Patrick |
Publisher | Karlstads universitet, Handelshögskolan (from 2013) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |