This study uncovered the pattern and spatial relationships between socio-economic factors and aggregated COVID-19 rates in Ottawa, Canada, from July 2020 to December 2021 at the neighbourhood scale. Both top-down and bottom-up data mining approaches were used to predict COVID-19 rates. The top-down approach employed ordinary least squares (OLS) regression, a spatial error model (SEM), geographically weighted regression (GWR) and multi-scale geographically weighted regression (MGWR). Model intercomparison was also undertaken. The pattern of COVID-19 in Ottawa exhibited a significant, moderately positive spatial structure among neighbourhoods (Moran's I = 0.39; p = 0.0001). Local Moran's I analysis identified areas of low and high COVID-19 clustering, interspersed with cold spots. The OLS model used determinants based on a literature review. Determinants were tested for normality using the Shapiro-Wilk test, and those that failed the test were transformed toward normality. Next, an OLS-based backward stepwise approach was used to select the optimal set of determinants, choosing the model with the lowest Akaike Information Criterion (AIC). The percentage of people who take public transit to work, percentage of people with no high school diploma, percentage of people over 65 years old, and percentage of people with a bachelor's degree or above comprised the final set of determinants. An SEM was created to account for spatial autocorrelation in the OLS model's residuals and yielded an adjusted R² = 0.63. Based on the SEM, a one-unit increase in the square root of the percentage of people with a bachelor's degree or above was associated with a 3.2% increase in COVID-19 rates, while the same unit increase in the square root of the percentage of people with no high school diploma was associated with a 10.6% increase in COVID-19 rates.
Conversely, a one percent increase in the percentage of people aged 65 and older was linked to a 34.6% decrease in COVID-19 rates. To examine local variations in the relationships between the determinants and COVID-19, and to improve upon the overall explained variance of the SEM, an MGWR with a bisquare kernel and an adaptive bandwidth was used. The residuals of the MGWR model exhibited no significant spatial autocorrelation (Moran's I = -0.04; p = 0.62) and were approximately normal (W = 0.98; p > 0.25). The MGWR model yielded an adjusted R² = 0.75. Taking a bottom-up data mining approach, an optimized Random Forest model identified a very different set of determinants as important compared to the top-down regression approaches and accounted for 47.34% of the COVID-19 variance.
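
The global Moran's I statistic reported above can be illustrated with a minimal sketch. It assumes a binary contiguity weights matrix and a permutation-based pseudo p-value. The abstract does not state the weighting scheme or inference method used in the thesis, so the function names, the toy weights, and the toy rates below are hypothetical and serve only to show how the statistic is formed.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()   # deviations from the mean
    n = x.size         # number of areal units
    s0 = w.sum()       # sum of all spatial weights
    return (n / s0) * (z @ w @ z) / (z @ z)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    # Shuffle the values across locations to build a reference distribution.
    sims = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(n_perm)]
    )
    p = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p


# Hypothetical toy data: four neighbourhoods in a row, each adjacent to its
# immediate neighbours (assumed binary contiguity, not the thesis data).
toy_weights = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
toy_rates = [1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    i_stat, p_val = permutation_p_value(toy_rates, toy_weights)
    print(f"Moran's I = {i_stat:.3f}, pseudo p = {p_val:.3f}")
```

Values near zero indicate no spatial structure, positive values indicate that similar rates cluster among neighbours, and negative values indicate alternating high and low rates. The same function could be applied to model residuals to check for remaining spatial autocorrelation, as the abstract describes for the MGWR.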
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/45740 |
Date | 15 December 2023 |
Creators | Laadhar, Brahim |
Contributors | Sawada, Michael C. |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |