One of the main classes of spatial epidemiological studies is disease mapping, where the main aim is to describe the overall disease distribution on a map, for example, to highlight areas of elevated or lowered mortality or morbidity risk, or to identify important social or environmental risk factors after adjusting for the spatial distribution of the disease. This thesis focused on, and proposed solutions to, two of the most commonly seen obstacles in disease mapping applications: census boundaries that change over a long study period, and data that are aggregated to protect patients' confidentiality.
In disease mapping, when target diseases have low prevalence, the study usually covers a long time period to accumulate sufficient cases.
However, numerous irregular changes may occur during this period in the census regions on which population is reported, which complicates inference.
A new model was developed for the case when the exact location of the cases is available, consisting of a continuous random spatial surface and fixed effects for time and ages of individuals.
The process is modelled on a fine grid, approximating the underlying continuous risk surface with a Gaussian Markov random field, and Bayesian inference is performed using integrated nested Laplace approximations. The model was applied to clinical data on the location of residence at the time of diagnosis of new Lupus cases in Toronto, Canada, for the 40 years to 2007, with the aim of finding areas of abnormally high risk. Predicted risk surfaces and posterior exceedance probabilities are produced for Lupus and, for comparison, for Psoriatic Arthritis data from the same clinic.
Simulation studies are also carried out to better understand the performance of the proposed model as well as to compare with existing methods.
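As a rough illustration of the posterior exceedance probabilities mentioned above, the sketch below estimates, for each grid cell, the posterior probability that relative risk exceeds a threshold by taking the proportion of posterior draws above it. The function name, the array layout, the threshold of 1.5 and the toy draws are all assumptions made for this example; the thesis obtains its posterior via integrated nested Laplace approximations, so this sample-based version only shows what the quantity means, not how it was computed there.

```python
import numpy as np

def exceedance_probability(samples, threshold):
    """Estimate P(risk > threshold | data) for each grid cell.

    samples: array of shape (n_draws, n_cells) holding hypothetical
             posterior draws of relative risk (an assumed layout).
    Returns an array of length n_cells with values in [0, 1].
    """
    samples = np.asarray(samples, dtype=float)
    # The fraction of draws above the threshold approximates the
    # posterior exceedance probability in each cell.
    return (samples > threshold).mean(axis=0)

# Toy draws for three grid cells (illustrative numbers, not thesis results).
draws = np.array([
    [0.8, 1.6, 2.4],
    [1.1, 1.4, 2.1],
    [0.9, 1.7, 1.9],
    [1.0, 1.2, 2.6],
])
probs = exceedance_probability(draws, threshold=1.5)
print(probs)
```

Cells whose exceedance probability is close to 1 would be the candidates for "abnormally high risk"; the cut-off used to flag them is a modelling choice that this sketch does not fix.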
When the exact locations of the cases are not known, inference is complicated by the uncertainty of case locations due to data aggregation on census regions for confidentiality.
Conventional modelling relies on census boundaries that are unrelated to the biological process being modelled, and may result in stronger spatial dependence in the less populated regions which dominate the map. A new model was developed consisting of a continuous random spatial surface with aggregated responses and fixed covariate effects at the census-region level.
The continuous spatial surface was approximated by a Markov random field, which greatly reduces the computational complexity.
The process was modelled on a lattice of fine grid cells and Bayesian inference was performed using Markov chain Monte Carlo with data augmentation.
Simulation studies were carried out to assess the performance of the proposed model and to compare it with the conventional Besag-York-Mollié model as well as with a model that assumes exact locations are known. Receiver operating characteristic curves and Mean Integrated Squared Errors were used as measures of performance. For the application, surveillance data on the locations of residence at the time of diagnosis of syphilis cases in North Carolina, for the 9 years to 2007, are modelled with the aim of finding areas of abnormally high risk. Predicted risk surfaces and posterior exceedance probabilities are also produced, identifying Lumberton as a "syphilis hotspot".
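For readers unfamiliar with the Mean Integrated Squared Error, the sketch below shows one common way to approximate it on a grid: the squared difference between an estimated and a true risk surface is summed over cells and multiplied by the cell area, then averaged over simulation replicates. The abstract does not state the exact definition used in the thesis, so the Riemann-sum form, the function name, the equal cell area and the toy numbers are all assumptions.

```python
import numpy as np

def mean_integrated_squared_error(estimates, truth, cell_area):
    """Approximate the MISE of gridded surface estimates.

    estimates: array of shape (n_sims, n_cells), one estimated surface
               per simulated data set (assumed layout).
    truth:     array of shape (n_cells,), the true surface on the grid.
    cell_area: area of each grid cell, assumed equal for all cells.
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Integrated squared error per simulation, as a Riemann sum over cells.
    ise = ((estimates - truth) ** 2).sum(axis=1) * cell_area
    # Averaging over simulations gives the Monte Carlo estimate of the MISE.
    return ise.mean()

# Toy example: four cells of area 0.25 and two simulated estimates.
truth = np.array([1.0, 1.0, 2.0, 2.0])
estimates = np.array([
    [1.0, 1.0, 2.0, 2.0],  # matches the truth exactly
    [2.0, 1.0, 2.0, 0.0],  # wrong in the first and last cells
])
mise = mean_integrated_squared_error(estimates, truth, cell_area=0.25)
print(mise)
```

A lower value indicates that a model recovers the true surface more closely on average, which is how such a measure can be used to compare competing models in a simulation study.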
Identifier | oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/36290 |
Date | 16 August 2013 |
Creators | Li, Ye |
Contributors | Brown, Patrick, Stafford, Jamie |
Source Sets | University of Toronto |
Language | en_ca |
Detected Language | English |
Type | Thesis |