Bayesian approaches to prediction and the assessment of predictive uncertainty in generalized linear models are often based on averaging predictions over different models, and this requires methods for accounting for model uncertainty. In this thesis we describe computational methods for Bayesian inference and model selection for generalized linear models, which improve on existing techniques. These methods are applied to the building of flexible models for gamma ray count data (data measuring the natural radioactivity of rocks) at the Castlereagh Waste Management Centre (CWMC), which served as a hazardous waste disposal facility for the Sydney region between March 1978 and August 1998. Bayesian model selection methods for generalized linear models enable us to approach problems of smoothing, change point detection and spatial prediction for these data within a common methodological and computational framework, by considering appropriate basis expansions of a mean function.
The data at Castlereagh were collected in the following way. A number of boreholes were drilled at the site, and for each borehole a gamma ray detector recorded gamma ray emissions at different depths as the detector was raised gradually from the bottom of the borehole to ground level. The profile of intensity of gamma counts can be informative about the geology at each location, and estimation of intensity profiles raises problems of smoothing and change point detection for count data. The gamma count profiles can also be modelled spatially, to inform the geological profile across the site. Understanding the geological structure of the site is important for modelling the transport of chemical contaminants beneath the waste disposal area.
The structure of the thesis is as follows. Chapter 1 describes the Castlereagh hazardous waste site and the geophysical data, which motivated the methodology developed in this research. We summarise the principles of Gamma Ray (GR) logging, a method routinely employed by geophysicists and environmental engineers in the detailed evaluation of hazardous site geology, and detail the use of the Castlereagh data in this research. In Chapter 2 we review some fundamental ideas of Bayesian inference and computation and discuss them in the context of generalized linear models. Chapter 3 details the theoretical basis of our work. Here we give a new Markov chain Monte Carlo sampling scheme for Bayesian variable selection in generalized linear models, which is analogous to the well-known Swendsen-Wang algorithm for the Ising model. Special cases of this sampling scheme are used throughout the rest of the thesis.
In Chapter 4 we discuss the use of methods for Bayesian model selection in generalized linear models in two specific applications, which we implement on the Castlereagh data. First, we consider smoothing problems where we flexibly estimate the dependence of a response variable on one or more predictors, and we apply these ideas to locally adaptive smoothing of gamma ray count data. Second, we discuss how the problem of multiple change point detection can be cast as one of model selection in a generalized linear model, and consider application to change point detection for gamma ray count data.
In Chapter 5 we consider spatial models based on partitioning a spatial region of interest into cells via a Voronoi tessellation, where the number of cells and the positions of their centres are unknown, and show how these models can be formulated in the framework of established methods for Bayesian model selection in generalized linear models. We apply the spatial partition modelling approach to the spatial analysis of gamma ray data, showing how the posterior distribution of the number of cells, cell centres and cell means provides us with an estimate of the mean response function describing spatial variability across the site.
Chapter 6 presents some conclusions and suggests directions for future research. A paper based on the work of Chapter 3 has been accepted for publication in the Journal of Computational and Graphical Statistics, and a paper based on the work in Chapter 4 has been accepted for publication in Mathematical Geology. A paper based on the spatial modelling of Chapter 5 is in preparation and will be submitted for publication shortly.
The work in this thesis was collaborative, to a greater or lesser extent in its various components. I authored Chapters 1 and 2 entirely, including the definition of the problem in the context of the CWMC site, data gathering and preparation for analysis, and the review of the literature on computational methods for Bayesian inference and model selection for generalized linear models. I also authored Chapters 4 and 5, and benefited from Dr Nott's assistance in developing the algorithms. In Chapter 3, Dr Nott led the development of sampling scheme B (corresponding to having non-zero interaction parameters in our Swendsen-Wang type algorithm). I developed the algorithm for sampling scheme A (corresponding to setting all interaction parameters to zero in our Swendsen-Wang type algorithm), and performed the comparison of the performance of the two sampling schemes. The final discussion in Chapter 6 and the directions for further research in the case study context are also my work.
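To illustrate the idea, mentioned in the abstract, that multiple change point detection can be cast as model selection in a generalized linear model through a basis expansion of the mean function, here is a minimal sketch. It is not the thesis's method: it replaces the Markov chain Monte Carlo sampling scheme with exhaustive enumeration scored by BIC, and the counts, depths, candidate knots and all function names are hypothetical, chosen only for illustration. The sketch assumes a Poisson model with log link, in which an intercept plus step-function columns 1[depth >= knot] spans the piecewise-constant means, so the fitted mean in each segment is the segment average.

```python
import math
from itertools import combinations


def step_basis(depths, knots):
    """Design matrix: an intercept column plus one step column 1[depth >= knot] per knot."""
    return [[1.0] + [1.0 if d >= k else 0.0 for k in knots] for d in depths]


def max_poisson_loglik(counts, depths, active_knots):
    """Maximised Poisson log-likelihood for a mean that is constant between active knots.

    With the step basis above and a log link, the maximum likelihood fit in
    each segment is the segment's sample mean.
    """
    segments = {}
    for y, d in zip(counts, depths):
        # Segment index = number of active knots at or below this depth.
        seg = sum(1 for k in active_knots if d >= k)
        segments.setdefault(seg, []).append(y)
    loglik = 0.0
    for ys in segments.values():
        mu = sum(ys) / len(ys)
        for y in ys:
            # y*log(mu) - mu - log(y!), treating 0*log(0) as 0.
            loglik += (y * math.log(mu) if y > 0 else 0.0) - mu - math.lgamma(y + 1)
    return loglik


def bic(counts, depths, active_knots):
    """BIC for one model; assumes every segment contains at least one observation."""
    n_params = 1 + len(active_knots)  # intercept plus one coefficient per step
    loglik = max_poisson_loglik(counts, depths, active_knots)
    return -2.0 * loglik + n_params * math.log(len(counts))


def best_change_points(counts, depths, candidate_knots):
    """Score every subset of candidate knots and return the one with the lowest BIC."""
    models = [
        subset
        for r in range(len(candidate_knots) + 1)
        for subset in combinations(candidate_knots, r)
    ]
    return min(models, key=lambda m: bic(counts, depths, m))


# Hypothetical gamma ray counts along one borehole (not the Castlereagh data).
depths = list(range(12))
counts = [9, 11, 10, 12, 8, 10, 31, 29, 30, 33, 28, 30]
best = best_change_points(counts, depths, [3, 6, 9])
print(best)
```

Each subset of knots corresponds to one choice of which step columns enter the linear predictor, so choosing change points is the same problem as choosing variables. For the toy data above the selected model has a single change point at depth 6. Enumeration is only feasible for a handful of candidate knots, which is why a sampling scheme over models is needed in realistic settings.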
Identifier | oai:union.ndltd.org:ADTP/187846 |
Date | January 2003 |
Creators | Leonte, Daniela, School of Mathematics, UNSW |
Publisher | Awarded by: University of New South Wales. School of Mathematics |
Source Sets | Australasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | Copyright Daniela Leonte, http://unsworks.unsw.edu.au/copyright |