Many classification algorithms are designed on the assumption that the population of interest is stationary, i.e. it does not change over time. However, there are many real-world problems where this assumption is not appropriate. In this thesis, we develop a classifier for non-stationary populations which is based on a multiple logistic model for the conditional class probabilities and incorporates a linear combination of the outputs of a number of pre-determined component classifiers. The final classifier is able to adjust to changes in the population by sequential updating of the coefficients of the linear combination, which are the parameters of the model. The model we use is motivated by the relatively good classification performance which has been achieved by classification rules based on combining classifier outputs. However, in some cases such classifiers can also perform relatively poorly, and in general the mechanisms behind such results are little understood. For the model we propose, which is a generalisation of several existing models for stationary classification problems, we show there exists a simple relationship between the component classifiers which are used, the sign of the parameters and the decision boundaries of the final classifier. This relationship can be used to guide the choice of component classifiers, and helps with understanding the conditions necessary for the classifier to perform well. We compare several "on-line" algorithms for implementing the classification model, where the classifier is updated as new labelled observations become available. The predictive approach to classification is adopted, so each algorithm is based on updating the posterior distribution of the parameters as new information is received. Specifically, we compare a method which assumes the posterior distribution is Gaussian, a more general method developed for the class of Dynamic Generalised Linear Models, and a method based on a sequential Monte Carlo approximation of the posterior. The relationship between the model used for parameter evolution, the bias of the parameter estimates and the error of the classifier is explored.
Identifer | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:497123 |
Date | January 2008 |
Creators | Tomas, Amber Nede |
Contributors | Ripley, Brian D. |
Publisher | University of Oxford |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://ora.ox.ac.uk/objects/uuid:0a0273fa-9d47-4626-a758-6b5f03722cd0 |
Page generated in 0.0014 seconds