
The statistical mechanics of Bayesian model selection

In this thesis we examine the question of model selection in systems which learn input-output mappings from a data set of examples. The models we consider are inspired by feed-forward architectures used within the artificial neural networks community. The approach taken here is to elucidate the properties of various model selection criteria by calculating relevant quantities derived in a Bayesian framework. These calculations assume that examples are generated from some underlying rule, or teacher, by randomly sampling the input space, and they are performed using techniques borrowed from statistical mechanics. This allows different criteria to be compared on the basis of the resultant ability of the system to generalize to novel examples. Broadly stated, the model selection problem is the following: given only a limited set of examples, which model, or student, should one choose from a set of candidates in order to achieve the highest level of generalization?
We consider four model selection criteria: a penalty-based method utilising a quantity derived from Bayesian statistics termed the evidence; two methods based on estimates of the generalization performance, namely the test error and the cross-validation error; and a fourth, less widely used method based on the noise sensitivity of the models. In a simple scenario we demonstrate that model selection based on the evidence is susceptible to misspecification of the student. Our analysis is conducted in the thermodynamic limit, where the system size is taken to be arbitrarily large. In particular, we examine the evidence procedure assignments of the hyperparameters which control the learning algorithm. We find that, where the student is not sufficiently powerful to fully model the teacher, this procedure is sub-optimal but nevertheless remarkably robust towards such misspecification. In a scenario in which the student is more than able to represent the teacher, we find that the evidence procedure is optimal.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:657315
Date: January 1996
Creators: Marion, Glenn
Publisher: University of Edinburgh
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/1842/15264
