Return to search

Robustness and generalisation : tangent hyperplanes and classification trees

The issue of robust training is tackled for fixed multilayer feedforward architectures. Several researchers have proved the theoretical capabilities of Multilayer Feedforward networks but in practice the robust convergence of standard methods like standard backpropagation, conjugate gradient descent and Quasi-Newton methods may be poor for various problems. It is suggested that the common assumptions about the overall surface shape break down when many individual component surfaces are combined and robustness suffers accordingly. A new method to train Multilayer Feedforward networks is presented in which no particular shape is assumed for the surface and where an attempt is made to optimally combine the individual components of a solution for the overall solution. The method is based on computing Tangent Hyperplanes to the non-linear solution manifolds. At the core of the method is a mechanism to minimise the sum of squared errors and as such its use is not limited to Neural Networks. The set of tests performed for Neural Networks show that the method is very robust regarding convergence of training and has a powerful ability to find good directions in weight space. Generalisation is also a very important issue in Neural Networks and elsewhere. Neural Networks are expected to provide sensible outputs for unseen inputs. A framework for hyperplane based classifiers is presented for improving average generalisation. The framework attempts to establish a trained boundary so that there is an optimal overall spacing from the boundary to training points closest to this boundary. The framework is shown to provide results consistent with the theoretical expectations.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:741907
Date January 1997
CreatorsFernandes, Antonio Ramires
ContributorsWeir, Michael
PublisherUniversity of St Andrews
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://hdl.handle.net/10023/13468

Page generated in 0.0025 seconds