Multi-layer Perceptron Error Surfaces: Visualization, Structure and Modelling

The Multi-Layer Perceptron (MLP) is one of the most widely applied and researched Artificial Neural Network models. MLP networks are normally applied to supervised learning tasks, which involve iterative training methods that adjust the connection weights within the network. This is commonly formulated as a multivariate non-linear optimization problem over a very high-dimensional space of possible weight configurations. By analogy with mathematical optimization, training an MLP is often described as a search of an error surface for a weight vector which gives the smallest possible error value. Although this is a useful way of viewing the training process, there are many problems associated with using the error surface to understand the behaviour of learning algorithms and the properties of MLP mappings themselves. Because of the high dimensionality of the system, many existing methods of analysis are not well-suited to this problem. Visualizing and describing the error surface are also nontrivial and problematic. These problems are specific to complex systems such as neural networks, which contain large numbers of adjustable parameters, and the investigation of such systems in this way is largely a developing area of research.

In this thesis, the concept of the error surface is explored using three related methods. Firstly, Principal Component Analysis (PCA) is proposed as a method for visualizing the learning trajectory followed by an algorithm on the error surface. It is found that PCA provides an effective method for performing such a visualization, as well as providing an indication of the significance of individual weights to the training process. Secondly, sampling methods are used to explore the error surface and to measure certain of its properties, providing the data needed for an intuitive description of the error surface. A number of practical MLP error surfaces are found to contain a high degree of ultrametric structure, in common with other known configuration spaces of complex systems. Thirdly, a class of global optimization algorithms is developed, centred on the construction and evolution of a model of the error surface (or search space) as an integral part of the optimization process. The relationships between this algorithm class, the Population-Based Incremental Learning algorithm, evolutionary algorithms and cooperative search are discussed.

The work provides important practical techniques for exploring the error surfaces of MLP networks. These techniques can be used to examine the dynamics of different training algorithms and the complexity of MLP mappings, and to build an intuitive description of the nature of the error surface. The configuration spaces of other complex systems are also amenable to many of these techniques. Finally, the algorithmic framework provides a powerful paradigm for visualization of the optimization process and for the development of parallel, coupled optimization algorithms which apply knowledge of the error surface to solving the optimization problem.
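As a rough illustration of the first of these techniques, the sketch below records the full weight vector of a small MLP at every training step and projects the resulting trajectory onto its first two principal components. It is a minimal sketch of the PCA-trajectory idea only, not the thesis's code: the toy XOR task, the 2-2-1 network, the learning rate and the plain gradient-descent loop are all illustrative assumptions.

```python
# Minimal sketch: PCA projection of an MLP weight trajectory.
# All task and network choices below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data (XOR) and a small 2-2-1 MLP trained by plain gradient descent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 0.5, (2, 2)); b1 = np.zeros(2)
W2 = rng.normal(0, 0.5, (2, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
trajectory = []                          # one flattened weight vector per epoch
for epoch in range(2000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass for a squared-error loss
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)
    trajectory.append(np.concatenate([W1.ravel(), b1, W2.ravel(), b2]))

T = np.array(trajectory)                 # shape: (epochs, n_weights)
T_centered = T - T.mean(axis=0)

# PCA via SVD of the centred trajectory matrix: rows of Vt are the
# principal directions in weight space; large loadings indicate weights
# that vary most along the dominant directions of the trajectory.
U, S, Vt = np.linalg.svd(T_centered, full_matrices=False)
proj = T_centered @ Vt[:2].T             # trajectory in the first two PCs
explained = S**2 / np.sum(S**2)          # variance captured per component

print("variance explained by PC1, PC2:", explained[:2])
print("start -> end in PC space:", proj[0], proj[-1])
```

Plotting `proj` gives a two-dimensional picture of the path taken through weight space, which is the kind of low-dimensional view of the learning trajectory that the abstract refers to.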

Identifier: oai:union.ndltd.org:ADTP/253803
Creators: Gallagher, Marcus Reginald
Source Sets: Australasian Digital Theses Program
Detected Language: English
