
Minimum description length, regularisation and multi-modal data

Conventional feed-forward neural networks have typically been trained with the sum-of-squares cost function. A new cost function is presented here with a description-length interpretation based on Rissanen's Minimum Description Length principle. It is a heuristic with a rough interpretation as the number of data points fitted by the model. Rather than seeking optimal descriptions, the cost function forms minimum descriptions in a naive way for computational convenience, and is accordingly called the Naive Description Length cost function. Finding minimum-description models is shown to be closely related to identifying clusters in the data. As a consequence, the minimum of this cost function approximates the most probable mode of the data, whereas the minimum of the sum-of-squares cost function approximates the mean. The new cost function is shown to provide information about the structure of the data, obtained by inspecting the dependence of the error on the amount of regularisation. This structure yields a method for selecting regularisation parameters as an alternative or supplement to Bayesian methods. The new cost function is tested on a number of multi-valued problems, such as a simple inverse kinematics problem, as well as on a number of classification and regression problems. The mode-seeking property of the cost function is shown to improve prediction in time-series problems. Finally, description-length principles are used in a similar fashion to derive a regulariser that controls network complexity.
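The thesis itself defines the Naive Description Length cost function; as a rough illustration of the mean-versus-mode distinction described in the abstract, the following minimal Python sketch compares a sum-of-squares cost against a smoothed "number of points fitted" score on bimodal targets. The score used here (a Gaussian kernel count) and all names are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bimodal targets for a single input: two "branches" of a multi-valued
# mapping, as in an inverse kinematics problem. 70% of the targets
# cluster near +1.0, 30% near -1.0.
targets = np.concatenate([
    rng.normal(1.0, 0.05, 700),
    rng.normal(-1.0, 0.05, 300),
])

# Candidate predictions for this input, swept over a grid.
candidates = np.linspace(-2.0, 2.0, 2001)

# Sum-of-squares cost: minimised by the conditional mean (about 0.4
# here), which lies between the two clusters and fits neither branch.
sse = ((candidates[:, None] - targets[None, :]) ** 2).sum(axis=1)
print("sum-of-squares minimiser:", candidates[sse.argmin()])

# Smoothed point-counting score (an assumption for illustration): each
# target within roughly sigma of the prediction contributes about one
# unit, so as sigma shrinks the score counts points fitted. Maximising
# it selects the densest cluster, i.e. the most probable mode (~1.0).
sigma = 0.1
fitted = np.exp(
    -0.5 * ((candidates[:, None] - targets[None, :]) / sigma) ** 2
).sum(axis=1)
print("mode-seeking minimiser:", candidates[fitted.argmax()])
```

Run as-is, the first minimiser lands between the clusters while the second lands on the dominant one, which is the qualitative behaviour the abstract attributes to the sum-of-squares and Naive Description Length costs respectively.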

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:388578
Date: January 1995
Creators: Van der Rest, John C.
Publisher: Aston University
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://publications.aston.ac.uk/7986/
