Global ETD Search

Studies in protein secondary structure prediction with neural network models

The aim of this work was to predict protein secondary structure using neural network models. Initially a Hopfield network was used, but it was abandoned in favour of a layered network trained using the back-propagation algorithm. In the early stages of this work, many different approaches to the problem were explored. These included attempts to predict boundaries between secondary structures, the secondary structures of individual residues, and the secondary structures of sequences lying wholly within a particular secondary structure. Results indicated the last of these to be the best approach to continue with. In addition, two coding schemes were investigated: one based on occurrences of pairs of residues and one based on the positions of residues. The positional coding scheme was found to be the natural coding scheme for this problem.

On segments of whole alpha-helix and whole non-alpha-helix 10 residues in length, a prediction success of around 80% with a correlation coefficient of 0.52 was achieved with the positional coding scheme. On whole proteins, where predictions are made for individual residues, alpha-helix prediction success drops to 73% with a correlation coefficient of 0.34. The relative predictability of alpha-helices of above- and below-average accessibility was also investigated, showing that those of above-average accessibility are more predictable than those of below-average accessibility.

The main body of this work concerns the apparent limit of predictability of alpha-helices. Test set prediction was found not to depend on the number of hidden nodes. In fact, a single-layer network performed as well as those with hidden nodes, showing that the problem is basically linearly separable. In addition, prediction success plateaus well below perfect prediction. During training, test set prediction success is seen to peak and then decrease. The decrease was found to be due to non-alpha-helix sequences that the network was unable to distinguish from real alpha-helix sequences. These regions of non-alpha-helix were shown, with statistical significance, to occur adjacent to actual alpha-helices. It is proposed that potential alpha-helices are disrupted by global constraints during the formation of tertiary structure. The effect of window size was also investigated, as was beta-sheet prediction, although the latter was found to be limited by the small number of examples available with our approach. However, the distribution of beta-sheet in the input space in relation to alpha-helix and coil was determined.
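
The positional coding scheme and the correlation coefficient mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact encoding or the formula used, so this assumes a one-hot code per residue position (20 inputs per position) and the Matthews correlation coefficient commonly reported in secondary structure prediction. The names `positional_encode` and `correlation_coefficient` are hypothetical and not taken from the thesis.

```python
import math

# Assumption: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def positional_encode(segment):
    """One-hot encode a residue segment position by position.

    Each residue contributes 20 inputs, exactly one of which is 1,
    so a 10-residue segment yields a 200-element input vector.
    """
    vector = []
    for residue in segment:
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1
        vector.extend(one_hot)
    return vector


def correlation_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumption: this is the measure behind the reported 0.52 and 0.34;
    the abstract itself does not name the formula.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical 10-residue segment, used only to show the input shape.
inputs = positional_encode("AEELLKKAEE")
print(len(inputs), sum(inputs))
```

With a code of this kind, a single-layer network amounts to one weight per (position, residue) pair plus a bias, which is the sense in which a linearly separable problem needs no hidden nodes.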

http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.652275

573.8

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:652275
Date	January 1991
Creators	Hayward, Steven John
Publisher	University of Edinburgh
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	http://hdl.handle.net/1842/14034
