Classification context in a machine learning approach to predicting protein secondary structure

An important problem in molecular biology is to predict the secondary
structure of proteins from their primary structure. The primary structure of a
protein is the sequence of amino acid residues. The secondary structure is an
abstract description of the shape of the folded protein, with regions identified
as alpha helix, beta strand, or random coil. Existing methods of secondary
structure prediction examine a short segment of the primary structure and predict
the secondary structure class (alpha, beta, coil) of an individual residue centered in
that segment. The last few years of research have failed to improve these methods
beyond the level of 65% correct predictions.
This thesis investigates whether these methods can be improved by permitting
them to examine externally-supplied predictions for the secondary structure
of other residues in the segment. The externally-supplied predictions are called
the "classification context," because they provide contextual information about
the secondary structure classifications of neighboring residues. The classification
context could be provided by an existing algorithm that made initial secondary
structure predictions, and then these could be taken as input by a second algorithm
that would attempt to improve the predictions.
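
As a rough illustration of the idea, the input to such a second algorithm could pair each residue in the segment with the class predicted for it by the first algorithm. The sketch below is not taken from the thesis: the function name `window_with_context`, the single-letter class codes (H, E, C), the padding symbol, and the default window size are all assumptions made for the example. The prediction for the center residue is masked, since the context is meant to describe the other residues in the segment.

```python
from typing import List


def window_with_context(sequence: str, context: str, center: int,
                        half_width: int = 6) -> List[str]:
    """Build the input tokens for one residue (hypothetical encoding).

    sequence:   primary structure, one letter per amino acid residue
    context:    externally supplied class per residue (assumed codes:
                H = alpha helix, E = beta strand, C = coil)
    center:     index of the residue whose class is to be predicted
    half_width: number of neighbors taken on each side (assumed value)
    """
    if len(sequence) != len(context):
        raise ValueError("sequence and context must have the same length")
    tokens = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(sequence):
            residue = sequence[i]
            # Hide the context of the center residue itself: only the
            # classifications of neighboring residues are provided.
            label = "?" if i == center else context[i]
        else:
            # Positions past either end of the protein are padded.
            residue, label = "-", "-"
        tokens.append(residue + label)
    return tokens
```

For instance, `window_with_context("ACDEF", "HHECC", 2, half_width=1)` yields one token per position in a three-residue segment, each combining the residue with its neighbor's predicted class.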
A series of experiments on both real and simulated classification context
was performed to measure the possible improvement that could be obtained from
classification context. The results showed that the classification context provided
by current algorithms does not yield improved performance when used as input by
those same algorithms. However, if the classification context is generated by randomly
damaging the correct classifications, substantial performance improvements
are possible. Even small amounts of randomly damaged correct context improve
performance.
Graduation date: 1994
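
Simulated classification context of this kind can be sketched as follows. The abstract does not specify how the correct classifications were damaged, so the noise model here is an assumption: each position is independently replaced, with probability `error_rate`, by one of the other two classes chosen uniformly. The name `damage_labels`, the class codes, and the `seed` parameter are likewise hypothetical.

```python
import random


def damage_labels(true_labels: str, error_rate: float,
                  classes: str = "HEC", seed: int = 0) -> str:
    """Return a copy of `true_labels` with random errors injected.

    Assumed noise model: each label is independently replaced, with
    probability `error_rate`, by a different class drawn uniformly.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    damaged = []
    for label in true_labels:
        if rng.random() < error_rate:
            # Pick any class other than the correct one.
            wrong = [c for c in classes if c != label]
            damaged.append(rng.choice(wrong))
        else:
            damaged.append(label)
    return "".join(damaged)
```

Sweeping `error_rate` from 0 toward 1 would give context of decreasing accuracy, which is one way to explore how much damage the context can sustain while still being useful.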

Identifier: oai:union.ndltd.org:ORGSU/oai:ir.library.oregonstate.edu:1957/35946
Date: 13 May 1993
Creators: Langford, Bill T.
Contributors: Dietterich, Thomas G.
Source Sets: Oregon State University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation
