Synthesizing Regularity Exposing Attributes in Large Protein Databases

This thesis describes a system that synthesizes regularity exposing attributes from large protein databases. After processing primary and secondary structure data, the system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16-bit representation achieves an accuracy on the secondary structure prediction problem comparable to that of a neural network trained using the standard 24-bit amino acid representation. In addition, the thesis describes bounds on secondary structure prediction accuracy, derived using an optimal learning algorithm and the probably approximately correct (PAC) model.

representation reformulation

secondary structure prediction

genetic algorithms

neural networks

clustering algorithm

decision tree systems

Identifier	oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/6789
Date	01 May 1993
Creators	de la Maza, Michael
Source Sets	M.I.T. Theses and Dissertation
Language	en_US
Detected Language	English
Format	90 p., 204397 bytes, 794429 bytes, application/octet-stream, application/pdf
Relation	AITR-1444
