Biologically Relevant Multiple Sequence Alignment

Researchers use multiple sequence alignment algorithms to detect conserved regions in genetic sequences and to identify drug docking sites for drug development. This dissertation presents a novel algorithm that uses physicochemical properties to increase the accuracy of multiple sequence alignments. Secondary structures, together with their locations, are also incorporated into the evaluation function. Multiple properties are combined using weights determined from the accuracy of protein secondary structure predictions made with artificial neural networks. A new metric, the PPD Score, is developed to capture the average change in physicochemical properties. Using physicochemical properties and secondary structures in multiple sequence alignment yields alignments that are more accurate, more biologically relevant, and more useful for drug development and other medical applications. In addition to the novel multiple sequence alignment algorithm, we propose a new protein-coding DNA reference alignment database. This database is a collection of multiple sequence alignment data sets derived from tertiary structural alignments. Its primary purpose is to benchmark new and existing multiple sequence alignment algorithms with DNA data. This work also includes the first known comparative study of protein-coding DNA alignment accuracies.

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-2592
Date: 21 August 2008
Creators: Carroll, Hyrum D.
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: http://lib.byu.edu/about/copyright/
