<p>Identifying motifs that are "close" to one or more substrings in each sequence of a given set of sequences, and that hence characterize that set, is an important problem in computational biology. The target motif identification problem requires motifs that characterize one given set of sequences but are far from every substring in another given set of sequences. This problem is NP-hard and hence is unlikely to have efficient algorithms that find optimal solutions. In this thesis, we propose a set of modifications to one of the most popular stochastic heuristics for finding motifs, Gibbs Sampling [LAB+93], which allow this heuristic to detect target motifs. We also present the results of four simulation studies and tests on real protein datasets, which suggest that these modified heuristics are very good at (and are even, in some cases, necessary for) detecting target motifs.</p> / Thesis / Master of Science (MSc)
Identifier | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/22706 |
Date | 12 August 1999 |
Creators | Zhang, Xian |
Contributors | Jiang, Tao, Computer Science |
Source Sets | McMaster University |
Language | en_US |
Detected Language | English |
Type | Thesis |