Return to search

Algorithms and complexity for annotated sequence analysis

Molecular biologists use algorithms that compare and otherwise analyze sequences that represent genetic and protein molecules. Most of these algorithms, however, operate on the basic sequence and do not incorporate the additional information that is often known about the molecule and its pieces. This research describes schemes to combinatorially annotate this information onto sequences so that it can be analyzed in tandem with the sequence; the overall result would thus reflect both types of information about the sequence. These annotation schemes include adding colours and arcs to the sequence. Colouring a sequence would produce a same-length sequence of colours or other symbols that highlight or label parts of the sequence. Arcs can be used to link sequence symbols (or coloured substrings) to indicate molecular bonds or other relationships. Adding these annotations to sequence analysis problems such as sequence alignment or finding the longest common subsequence can make the problem more complex, often depending on the complexity of the annotation scheme. This research examines the different annotation schemes and the corresponding problems of verifying annotations, creating annotations, and finding the longest common subsequence of pairs of sequences with annotations. This work involves both the conventional complexity framework and parameterized complexity, and includes algorithms and hardness results for both frameworks. Automata and transducers are created for some annotation verification and creation problems. Different restrictions on layered substring and arc annotation are considered to determine what properties an annotation scheme must have to make its incorporation feasible. Extensions to the algorithms that use weighting schemes are explored. schemes are explored. / Graduate

Identiferoai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/8864
Date14 December 2017
CreatorsEvans, Patricia Anne
ContributorsFellows, Michael R., Ruskey, Frank
Source SetsUniversity of Victoria
LanguageEnglish, English
Detected LanguageEnglish
TypeThesis
Formatapplication/pdf
RightsAvailable to the World Wide Web

Page generated in 0.0025 seconds