Patterns in Words Related to DNA Rearrangements

Patterns, that is, sequences of variables, have traditionally been studied only when their morphic images appear as factors in words. In this thesis, we initiate a study of patterns that appear as subwords of words. We say that a pattern appears in a word if each pattern variable can be morphically mapped to a factor of the word. To gain insight into the complexity of words and the similarities between them, we define pattern indices and distances between two words relative to a given set of patterns. The distance is defined as the minimum number of pattern insertions and/or removals that transform one word into another. The pattern index is defined as the minimum number of pattern removals that transform a given word into the empty word. We first consider pattern distances between arbitrary words. We conjecture that the word distance is computable relative to the pattern αα and prove a lemma in this direction. Motivated by patterns detected in certain scrambled ciliate genomes, we then focus on double occurrence words (words in which every symbol appears exactly twice) and consider recursive patterns, a generalization of the notion of a pattern that includes new types of words. We show that, in double occurrence words, the distance relative to so-called complete sets of recursive patterns is computable. In particular, the pattern distance relative to the patterns αα (repeat words) and αα^R (return words) is computable for double occurrence words. We conclude by applying pattern indices and word distances to the analysis of highly scrambled genes in O. trifallax and discover a common pattern.
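
To make the definition of the pattern index concrete, here is a minimal brute-force sketch in Python. It is not the thesis's algorithm. It assumes that a word is a plain string, and that one "removal" deletes two non-overlapping factors u and u (a repeat word) or u and its reversal (a return word), which may be separated by other symbols. The thesis may restrict removals further, for example to maximal patterns. The function names are hypothetical.

```python
from collections import Counter, deque


def is_double_occurrence(word):
    """True if every symbol of `word` appears exactly twice."""
    return all(count == 2 for count in Counter(word).values())


def removals(word):
    """Return all words obtained by deleting one repeat (u ... u) or
    return (u ... reversed u) pattern occurrence from `word`.

    Assumption: the two factors must not overlap but may be separated.
    """
    n = len(word)
    results = set()
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            u = word[i:i + length]
            for j in range(i + length, n - length + 1):
                v = word[j:j + length]
                if v == u or v == u[::-1]:
                    results.add(word[:i] + word[i + length:j] + word[j + length:])
    return results


def pattern_index(word):
    """Minimum number of removals reducing `word` to the empty word,
    found by breadth-first search (exponential in the worst case)."""
    if not is_double_occurrence(word):
        raise ValueError("expected a double occurrence word")
    seen = {word}
    queue = deque([(word, 0)])
    while queue:
        current, steps = queue.popleft()
        if not current:
            return steps
        for shorter in removals(current):
            if shorter not in seen:
                seen.add(shorter)
                queue.append((shorter, steps + 1))
    # Unreachable for double occurrence words: a single symbol and its
    # second occurrence always form a removable pattern of length one.
    raise RuntimeError("no reduction found")


if __name__ == "__main__":
    print(pattern_index("abab"))  # one repeat word: ab ab
    print(pattern_index("abba"))  # one return word: ab ba
    print(pattern_index("aabb"))  # needs two removals: aa, then bb
```

Under these assumptions the search always terminates, because each removal shortens the word and leaves it a double occurrence word. The search is only practical for short words; it illustrates the definition rather than the computability results stated in the abstract.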

Identifier: oai:union.ndltd.org:USF/oai:scholarcommons.usf.edu:etd-8109
Date: 30 June 2017
Creators: Nabergall, Lukas
Publisher: Scholar Commons
Source Sets: University of South Florida
Detected Language: English
Type: text
Format: application/pdf
Source: Graduate Theses and Dissertations