Global ETD Search

1	Novel Structural Properties and An Improved Bound for the Number Distinct Squares in a Strings Thierry, Adrien January 2016 (has links) Combinatorics on words explore words – often called strings in the com- puter science community, or monoids in mathematics – and their structural properties. One of the most studied question deals with repetitions which are a form of redundancy. The thesis focuses on estimating the maximum number of distinct squares in a string of length n. Our approach is to study the combinatorial properties of these overlapping structures, nested systems, and obtain insights into the intricate patterns that squares create. Determin- ing the maximum number of repetitions in a string is of interest in different fields such as biology and computer science. For example, the question arrises when one tries to bound the number of repetitions in a gene or in a computer file to be data compressed. Specific strings containing many repetitions are often of interest for additional combinatorial properties. After a brief review of earlier results and an introduction to the question of bounding the maxi- mum number of distinct squares, we present the combinatorial insights and techniques used to obtain the main result of the thesis: a strengthening of the universal upper bound obtained by Fraenkel and Simpson in 1998. / Thesis / Doctor of Philosophy (PhD) Combinatorics on Words free monoid String Algorithms
2	An Extension of The Berry-Ravindran Algorithm for protein and DNA data Riekkola, Jesper January 2022 (has links) String matching algorithms are the algorithms used to search through different types of text in search of a certain pattern. Many of these algorithms achieve their impressive performance by analysing the pattern and saving that information. That information is then continuously used during the searching phase to know what parts of the text can be skipped. One such algorithm is the Berry-Ravindran. The Berry-Ravindran checks the two characters past the current try for a match and sees if those characters exist in the pattern. This thesis compares the Berry-Ravindran algorithm to new versions of itself that check three and four characters instead of two, along with the Boyer-Moore algorithm. Checking more characters improves the amount of the text that can be skipped by reducing the number of attempts needed but exponentially increases the pre-processing time. The improved performance in attempts does not necessarily mean a faster run-time because of the increased pre-processing time. The variable impacting the pre-processing time the biggest is the size of the alphabet that the text uses. This is researched by testing these algorithms with patterns ranging from 4 to 100 characters long on two different data sets. Protein data which has an alphabet size of 27 and DNA data which has an alphabet size of 4. Berry-Ravindran text matching string matching exact-matching string algorithms pattern matching Berry-Ravindran textmatchning strängmatchning Engineering and Technology Teknik och teknologier
3	A Generalization of Square-free Strings Mhaskar, Neerja January 2016 (has links) Our research is in the general area of String Algorithms and Combinatorics on Words. Specifically, we study a generalization of square-free strings, shuffle properties of strings, and formalizing the reasoning about finite strings. The existence of infinitely long square-free strings (strings with no adjacent repeating word blocks) over a three (or more) letter finite set (referred to as Alphabet) is a well-established result. A natural generalization of this problem is that only subsets of the alphabet with predefined cardinality are available, while selecting symbols of the square-free string. This problem has been studied by several authors, and the lowest possible bound on the cardinality of the subset given is four. The problem remains open for subset size three and we investigate this question. We show that square-free strings exist in several specialized cases of the problem and propose approaches to solve the problem, ranging from patterns in strings to Proof Complexity. We also study the shuffle property (analogous to shuffling a deck of cards labeled with symbols) of strings, and explore the relationship between string shuffle and graphs, and show that large classes of graphs can be represented with special type of strings. Finally, we propose a theory of strings, that formalizes the reasoning about finite strings. By engaging in this line of research, we hope to bring the richness of the advanced field of Proof Complexity to Stringology. / Thesis / Doctor of Philosophy (PhD) Strings Repetitions Square-free Thue Morphisms Formalism for strings Proof Complexity String Algorithms String shuffle Graphs and string shuffle

1

Page generated in 0.0645 seconds