Global ETD Search

11	Automatic Sequences and Decidable Properties: Implementation and Applications Goc, Daniel January 2013 In 1912 Axel Thue sparked the study of combinatorics on words when he showed that the Thue-Morse sequence contains no overlaps, that is, factors of the form ayaya, where a is a letter and y is a (possibly empty) word. Since then, many interesting properties of sequences have been discovered and studied. In this thesis, we consider a class of infinite sequences generated by automata, called the k-automatic sequences. In particular, we present a logical theory in which many properties of k-automatic sequences can be expressed as predicates, and we show that such predicates are decidable. Our main contribution is the implementation of a theorem prover capable of practically characterizing many commonly sought-after properties of k-automatic sequences. We showcase a panoply of results achieved using our method. We give new explicit descriptions of the recurrence and appearance functions of a list of well-known k-automatic sequences. We define a related function, called the condensation function, and give explicit descriptions for it as well. We re-affirm known results on the critical exponent of some sequences and determine it for others where it was previously unknown. On the more theoretical side, we show that the subword complexity p(n) of k-automatic sequences is k-synchronized, i.e., the language of pairs (n, p(n)) (expressed in base k) is accepted by an automaton. Furthermore, we prove that the Lyndon factorization of k-automatic sequences is also k-automatic and explicitly compute the factorization for several sequences. Finally, we show that while the number of unbordered factors of length n is not k-synchronized, it is k-regular. automatic sequences combinatorics on words decidable properties Computer Science
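As an illustration of the overlap-freeness claim in the abstract above, the following Python sketch builds a prefix of the Thue-Morse sequence and searches it for overlaps by brute force. The names `thue_morse` and `has_overlap` are illustrative and do not come from the thesis. A finite check like this only inspects a prefix; it is not the decision procedure that the thesis implements.

```python
def thue_morse(n):
    """First n terms of the Thue-Morse sequence: parity of the number of 1 bits of i."""
    return [bin(i).count("1") % 2 for i in range(n)]


def has_overlap(seq):
    """Return True if seq contains a factor of the form a y a y a.

    Such a factor has length 2p + 1 and period p, where p = len(a y).
    Equivalently, seq[i] == seq[i + p] holds for p + 1 consecutive positions i.
    """
    n = len(seq)
    for p in range(1, n // 2 + 1):
        run = 0  # length of the current run of positions with seq[i] == seq[i + p]
        for i in range(n - p):
            if seq[i] == seq[i + p]:
                run += 1
                if run >= p + 1:
                    return True
            else:
                run = 0
    return False


if __name__ == "__main__":
    print(has_overlap(thue_morse(512)))   # no overlap expected in the prefix
    print(has_overlap([0, 1, 0, 1, 0]))   # 01010 is an overlap with a = 0, y = 1
```

The inner loop tracks runs of matching positions instead of testing every factor separately, which keeps the check at roughly quadratic time in the prefix length.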
13	Combinatorial methods for counting pattern occurrences in a Markovian text Yucong Zhang 16 December 2020 In this dissertation, we provide combinatorial methods to obtain the probabilistic multivariate generating function that counts the occurrences of patterns in a text generated by a Markovian source. The generating function can then be expanded into a Taylor series in which the power of a term gives the size of a text and the coefficient provides the probabilities of all possible pattern occurrences for that text size. The analysis is based on the inclusion-exclusion approach to pattern counting (Goulden and Jackson, 1979 and 1983) and on its application by Bassino et al. (2012) to obtain the generating function for a Bernoulli text source. We follow the notation and concepts introduced by Bassino et al. in the discussion of distinguished patterns and non-reduced pattern sets, with modifications to account for the Markovian dependence. Our result is derived in the form of a linear matrix equation in which the number of linear equations depends on the size of the alphabet. In addition, we compute the moments of pattern occurrences and discuss the impact of a Markovian text on the moments compared with the Bernoulli case. The methodology that we use involves the inclusion-exclusion principle, stochastic recurrences, and combinatorics on words, including probabilistic multivariate generating functions and moment generating functions. Statistics Combinatorics on Words Pattern matching Stochastic recurrence inclusion-exclusion principle
14	Kombinatorika hashovacích funkcí (Combinatorics of Hash Functions) Sýkora, Jiří January 2012 In this thesis, we study hash functions. We focus mainly on the famous Merkle-Damgård construction and its generalisation. We show that even this generalised construction is not resistant to multicollision attacks. Combinatorics on words plays a fundamental role in the construction of our attack. We prove that regularities unavoidably appear in long words with a bounded number of symbol occurrences. We present our original results concerning regularities in long words. We lower some earlier published estimates, thus reducing the complexity of the attack. Our results show that generalised iterated hash functions are of theoretical rather than practical interest.
15	The Frobenius Problem in a Free Monoid Xu, Zhi January 2009 Given positive integers $c_1, c_2, \dots, c_k$ with $\gcd(c_1, c_2, \dots, c_k) = 1$, the Frobenius problem (FP) is to compute the largest integer $g(c_1, c_2, \dots, c_k)$ that cannot be written as a non-negative integer linear combination of $c_1, c_2, \dots, c_k$. The Frobenius problem in a free monoid (FPFM) is a non-commutative generalization of the Frobenius problem. Given words $x_1, x_2, \dots, x_k$ such that there are only finitely many words that cannot be written as concatenations of words in $\{x_1, x_2, \dots, x_k\}$, the FPFM is to find the longest such words. Unlike the FP, where the upper bound $g(c_1, c_2, \dots, c_k) \le \max_{1 \le i \le k} c_i^2$ is quadratic, the upper bound on the length of the longest words in the FPFM can be exponential in certain measures, and some of the exponential upper bounds are tight. For the 2FPFM, where the given words over $\Sigma$ are of only two distinct lengths $m$ and $n$ with $1 < m < n$, the length of the longest omitted words is at most $g(m, m|\Sigma|^{n-m} + n - m)$. In Chapter 1, I give the definition of the FP in integers and summarize some of the interesting properties of the FP. In Chapter 2, I give the definition of the FPFM and discuss some general properties of the FPFM. Then I mainly focus on the 2FPFM. I discuss the 2FPFM from different points of view and present two equivalent problems, one of which is about combinatorics on words and the other about the word graph. In Chapter 3, I discuss some variations on the FPFM and related problems, including input in other forms, bases with constant size, the case of infinite words, the case of concatenation with overlap, and the generalization of the local postage-stamp problem in a free monoid. In Chapter 4, I present the construction of some essential examples to complement the theory of the 2FPFM discussed in Chapter 2. The theory and examples of the 2FPFM are the main contribution of the thesis. In Chapter 5, I discuss the algorithms for and computational complexity of the FPFM and related problems. In the last chapter, I summarize the main results and list some open problems. Parts of the work in this thesis have appeared in papers. Frobenius problem free monoid co-finite Kleene-star combinatorics on words de Bruijn graph Computer Science (Software Engineering)
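To make the integer Frobenius problem in the abstract above concrete, here is a minimal Python sketch that computes the largest non-representable integer by dynamic programming. It relies on the quadratic upper bound quoted in the abstract to decide how far to search. The function name `frobenius_number` and the convention of returning -1 when every non-negative integer is representable are assumptions made for this illustration, not part of the thesis.

```python
from functools import reduce
from math import gcd


def frobenius_number(cs):
    """Largest integer not expressible as a non-negative combination of cs.

    Assumes the entries of cs are positive integers with gcd 1.
    Returns -1 if every non-negative integer is representable (e.g. 1 is in cs).
    """
    if not cs or any(c <= 0 for c in cs):
        raise ValueError("cs must be a non-empty list of positive integers")
    if reduce(gcd, cs) != 1:
        raise ValueError("gcd of cs must be 1, otherwise no largest gap exists")

    limit = max(cs) ** 2  # search bound taken from the quadratic upper bound
    reachable = [False] * (limit + 1)
    reachable[0] = True  # the empty combination represents 0
    for v in range(1, limit + 1):
        reachable[v] = any(v >= c and reachable[v - c] for c in cs)

    gaps = [v for v in range(limit + 1) if not reachable[v]]
    return max(gaps, default=-1)


if __name__ == "__main__":
    print(frobenius_number([3, 5]))
    print(frobenius_number([6, 9, 20]))
```

This sketch covers only the classical integer problem. The free-monoid version studied in the thesis asks the analogous question for words under concatenation and is not implemented here.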
17	Codes bifixes, combinatoire des mots et systèmes dynamiques symboliques / Bifix codes, Combinatorics on Words and Symbolic Dynamical Systems Dolce, Francesco 13 September 2016 Sets of words of linear complexity play an important role in combinatorics on words and symbolic dynamics. This family of sets includes the sets of factors of Sturmian and Arnoux-Rauzy words, interval exchange sets and primitive morphic sets, that is, sets of factors of fixed points of primitive morphisms. The central topic of this thesis is the study of minimal dynamical systems, equivalently defined as uniformly recurrent sets of words. As a main result, we consider a natural hierarchy of minimal systems containing neutral sets, tree sets and specular sets. Moreover, we connect the minimal systems to the free group using the notions of return words and bases of subgroups of finite index. Symbolic dynamical systems arising from interval exchanges and linear involutions provide geometric examples of sets of this kind. One of the main tools used here is the study of the possible extensions of a word in a set, which allows us to determine properties such as the factor complexity. In this manuscript we define the extension graph, an undirected graph associated with each word $w$ in a set $S$, which describes the possible extensions of $w$ in $S$ on the left and on the right. We present several classes of sets of words defined by the possible shapes that the extension graphs of elements of the set can have. One of the weakest conditions that we study is the neutrality condition: a word $w$ is neutral if the number of pairs $(a, b)$ of letters such that $awb \in S$ is equal to the number of letters $a$ such that $aw \in S$ plus the number of letters $b$ such that $wb \in S$ minus 1. A set in which every nonempty word satisfies the neutrality condition is called a neutral set. A stronger condition is the tree condition: a word $w$ satisfies this condition if its extension graph is both acyclic and connected. A set is called a tree set if every nonempty word satisfies this condition. The family of recurrent tree sets appears as the natural closure of two known families, namely the Arnoux-Rauzy sets and the interval exchange sets. We also introduce specular sets, a remarkable subfamily of the tree sets. These are subsets of groups which form a natural generalization of free groups. These sets of words are an abstract generalization of the natural codings of interval exchanges and of linear involutions. For each class of sets considered in this thesis, we prove several results concerning closure properties (under maximal bifix decoding or under taking derived words), the cardinality of the bifix codes and of the sets of return words in these sets, the connection between return words and bases of the free group, and the connection between bifix codes and subgroups of the free group. Each of these results is proved under the weakest possible assumptions. Bifix codes Combinatorics on words Symbolic dynamical systems
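The neutrality condition stated in the abstract above can be checked mechanically on a finite sample of a factor set. The Python sketch below does this for the Fibonacci word, a Sturmian word. It approximates the set $S$ by the factors of a long prefix, which is an assumption that is only safe for short words $w$ whose extensions all occur inside the prefix. The names `fibonacci_word`, `factors` and `is_neutral` are illustrative and are not taken from the thesis.

```python
def fibonacci_word(n):
    """Prefix of length n of the fixed point of the morphism 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def factors(word, max_len):
    """All factors of word of length at most max_len (including the empty word)."""
    return {
        word[i:i + k]
        for k in range(max_len + 1)
        for i in range(len(word) - k + 1)
    }


def is_neutral(w, S, alphabet):
    """Neutrality: #{(a, b) : awb in S} == #{a : aw in S} + #{b : wb in S} - 1."""
    left = sum(1 for a in alphabet if a + w in S)
    right = sum(1 for b in alphabet if w + b in S)
    both = sum(1 for a in alphabet for b in alphabet if a + w + b in S)
    return both == left + right - 1


if __name__ == "__main__":
    # Factors up to length 7, so that words of length <= 5 have all extensions available.
    S = factors(fibonacci_word(300), 7)
    short_words = [w for w in S if 1 <= len(w) <= 5]
    print(all(is_neutral(w, S, "01") for w in short_words))
```

Only words of length at most 5 are tested because the sample contains factors up to length 7, so both one-letter extensions of each tested word are represented in the sample.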
18	Computing Lyndon Arrays Liut, Michael Adam January 2019 There are at least two reasons to have an efficient algorithm for identifying all maximal Lyndon substrings in a string: first, in 2015, Bannai et al. introduced a linear algorithm to compute all runs in a string that relies on knowing all maximal Lyndon substrings of the input string, and second, in 2017, Franek et al. showed a linear co-equivalence of sorting suffixes and sorting maximal Lyndon substrings of a string (inspired by a novel suffix sorting algorithm of Baier). In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array, which encodes the knowledge of maximal Lyndon substrings of the input string. The overview discussed four different algorithms. Two were known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on iterating Duval’s algorithm for Lyndon factorization, and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying the NSV (Next Smaller Value) algorithm. The overview also discussed a recursive version of Duval’s algorithm with quadratic complexity and an algorithm emulating the NSV approach with a possible O(n log(n)) complexity. The authors at that time did not know of Baier’s algorithm. In 2017, Paracha proposed in her Ph.D. thesis an algorithm for the Lyndon array. The proposed algorithm was interesting because it emulated Farach’s recursive approach for computing suffix trees in linear time and introduced τ-reduction, which might be of independent interest. This was the starting point of this Ph.D. thesis. The primary aims are: (a) developing, analyzing, proving correct, and implementing in C++ a linear algorithm for computing the Lyndon array based on Baier’s suffix sorting; (b) analyzing, proving correct, and implementing in C++ the algorithm proposed by Paracha; and (c) empirically comparing the performance of these two algorithms with the iterative version of Duval’s algorithm. / Dissertation / Doctor of Philosophy (PhD) Combinatorics on Words String Algorithmics Strings Suffix Sorting Lyndon Substrings Lyndon Arrays Maximal Lyndon Substrings
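The scheme mentioned in the abstract above (suffix sorting, inverse suffix array, then Next Smaller Value) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the suffix sort here is a naive comparison sort, not the linear-time suffix sorting the abstract refers to, and it is not the C++ implementation developed in the thesis. The function names are hypothetical. A brute-force version based directly on the definition of a Lyndon word is included for cross-checking.

```python
def lyndon_array(s):
    """Lyndon array via inverse suffix array + Next Smaller Value (NSV).

    result[i] is the length of the longest Lyndon substring starting at i.
    The suffix sort below is naive and stands in for a linear-time sort.
    """
    n = len(s)
    suffix_array = sorted(range(n), key=lambda i: s[i:])
    inverse = [0] * n  # inverse[i] = rank of the suffix starting at i
    for rank, start in enumerate(suffix_array):
        inverse[start] = rank

    result = [0] * n
    stack = []  # positions still waiting for a smaller rank to their right
    for i in range(n):
        while stack and inverse[stack[-1]] > inverse[i]:
            j = stack.pop()
            result[j] = i - j
        stack.append(i)
    for j in stack:  # no smaller rank to the right: extends to the end of s
        result[j] = n - j
    return result


def is_lyndon(w):
    """A nonempty word is Lyndon if it is strictly smaller than each proper suffix."""
    return all(w < w[i:] for i in range(1, len(w)))


def lyndon_array_naive(s):
    """Brute-force Lyndon array straight from the definition, for checking only."""
    n = len(s)
    return [
        max(k for k in range(1, n - i + 1) if is_lyndon(s[i:i + k]))
        for i in range(n)
    ]


if __name__ == "__main__":
    print(lyndon_array("banana"))
    print(lyndon_array("banana") == lyndon_array_naive("banana"))
```

The stack pass is the standard single-scan NSV computation, so the overall cost of this sketch is dominated by the naive suffix sort.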
19	A l'intersection de la combinatoire des mots et de la géométrie discrète : palindromes, symétries et pavages / At the intersection of combinatorics on words and discrete geometry : palindromes, symmetries and tilings Blondin Massé, Alexandre 02 December 2011 In this thesis, we explore different problems at the intersection of combinatorics on words and discrete geometry. First, we study the occurrences of palindromes in codings of rotations, a family of words that includes the famous Sturmian words and Rote sequences. In particular, we show that these words are full, i.e., they realize the maximal palindromic complexity. Next, we consider a new family of words called generalized pseudostandard words, which are generated by an operator called iterated pseudopalindromic closure. We present a generalization of a formula described by Justin which allows one to generate a generalized pseudostandard word in linear (thus optimal) time. The central object, the f-palindrome or pseudopalindrome, is an indicator of the symmetries in geometric objects. In the last chapters, we focus on geometric problems. More precisely, we solve two conjectures of Provençal about tilings by translation, by exploiting the presence of palindromes and local periodicity in boundary words. At the end of several chapters, different open problems and conjectures are briefly presented. Discrete geometry Combinatorics on words Palindromes Tilings Discrete paths Algorithmics
20	On some reversal-invariant complexity measures of multiary words / O nekim reverznoinvarijantnim merama složenosti visearnih reči Ago Balog Kristina 11 September 2020 We focus on two complexity measures of words that are invariant under the operation of reversal of a word: the palindromic defect and the MP-ratio. The palindromic defect of a given word $w$ is defined by $|w| + 1 - |\mathrm{Pal}(w)|$, where $|\mathrm{Pal}(w)|$ denotes the number of palindromic factors of $w$. We study infinite words, to which this definition can be naturally extended. There are many results in the literature about the so-called rich words (words of defect 0), while words of finite positive defect have been studied significantly less; for some time (until recently) it was not known whether there even exist such words that additionally are aperiodic and have their set of factors closed under reversal. Among the first examples that appeared were the so-called highly potential words. In this thesis we present a much more general construction, which gives a wider class of words, named generalized highly potential words, and analyze their significance within the framework of combinatorics on words. The MP-ratio of a given $n$-ary word $w$ is defined as the quotient $|rws| / |w|$, where $r$ and $s$ are words such that the word $rws$ is minimal-palindromic and the length $|r| + |s|$ is the minimum possible; here, an $n$-ary word $w$ is called minimal-palindromic if it does not contain palindromic subwords of length greater than $\lceil |w| / n \rceil$. In the binary case, it was proved that the MP-ratio is well-defined and that it is bounded from above by 4, which is the best possible upper bound. The question of well-definedness of the MP-ratio for larger alphabets was left open. In this thesis we solve that question in the ternary case: we show that the MP-ratio is indeed well-defined in the ternary case, that it is bounded from above by the constant 6, and that this is the best possible upper bound.
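The palindromic defect defined in the abstract above is straightforward to evaluate for a finite word. The Python sketch below does so by brute force, assuming (as is standard for this definition) that $\mathrm{Pal}(w)$ is the set of distinct palindromic factors of $w$ and that it includes the empty word. The function names are illustrative and do not come from the thesis.

```python
def palindromic_factors(w):
    """Set of distinct palindromic factors of w, including the empty word."""
    pals = {""}
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            f = w[i:j]
            if f == f[::-1]:
                pals.add(f)
    return pals


def palindromic_defect(w):
    """Defect |w| + 1 - |Pal(w)|; a word of defect 0 is called rich."""
    return len(w) + 1 - len(palindromic_factors(w))


if __name__ == "__main__":
    print(palindromic_defect("aba"))    # palindromic factors: "", a, b, aba
    print(palindromic_defect("abca"))   # palindromic factors: "", a, b, c
```

Because reversing a word maps its palindromic factors to themselves, `palindromic_defect(w)` and `palindromic_defect(w[::-1])` always agree, which is the reversal invariance the abstract refers to.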
