1 |
On the membership problem for pattern languages and related topicsSchmid, Markus L. January 2012 (has links)
In this thesis, we investigate the complexity of the membership problem for pattern languages. A pattern is a string over the union of the alphabets A and X, where X := {x_1, x_2, x_3, ...} is a countable set of variables and A is a finite alphabet containing terminals (e.g., A := {a, b, c, d}). Every pattern, e.g., p := x_1 x_2 a b x_2 b x_1 c x_2, describes a pattern language, i.e., the set of all words that can be obtained by uniformly substituting the variables in the pattern by arbitrary strings over A. Hence, u := cacaaabaabcaccaa is a word of the pattern language of p, since substituting cac for x_1 and aa for x_2 yields u. On the other hand, there is no way to obtain the word u' := bbbababbacaaba by substituting the occurrences of x_1 and x_2 in p by words over A. The problem to decide for a given pattern q and a given word w whether or not w is in the pattern language of q is called the membership problem for pattern languages. Consequently, (p, u) is a positive instance and (p, u') is a negative instance of the membership problem for pattern languages. For the unrestricted case, i.e., for arbitrary patterns and words, the membership problem is NP-complete. In this thesis, we identify classes of patterns for which the membership problem can be solved efficiently. Our first main result in this regard is that the variable distance, i.e., the maximum number of different variables that separate two consecutive occurrences of the same variable, substantially contributes to the complexity of the membership problem for pattern languages. More precisely, for every class of patterns with a bounded variable distance the membership problem can be solved efficiently. The second main result is that the same holds for every class of patterns with a bounded scope coincidence degree, where the scope coincidence degree is the maximum number of intervals that cover a common position in the pattern, where each interval is given by the leftmost and rightmost occurrence of a variable in the pattern. The proof of our first main result is based on automata theory. More precisely, we introduce a new automata model that is used as an algorithmic framework in order to show that the membership problem for pattern languages can be solved in time that is exponential only in the variable distance of the corresponding pattern. We then take a closer look at this automata model and subject it to a sound theoretical analysis. The second main result is obtained in a completely different way. We encode patterns and words as relational structures and we then reduce the membership problem for pattern languages to the homomorphism problem of relational structures, which allows us to exploit the concept of the treewidth. This approach turns out be successful, and we show that it has potential to identify further classes of patterns with a polynomial time membership problem. Furthermore, we take a closer look at two aspects of pattern languages that are indirectly related to the membership problem. Firstly, we investigate the phenomenon that patterns can describe regular or context-free languages in an unexpected way, which implies that their membership problem can be solved efficiently. In this regard, we present several sufficient conditions and necessary conditions for the regularity and context-freeness of pattern languages. Secondly, we compare pattern languages with languages given by so-called extended regular expressions with backreferences (REGEX). The membership problem for REGEX languages is very important in practice and since REGEX are similar to pattern languages, it might be possible to improve algorithms for the membership problem for REGEX languages by investigating their relationship to patterns. In this regard, we investigate how patterns can be extended in order to describe large classes of REGEX languages.
|
2 |
Random Walks on Free Products of Cyclic GroupsAlharbi, Manal 17 April 2018 (has links)
In this thesis, we investigate examples of random walks on free products of cyclic groups. Free products are groups that contain words constructed by concatenation with possible simplifications[20]. Mairesse in [17] proved that the harmonic measure on the boundary of these random walks has a Markovian Multiplicative structure (this is a class of Markov measures which requires fewer parameters than the usual Markov measures for its description ), and also showed how in the case of the harmonic measure these parameters can be found from Traffic Equations. Then Mairesse and Math ́eus in [20] continued investigation of these random walks and the associated Traffic Equations. They introduced the Stationary Traffic Equations for the situation when the measure is shift-invariant in addition to being μ-invariant. In this thesis, we review these developments as well as explicitly describe several concrete examples of random walks on free products, some of which are new.
|
3 |
Gramáticas livres de contexto adaptativas com verificação de aparência. / Context-free adaptive grammars with appearance checking.Bravo Pariente, César Alberto 22 January 2004 (has links)
Este trabalho descreve o formalismo das gramáticas livres de contexto adaptativas com verificação de aparência. Esses dispositivos gramaticais possuem como núcleo uma gramática livre de contexto subjacente e, como mecanismo de auto-modificação, uma ou várias funções adaptativas que determinam quais produções são aplicáveis em cada passo de uma derivação. A verificação de aparência se refere a uma forma especial de aplicar algumas produções, escolhidas pelo projetista da gramática, sem alterar a forma sentencial nessa aplicação. É provado que esse formalismo tem poder de máquina de Turing demonstrando, em forma construtiva, sua equivalência com quatro formalismos gramaticais baseados em gramáticas livres de contexto com mecanismos de controle, que tem esse poder. São desenvolvidos dois analisadorers para linguagens dependentes de contexto a partir de um desses outros quatro formalismos. Um deles, que é baseado em autômatos-pilha, opera em forma ascendente; o outro, baseado em autômatos finitos adaptativos, opera em forma descendente. / This work introduces and describes the formalism of the context-free adaptive grammar with appearance checking. Such gramatical devices have as its kernel a subjacent context-free grammar and, as mechanism of self-modification, one or several adaptive functions which determines the productions able to be applied at each step of a derivation. The appearance checking refers to a special way to apply some productions, choosen by the designer of the grammar, without changing the sentential form in this application. It is proved that this formalism has Turing Machine power, proving, by construction, its equivalence with four grammatical formalisms based on context-free grammars and with control mechanisms, with such power. Two parsers have been developed for context-dependent languages from one of these four formalisms. One of them is based on stack-automata, and operates in a bottom-up fashion.
|
4 |
Gramáticas livres de contexto adaptativas com verificação de aparência. / Context-free adaptive grammars with appearance checking.César Alberto Bravo Pariente 22 January 2004 (has links)
Este trabalho descreve o formalismo das gramáticas livres de contexto adaptativas com verificação de aparência. Esses dispositivos gramaticais possuem como núcleo uma gramática livre de contexto subjacente e, como mecanismo de auto-modificação, uma ou várias funções adaptativas que determinam quais produções são aplicáveis em cada passo de uma derivação. A verificação de aparência se refere a uma forma especial de aplicar algumas produções, escolhidas pelo projetista da gramática, sem alterar a forma sentencial nessa aplicação. É provado que esse formalismo tem poder de máquina de Turing demonstrando, em forma construtiva, sua equivalência com quatro formalismos gramaticais baseados em gramáticas livres de contexto com mecanismos de controle, que tem esse poder. São desenvolvidos dois analisadorers para linguagens dependentes de contexto a partir de um desses outros quatro formalismos. Um deles, que é baseado em autômatos-pilha, opera em forma ascendente; o outro, baseado em autômatos finitos adaptativos, opera em forma descendente. / This work introduces and describes the formalism of the context-free adaptive grammar with appearance checking. Such gramatical devices have as its kernel a subjacent context-free grammar and, as mechanism of self-modification, one or several adaptive functions which determines the productions able to be applied at each step of a derivation. The appearance checking refers to a special way to apply some productions, choosen by the designer of the grammar, without changing the sentential form in this application. It is proved that this formalism has Turing Machine power, proving, by construction, its equivalence with four grammatical formalisms based on context-free grammars and with control mechanisms, with such power. Two parsers have been developed for context-dependent languages from one of these four formalisms. One of them is based on stack-automata, and operates in a bottom-up fashion.
|
5 |
Algebraic decoder specification: coupling formal-language theory and statistical machine translationBüchse, Matthias 28 January 2015 (has links) (PDF)
The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework where a decoder is specified by an expression built from a fixed set of operations. As yet, this framework accommodates contemporary syntax-based decoders, it spans two levels of abstraction, and, primarily, it encourages mutual stimulation between the theory of weighted tree automata and the application.
|
6 |
Regulated rewriting in formal language theoryTaha, Mohamed A. M. S 03 1900 (has links)
Thesis (MSc (Mathematical Sciences))--University of Stellenbosch, 2008. / Context-free grammars are well-studied and well-behaved in terms of decidability, but many
real-world problems cannot be described with context-free grammars. Grammars with regulated
rewriting are grammars with mechanisms to regulate the applications of rules, so that
certain derivations are avoided. Thus, with context-free rules and regulated rewriting mechanisms,
one can often generate languages that are not context-free.
In this thesis we study grammars with regulated rewriting mechanisms. We consider problems
in which context-free grammars are insufficient and in which more descriptive grammars
are required. We compare bag context grammars with other well-known classes of grammars
with regulated rewriting mechanisms. We also discuss the relation between bag context grammars
and recognizing devices such as counter automata and Petri net automata. We show
that regular bag context grammars can generate any recursively enumerable language. We
reformulate the pumping lemma for random permitting context languages with context-free
rules, as introduced by Ewert and Van der Walt, by using the concept of a string homomorphism.
We conclude the thesis with decidability and complexity properties of grammars with
regulated rewriting.
|
7 |
Aplikace teorie formálních jazyků v oblasti počítačové bezpečnosti / Formal Language Theory Applied to Computer SecurityRegéciová, Dominika January 2018 (has links)
Computer security is and will always be a critical area that affects everyone. Despite all the efforts made to build safer systems and test them, however, new vulnerabilities and vulnerabilities are still emerging and creating the impression of tilting at windmills. Partial justification of the current state, but also possible solutions, brings in many respects an extraordinary view of security through formal language theory. Emphasis should be put on a more responsible approach to the recognition and processing of inputs, which are often the gateway to many attacks. In this paper, we will get acquainted with this trend and its recommendations for development and will then introduce a new method of detecting SQL injection attacks built on its foundations.
|
8 |
Převody mezi regulárními gramatikami, regulárními výrazy a konečnými automaty / Mutual Transformations of Regular Grammars, Regular Expressions and Finite AutomataPodhorský, Michal Unknown Date (has links)
This work describes models of modern language theory - finite automata, regular grammars and regular expressions. A web application converting among these models is implemented.
|
9 |
Bimorphism Machine TranslationQuernheim, Daniel 27 April 2017 (has links) (PDF)
The field of statistical machine translation has made tremendous progress due to the rise of statistical methods, making it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not even need any kind of linguistic annotation, and can infer translation rules from raw, unannotated data. However, most state-of-the art systems do linguistic structure little justice, and moreover many approaches that have been put forward use ad-hoc formalisms and algorithms. This inevitably leads to duplication of effort, and a separation between theoretical researchers and practitioners.
In order to remedy the lack of motivation and rigor, the contributions of this dissertation are threefold:
1. After laying out the historical background and context, as well as the mathematical and linguistic foundations, a rigorous algebraic model of machine translation is put forward. We use regular tree grammars and bimorphisms as the backbone, introducing a modular architecture that allows different input and output formalisms.
2. The challenges of implementing this bimorphism-based model in a machine translation toolkit are then described, explaining in detail the algorithms used for the core components.
3. Finally, experiments where the toolkit is applied on real-world data and used for diagnostic purposes are described. We discuss how we use exact decoding to reason about search errors and model errors in a popular machine translation toolkit, and we compare output formalisms of different generative capacity.
|
10 |
Translation as Linear Transduction : Models and Algorithms for Efficient Learning in Statistical Machine TranslationSaers, Markus January 2011 (has links)
Automatic translation has seen tremendous progress in recent years, mainly thanks to statistical methods applied to large parallel corpora. Transductions represent a principled approach to modeling translation, but existing transduction classes are either not expressive enough to capture structural regularities between natural languages or too complex to support efficient statistical induction on a large scale. A common approach is to severely prune search over a relatively unrestricted space of transduction grammars. These restrictions are often applied at different stages in a pipeline, with the obvious drawback of committing to irrevocable decisions that should not have been made. In this thesis we will instead restrict the space of transduction grammars to a space that is less expressive, but can be efficiently searched. First, the class of linear transductions is defined and characterized. They are generated by linear transduction grammars, which represent the natural bilingual case of linear grammars, as well as the natural linear case of inversion transduction grammars (and higher order syntax-directed transduction grammars). They are recognized by zipper finite-state transducers, which are equivalent to finite-state automata with four tapes. By allowing this extra dimensionality, linear transductions can represent alignments that finite-state transductions cannot, and by keeping the mechanism free of auxiliary storage, they become much more efficient than inversion transductions. Secondly, we present an algorithm for parsing with linear transduction grammars that allows pruning. The pruning scheme imposes no restrictions a priori, but guides the search to potentially interesting parts of the search space in an informed and dynamic way. Being able to parse efficiently allows learning of stochastic linear transduction grammars through expectation maximization. All the above work would be for naught if linear transductions were too poor a reflection of the actual transduction between natural languages. We test this empirically by building systems based on the alignments imposed by the learned grammars. The conclusion is that stochastic linear inversion transduction grammars learned from observed data stand up well to the state of the art.
|
Page generated in 0.3969 seconds