Global ETD Search

1	Multioperator Weighted Monadic Datalog Stüber, Torsten 06 May 2011 (has links) (PDF) In this thesis we will introduce multioperator weighted monadic datalog (mwmd), a formal model for specifying tree series, tree transformations, and tree languages. This model combines aspects of multioperator weighted tree automata (wmta), weighted monadic datalog (wmd), and monadic datalog tree transducers (mdtt). In order to develop a rich theory we will define multiple versions of semantics for mwmd and compare their expressiveness. We will study normal forms and decidability results of mwmd and show (by employing particular semantic domains) that the theory of mwmd subsumes the theory of both wmd and mdtt. We conclude this thesis by showing that mwmd even contain wmta as a syntactic subclass and present results concerning this subclass. Automatentheorie Baumautomaten Baumübersetzer Logik Monadic Datalog Automata Theory Tree Automata Tree Transducer Logic Monadic Datalog ddc:004 rvk:ST 136
2	Algebraic decoder specification: coupling formal-language theory and statistical machine translation Büchse, Matthias 28 January 2015 (has links) (PDF) The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework where a decoder is specified by an expression built from a fixed set of operations. As yet, this framework accommodates contemporary syntax-based decoders, it spans two levels of abstraction, and, primarily, it encourages mutual stimulation between the theory of weighted tree automata and the application. Maschinelles Übersetzen Algebraische Spezifikation Theorie der formalen Sprachen Gewichtete Baumautomaten machine translation algebraic specification formal-language theory weighted tree automata ddc:004 rvk:ST 136 Maschinelle Übersetzung Automatentheorie Algebra
3	Multioperator Weighted Monadic Datalog Stüber, Torsten 10 February 2011 (has links) In this thesis we will introduce multioperator weighted monadic datalog (mwmd), a formal model for specifying tree series, tree transformations, and tree languages. This model combines aspects of multioperator weighted tree automata (wmta), weighted monadic datalog (wmd), and monadic datalog tree transducers (mdtt). In order to develop a rich theory we will define multiple versions of semantics for mwmd and compare their expressiveness. We will study normal forms and decidability results of mwmd and show (by employing particular semantic domains) that the theory of mwmd subsumes the theory of both wmd and mdtt. We conclude this thesis by showing that mwmd even contain wmta as a syntactic subclass and present results concerning this subclass. info:eu-repo/classification/ddc/004 ddc:004
4	Weighted Automata with Storage Herrmann, Luisa 01 March 2021 (has links) In this thesis, we investigate weighted tree automata with storage theoretically. This model generalises finite state automata in three dimensions: (i) from words to trees, (ii) by using an arbitrary storage type in addition to a finite-state control, and (iii) by considering languages in a quantitative setting using a weight structure. info:eu-repo/classification/ddc/004 ddc:004
5	Quantitative Variants of Language Equations and their Applications to Description Logics Marantidis, Pavlos 10 October 2019 (has links) Unification in description logics (DLs) has been introduced as a novel inference service that can be used to detect redundancies in ontologies, by finding different concepts that may potentially stand for the same intuitive notion. Together with the special case of matching, they were first investigated in detail for the DL FL0, where these problems can be reduced to solving certain language equations. In this thesis, we extend this service in two directions. In order to increase the recall of this method for finding redundancies, we introduce and investigate the notion of approximate unification, which basically finds pairs of concepts that “almost” unify, in order to account for potential small modelling errors. The meaning of “almost” is formalized using distance measures between concepts. We show that approximate unification in FL0 can be reduced to approximately solving language equations, and devise algorithms for solving the latter problem for particular distance measures. Furthermore, we make a first step towards integrating background knowledge, formulated in so-called TBoxes, by investigating the special case of matching in the presence of TBoxes of different forms. We acquire a tight complexity bound for the general case, while we prove that the problem becomes easier in a restricted setting. To achieve these bounds, we take advantage of an equivalence characterization of FL0 concepts that is based on formal languages. In addition, we incorporate TBoxes in computing concept distances. Even though our results on the approximate setting cannot deal with TBoxes yet, we prepare the framework that future research can build on. Before we journey to the technical details of the above investigations, we showcase our program in the simpler setting of the equational theory ACUI, where we are able to also combine the two extensions. In the course of studying the above problems, we make heavy use of automata theory, where we also derive novel results that could be of independent interest. info:eu-repo/classification/ddc/004 ddc:004
6	A Formal View on Training of Weighted Tree Automata by Likelihood-Driven State Splitting and Merging Dietze, Toni 03 June 2019 (has links) The use of computers and algorithms to deal with human language, in both spoken and written form, is summarized by the term natural language processing (nlp). Modeling language in a way that is suitable for computers plays an important role in nlp. One idea is to use formalisms from theoretical computer science for that purpose. For example, one can try to find an automaton to capture the valid written sentences of a language. Finding such an automaton by way of examples is called training. In this work, we also consider the structure of sentences by making use of trees. We use weighted tree automata (wta) in order to deal with such tree structures. Those devices assign weights to trees in order to, for example, distinguish between good and bad structures. The well-known expectation-maximization algorithm can be used to train the weights for a wta while the state behavior stays fixed. As a way to adapt the state behavior of a wta, state splitting, i.e. dividing a state into several new states, and state merging, i.e. replacing several states by a single new state, can be used. State splitting, state merging, and the expectation maximization algorithm already were combined into the state splitting and merging algorithm, which was successfully applied in practice. In our work, we formalized this approach in order to show properties of the algorithm. We also examined a new approach – the count-based state merging algorithm – which exclusively relies on state merging. When dealing with trees, another important tool is binarization. A binarization is a strategy to code arbitrary trees by binary trees. For each of three different binarizations we showed that wta together with the binarization are as powerful as weighted unranked tree automata (wuta). We also showed that this is still true if only probabilistic wta and probabilistic wuta are considered.:How to Read This Thesis 1. Introduction 1.1. The Contributions and the Structure of This Work 2. Preliminaries 2.1. Sets, Relations, Functions, Families, and Extrema 2.2. Algebraic Structures 2.3. Formal Languages 3. Language Formalisms 3.1. Context-Free Grammars (CFGs) 3.2. Context-Free Grammars with Latent Annotations (CFG-LAs) 3.3. Weighted Tree Automata (WTAs) 3.4. Equivalences of WCFG-LAs and WTAs 4. Training of WTAs 4.1. Probability Distributions 4.2. Maximum Likelihood Estimation 4.3. Probabilities and WTAs 4.4. The EM Algorithm for WTAs 4.5. Inside and Outside Weights 4.6. Adaption of the Estimation of Corazza and Satta [CS07] to WTAs 5. State Splitting and Merging 5.1. State Splitting and Merging for Weighted Tree Automata 5.1.1. Splitting Weights and Probabilities 5.1.2. Merging Probabilities 5.2. The State Splitting and Merging Algorithm 5.2.1. Finding a Good π-Distributor 5.2.2. Notes About the Berkeley Parser 5.3. Conclusion and Further Research 6. Count-Based State Merging 6.1. Preliminaries 6.2. The Likelihood of the Maximum Likelihood Estimate and Its Behavior While Merging 6.3. The Count-Based State Merging Algorithm 6.3.1. Further Adjustments for Practical Implementations 6.4. Implementation of Count-Based State Merging 6.5. Experiments with Artificial Automata and Corpora 6.5.1. The Artificial Automata 6.5.2. Results 6.6. Experiments with the Penn Treebank 6.7. Comparison to the Approach of Carrasco, Oncina, and Calera-Rubio [COC01] 6.8. Conclusion and Further Research 7. Binarization 7.1. Preliminaries 7.2. Relating WSTAs and WUTAs via Binarizations 7.2.1. Left-Branching Binarization 7.2.2. Right-Branching Binarization 7.2.3. Mixed Binarization 7.3. The Probabilistic Case 7.3.1. Additional Preliminaries About WSAs 7.3.2. Constructing an Out-Probabilistic WSA from a Converging WSA 7.3.3. Binarization and Probabilistic Tree Automata 7.4. Connection to the Training Methods in Previous Chapters 7.5. Conclusion and Further Research A. Proofs for Preliminaries B. Proofs for Training of WTAs C. Proofs for State Splitting and Merging D. Proofs for Count-Based State Merging Bibliography List of Algorithms List of Figures List of Tables Index Table of Variable Names info:eu-repo/classification/ddc/004 ddc:004
7	Bimorphism Machine Translation Quernheim, Daniel 27 April 2017 (has links) (PDF) The field of statistical machine translation has made tremendous progress due to the rise of statistical methods, making it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not even need any kind of linguistic annotation, and can infer translation rules from raw, unannotated data. However, most state-of-the art systems do linguistic structure little justice, and moreover many approaches that have been put forward use ad-hoc formalisms and algorithms. This inevitably leads to duplication of effort, and a separation between theoretical researchers and practitioners. In order to remedy the lack of motivation and rigor, the contributions of this dissertation are threefold: 1. After laying out the historical background and context, as well as the mathematical and linguistic foundations, a rigorous algebraic model of machine translation is put forward. We use regular tree grammars and bimorphisms as the backbone, introducing a modular architecture that allows different input and output formalisms. 2. The challenges of implementing this bimorphism-based model in a machine translation toolkit are then described, explaining in detail the algorithms used for the core components. 3. Finally, experiments where the toolkit is applied on real-world data and used for diagnostic purposes are described. We discuss how we use exact decoding to reason about search errors and model errors in a popular machine translation toolkit, and we compare output formalisms of different generative capacity. Computerlinguistik Maschinelle Übersetzung Bimorphismen synchrone Grammatiken Baumautomaten Formale Sprachen computational linguistics machine translation bimorphisms synchronous grammars tree automata formal language theory ddc:004 ddc:410 Computerlinguistik Maschinelle Übersetzung Automatentheorie
8	Weighted Unranked Tree Automata over Tree Valuation Monoids Götze, Doreen 16 March 2017 (has links) (PDF) Quantitative aspects of systems, like the maximal consumption of resources, can be modeled by weighted automata. The usual approach is to weight transitions with elements of a semiring and to define the behavior of the weighted automaton by mul- tiplying the transition weights along a run. In this thesis, we define and investigate a new class of weighted automata over unranked trees which are defined over valuation monoids. By turning to valuation monoids we use a more general cost model: the weight of a run is now determined by a global valuation function. Besides the binary cost functions implementable via semirings, valuation functions enable us to cope with average and discounting. We first investigate the supports of weighted unranked tree automata over valuation monoids, i.e., the languages of all words which are evalu- ated to a non-zero value. We will furthermore consider the support of several other weighted automata models over different structures, like words and ranked trees. Next we prove a Nivat-like theorem for the new weighted unranked tree automata. More- over, we give a logical characterization for them. We show that weighted unranked tree automata are expressively equivalent to a weighted MSO logic for unranked trees. This solves an open problem posed by Droste and Vogler. Finally, we present a Kleene- type result for weighted ranked tree automata over valuation monoids. Gewichtete Baumautomaten Gewichtete MSO Logik Gewichtete rationale Ausdrücke Bäume ohne Rang Bäume mit Rang Baumbewertungmonoide weighted tree automata weighed MSO logic weighted rational expressions unranked trees ranked trees valuation monoids ddc:000
9	Weighted Unranked Tree Automata over Tree Valuation Monoids Götze, Doreen 14 March 2017 (has links) Quantitative aspects of systems, like the maximal consumption of resources, can be modeled by weighted automata. The usual approach is to weight transitions with elements of a semiring and to define the behavior of the weighted automaton by mul- tiplying the transition weights along a run. In this thesis, we define and investigate a new class of weighted automata over unranked trees which are defined over valuation monoids. By turning to valuation monoids we use a more general cost model: the weight of a run is now determined by a global valuation function. Besides the binary cost functions implementable via semirings, valuation functions enable us to cope with average and discounting. We first investigate the supports of weighted unranked tree automata over valuation monoids, i.e., the languages of all words which are evalu- ated to a non-zero value. We will furthermore consider the support of several other weighted automata models over different structures, like words and ranked trees. Next we prove a Nivat-like theorem for the new weighted unranked tree automata. More- over, we give a logical characterization for them. We show that weighted unranked tree automata are expressively equivalent to a weighted MSO logic for unranked trees. This solves an open problem posed by Droste and Vogler. Finally, we present a Kleene- type result for weighted ranked tree automata over valuation monoids. info:eu-repo/classification/ddc/000 ddc:000
10	Bimorphism Machine Translation Quernheim, Daniel 10 April 2017 (has links) The field of statistical machine translation has made tremendous progress due to the rise of statistical methods, making it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not even need any kind of linguistic annotation, and can infer translation rules from raw, unannotated data. However, most state-of-the art systems do linguistic structure little justice, and moreover many approaches that have been put forward use ad-hoc formalisms and algorithms. This inevitably leads to duplication of effort, and a separation between theoretical researchers and practitioners. In order to remedy the lack of motivation and rigor, the contributions of this dissertation are threefold: 1. After laying out the historical background and context, as well as the mathematical and linguistic foundations, a rigorous algebraic model of machine translation is put forward. We use regular tree grammars and bimorphisms as the backbone, introducing a modular architecture that allows different input and output formalisms. 2. The challenges of implementing this bimorphism-based model in a machine translation toolkit are then described, explaining in detail the algorithms used for the core components. 3. Finally, experiments where the toolkit is applied on real-world data and used for diagnostic purposes are described. We discuss how we use exact decoding to reason about search errors and model errors in a popular machine translation toolkit, and we compare output formalisms of different generative capacity. info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/410 ddc:410

Search results