21 |
Towards semantic language processing / Mot semantisk språkbearbetning
Jonsson, Anna, January 2018
The overall goal of the field of natural language processing is to facilitate the communication between humans and computers, and to help humans with natural language problems such as translation. In this thesis, we focus on semantic language processing. Modelling semantics – the meaning of natural language – requires both a structure to hold the semantic information and a device that can enforce rules on the structure to ensure well-formed semantics while not being too computationally heavy. The devices used in natural language processing are preferably weighted to allow for comparison of the alternative semantic interpretations output by a device. The structure employed here is the abstract meaning representation (AMR). We show that AMRs representing well-formed semantics can be generated while leaving out AMRs that are not semantically well-formed. For this purpose, we use a type of graph grammar called contextual hyperedge replacement grammar (CHRG). Moreover, we argue that a more well-known subclass of CHRG – the hyperedge replacement grammar (HRG) – is not powerful enough for AMR generation. This is due to the limitation of HRG when it comes to handling co-references, which in turn stems from the fact that HRGs only generate graphs of bounded treewidth. Furthermore, we also address the N best problem, which is as follows: Given a weighted device, return the N best (here: smallest-weighted, or more intuitively, smallest-errored) structures. Our goal is to solve the N best problem for devices capable of expressing sophisticated forms of semantic representations such as CHRGs. Here, however, we merely take a first step by developing methods for solving the N best problem for weighted tree automata and some types of weighted acyclic hypergraphs.
22 |
Efektivní algoritmy pro stromové automaty / Efficient Algorithms for Tree Automata
Valeš, Ondřej, January 2019
In this work, a novel algorithm for testing language equivalence and inclusion on tree automata is proposed and implemented as a module in the VATA library. First, existing approaches to equivalence and inclusion testing on both word and tree automata are examined. These existing approaches are then modified to create a bisimulation up-to congruence algorithm for tree automata, and a formal proof of the soundness of the new algorithm is provided. The efficiency of this new approach is compared with existing language equivalence and inclusion testing methods for tree automata, showing that the performance of our algorithm on hard cases is often superior.
23 |
Statická detekce malware nad LLVM IR / Static Behavioral Malware Detection over LLVM IR
Surovič, Marek, January 2016
This thesis deals with methods for behavioral malware detection that use techniques of formal analysis and verification. The approach is based on inferring tree automata from system call dependency graphs, which are obtained by static analysis of LLVM IR. As part of the thesis, a prototype detector is implemented that uses the LLVM compiler infrastructure. For the experimental evaluation of the detector, a C/C++ compiler is used that is able to generate malware mutations by means of obfuscating transformations. The results of preliminary experiments and possible future extensions of the detector are discussed at the end of the thesis.
24 |
Verifikace Programů se složitými datovými strukturami / Harnessing Forest Automata for Verification of Heap Manipulating Programs
Šimáček, Jiří, Unknown Date
This thesis deals with the verification of infinite-state systems, specifically with the verification of programs that use complex dynamically linked data structures. Many different approaches to this problem have appeared in the past, but none of them has so far been robust enough to work in all cases encountered in practice. In an effort to provide a higher level of automation while also enabling the verification of programs with more complex data structures, we propose a new approach that is based mainly on the use of tree automata but is also partly inspired by ideas taken from methods based on separation logic. In addition, we present several improvements in the implementation of operations over tree automata that are crucial for the practical applicability of the proposed verification method. Specifically, we present an optimized algorithm for computing simulations on labelled transition systems, which allows simulations for tree automata to be computed more efficiently. We further present a new algorithm for testing inclusion of tree automata, together with experiments showing that this algorithm outperforms other existing approaches.
25 |
Multioperator Weighted Monadic Datalog
Stüber, Torsten, 10 February 2011
In this thesis we will introduce multioperator weighted monadic datalog (mwmd), a formal model for specifying tree series, tree transformations, and tree languages. This model combines aspects of multioperator weighted tree automata (wmta), weighted monadic datalog (wmd), and monadic datalog tree transducers (mdtt). In order to develop a rich theory we will define multiple versions of semantics for mwmd and compare their expressiveness. We will study normal forms and decidability results of mwmd and show (by employing particular semantic domains) that the theory of mwmd subsumes the theory of both wmd and mdtt. We conclude this thesis by showing that mwmd even contains wmta as a syntactic subclass and present results concerning this subclass.
26 |
Composition of Tree Series Transformations
Maletti, Andreas, 12 November 2012
Tree series transformations computed by bottom-up and top-down tree series transducers are called bottom-up and top-down tree series transformations, respectively. (Functional) compositions of such transformations are investigated. It turns out that the class of bottom-up tree series transformations over a commutative and complete semiring is closed under left-composition with linear bottom-up tree series transformations and right-composition with boolean deterministic bottom-up tree series transformations. Moreover, it is shown that the class of top-down tree series transformations over a commutative and complete semiring is closed under right-composition with linear, nondeleting top-down tree series transformations. Finally, the composition of a boolean, deterministic, total top-down tree series transformation with a linear top-down tree series transformation is shown to be a top-down tree series transformation.
27 |
Weighted Automata with Storage
Herrmann, Luisa, 01 March 2021
In this thesis, we investigate weighted tree automata with storage theoretically. This model generalises finite state automata in three dimensions: (i) from words to trees, (ii) by using an arbitrary storage type in addition to a finite-state control, and (iii) by considering languages in a quantitative setting using a weight structure.
28 |
Quantitative Variants of Language Equations and their Applications to Description Logics
Marantidis, Pavlos, 10 October 2019
Unification in description logics (DLs) has been introduced as a novel inference service that can be used to detect redundancies in ontologies, by finding different concepts that may potentially stand for the same intuitive notion. Unification and its special case, matching, were first investigated in detail for the DL FL0, where these problems can be reduced to solving certain language equations.
In this thesis, we extend this service in two directions. In order to increase the recall of this method for finding redundancies, we introduce and investigate the notion of approximate unification, which basically finds pairs of concepts that “almost” unify, in order to account for potential small modelling errors. The meaning of “almost” is formalized using distance measures between concepts. We show that approximate unification in FL0 can be reduced to approximately solving language equations, and devise algorithms for solving the latter problem for particular distance measures. Furthermore, we make a first step towards integrating background knowledge, formulated in so-called TBoxes, by investigating the special case of matching in the presence of TBoxes of different forms. We acquire a tight complexity bound for the general case, while we prove that the problem becomes easier in a restricted setting. To achieve these bounds, we take advantage of an equivalence characterization of FL0 concepts that is based on formal languages. In addition, we incorporate TBoxes in computing concept distances. Even though our results on the approximate setting cannot deal with TBoxes yet, we prepare the framework that future research can build on. Before we journey to the technical details of the above investigations, we showcase our program in the simpler setting of the equational theory ACUI, where we are able to also combine the two extensions. In the course of studying the above problems, we make heavy use of automata theory, where we also derive novel results that could be of independent interest.
29 |
Expressiveness and Decidability of Weighted Automata and Weighted Logics
Paul, Erik, 19 October 2020
Automata theory, one of the main branches of theoretical computer science, established its roots in the middle of the 20th century. One of its most fundamental concepts is that of a finite automaton, a basic yet powerful model of computation. In essence, finite automata provide a method to finitely represent possibly infinite sets of strings. Such a set of strings is also called a language, and the languages which can be described by finite automata are known as regular languages. Owing to their versatility, regular languages have received a great deal of attention over the years. Other formalisms were shown to be expressively equivalent to finite automata, most notably regular grammars, regular expressions, and monadic second order (MSO) logic. To increase expressiveness, the fundamental idea underlying finite automata and regular languages was also extended to describe not only languages of strings, or words, but also of infinite words by Büchi and Muller, finite trees by Doner and Thatcher and Wright, infinite trees by Rabin, nested words by Alur and Madhusudan, and pictures by Blum and Hewitt, just to name a few examples. In a parallel line of development, Schützenberger introduced weighted automata which allow the description of quantitative properties of regular languages. In subsequent works, many of these descriptive formalisms and extensions were combined and their relationships investigated. For example, weighted regular expressions and weighted logics have been developed as well as regular expressions for trees and pictures, regular grammars for trees, pictures, and nested words, and logical characterizations for regular languages of trees, pictures, and nested words.
In this work, we focus on two of these extensions and their relationship, namely weighted automata and weighted logics. Just as the classical Büchi-Elgot-Trakhtenbrot Theorem established the coincidence of regular languages with languages definable in monadic second order logic, weighted automata have been shown to be expressively equivalent to a specific fragment of a weighted monadic second order logic by Droste and Gastin. We explore several aspects of weighted automata and of this weighted logic. More precisely, the thesis considers the following topics.
In the first part, we extend the classical Feferman-Vaught Theorem to the weighted setting. The Feferman-Vaught Theorem is one of the fundamental theorems in model theory. The theorem describes how the computation of the truth value of a first order sentence in a generalized product of relational structures can be reduced to the computation of truth values of first order sentences in the contributing structures and the evaluation of an MSO sentence in the index structure. The theorem itself has a long-standing history. It builds upon work of Mostowski, and was shown in subsequent works to hold true for MSO logic. Here, we show that under appropriate assumptions, the Feferman-Vaught Theorem also holds true for a weighted MSO logic with arbitrary commutative semirings as weight structure.
In the second part, we lift four decidability results from max-plus word automata to max-plus tree automata. Max-plus word and tree automata are weighted automata over the max-plus semiring and assign real numbers to words or trees, respectively. We show that, like for max-plus word automata, the equivalence, unambiguity, and sequentiality problems are decidable for finitely ambiguous max-plus tree automata, and that the finite sequentiality problem is decidable for unambiguous max-plus tree automata.
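As a rough illustration of what a max-plus tree automaton computes, the following sketch evaluates a tree bottom-up: the weight of a run is the sum of its transition weights, and the weight of the tree is the maximum over all accepting runs. The toy automaton (the states `q` and `p`, the symbols, and all weights) is invented for this example and is not taken from the thesis.

```python
from itertools import product

NEG_INF = float("-inf")  # neutral element of max, i.e. the max-plus "zero"

# A tree is a (symbol, children) pair, where children is a tuple of trees.
# Hypothetical toy automaton; every state, symbol, and weight is an assumption.
# TRANSITIONS[(symbol, child_states)] maps each target state to a weight.
TRANSITIONS = {
    ("a", ()): {"q": 1.0, "p": 0.0},
    ("f", ("q", "q")): {"q": 2.0},
    ("f", ("p", "p")): {"p": 5.0},
}
FINAL = {"q": 0.0, "p": 0.0}  # final weight of each accepting state


def state_weights(tree):
    """Map each reachable state to the best (maximal) run weight on tree."""
    symbol, children = tree
    child_tables = [state_weights(c) for c in children]
    result = {}
    # Try every combination of states for the children.
    for combo in product(*(t.items() for t in child_tables)):
        child_states = tuple(state for state, _ in combo)
        below = sum(weight for _, weight in combo)  # max-plus "product" is +
        targets = TRANSITIONS.get((symbol, child_states), {})
        for target, weight in targets.items():
            # max-plus "sum" is max
            result[target] = max(result.get(target, NEG_INF), below + weight)
    return result


def tree_weight(tree):
    """Weight of tree: maximum over accepting runs, or -inf if there is none."""
    table = state_weights(tree)
    return max(
        (w + FINAL[s] for s, w in table.items() if s in FINAL),
        default=NEG_INF,
    )
```

The automaton above is ambiguous on purpose: the tree `f(a, a)` has one run through `q` and one through `p`, and `tree_weight` keeps the larger of the two. Enumerating all state combinations is exponential in the arity and is only meant to make the semantics visible.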
In the last part, we develop a logic which is expressively equivalent to quantitative monitor automata. Introduced very recently by Chatterjee, Henzinger, and Otop, quantitative monitor automata are an automaton model operating on infinite words. Quantitative monitor automata possess several interesting features. They are expressively equivalent to a subclass of nested weighted automata, an automaton model which for many valuation functions has decidable emptiness and universality problems. Also, quantitative monitor automata are more expressive than weighted Büchi-automata and their extension with valuation functions. We introduce a new logic which we call monitor logic and show that it is expressively equivalent to quantitative monitor automata.
30 |
A Formal View on Training of Weighted Tree Automata by Likelihood-Driven State Splitting and Merging
Dietze, Toni, 03 June 2019
The use of computers and algorithms to deal with human language, in both spoken and written form, is summarized by the term natural language processing (nlp). Modeling language in a way that is suitable for computers plays an important role in nlp. One idea is to use formalisms from theoretical computer science for that purpose. For example, one can try to find an automaton to capture the valid written sentences of a language. Finding such an automaton by way of examples is called training.
In this work, we also consider the structure of sentences by making use of trees. We use weighted tree automata (wta) in order to deal with such tree structures. Those devices assign weights to trees in order to, for example, distinguish between good and bad structures. The well-known expectation-maximization algorithm can be used to train the weights for a wta while the state behavior stays fixed. As a way to adapt the state behavior of a wta, state splitting, i.e. dividing a state into several new states, and state merging, i.e. replacing several states by a single new state, can be used. State splitting, state merging, and the expectation-maximization algorithm were already combined into the state splitting and merging algorithm, which was successfully applied in practice. In our work, we formalized this approach in order to show properties of the algorithm. We also examined a new approach – the count-based state merging algorithm – which exclusively relies on state merging.
When dealing with trees, another important tool is binarization. A binarization is a strategy to code arbitrary trees by binary trees. For each of three different binarizations we showed that wta together with the binarization are as powerful as weighted unranked tree automata (wuta). We also showed that this is still true if only probabilistic wta and probabilistic wuta are considered.
How to Read This Thesis
1. Introduction
1.1. The Contributions and the Structure of This Work
2. Preliminaries
2.1. Sets, Relations, Functions, Families, and Extrema
2.2. Algebraic Structures
2.3. Formal Languages
3. Language Formalisms
3.1. Context-Free Grammars (CFGs)
3.2. Context-Free Grammars with Latent Annotations (CFG-LAs)
3.3. Weighted Tree Automata (WTAs)
3.4. Equivalences of WCFG-LAs and WTAs
4. Training of WTAs
4.1. Probability Distributions
4.2. Maximum Likelihood Estimation
4.3. Probabilities and WTAs
4.4. The EM Algorithm for WTAs
4.5. Inside and Outside Weights
4.6. Adaption of the Estimation of Corazza and Satta [CS07] to WTAs
5. State Splitting and Merging
5.1. State Splitting and Merging for Weighted Tree Automata
5.1.1. Splitting Weights and Probabilities
5.1.2. Merging Probabilities
5.2. The State Splitting and Merging Algorithm
5.2.1. Finding a Good π-Distributor
5.2.2. Notes About the Berkeley Parser
5.3. Conclusion and Further Research
6. Count-Based State Merging
6.1. Preliminaries
6.2. The Likelihood of the Maximum Likelihood Estimate and Its Behavior While Merging
6.3. The Count-Based State Merging Algorithm
6.3.1. Further Adjustments for Practical Implementations
6.4. Implementation of Count-Based State Merging
6.5. Experiments with Artificial Automata and Corpora
6.5.1. The Artificial Automata
6.5.2. Results
6.6. Experiments with the Penn Treebank
6.7. Comparison to the Approach of Carrasco, Oncina, and Calera-Rubio [COC01]
6.8. Conclusion and Further Research
7. Binarization
7.1. Preliminaries
7.2. Relating WSTAs and WUTAs via Binarizations
7.2.1. Left-Branching Binarization
7.2.2. Right-Branching Binarization
7.2.3. Mixed Binarization
7.3. The Probabilistic Case
7.3.1. Additional Preliminaries About WSAs
7.3.2. Constructing an Out-Probabilistic WSA from a Converging WSA
7.3.3. Binarization and Probabilistic Tree Automata
7.4. Connection to the Training Methods in Previous Chapters
7.5. Conclusion and Further Research
A. Proofs for Preliminaries
B. Proofs for Training of WTAs
C. Proofs for State Splitting and Merging
D. Proofs for Count-Based State Merging
List of Algorithms
List of Figures
List of Tables
Table of Variable Names
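The abstract above describes a binarization as a strategy to code arbitrary trees by binary trees. The following sketch shows one common right-branching scheme, assuming auxiliary nodes are marked by appending `|` to the parent label; the thesis's own definitions of left-branching, right-branching, and mixed binarization may differ in detail, and the function names here are hypothetical.

```python
AUX = "|"  # assumed marker for auxiliary nodes; original labels must not end with it


def binarize(tree):
    """Encode an unranked (label, children) tree with at most two children per node."""
    label, children = tree
    kids = [binarize(c) for c in children]
    if len(kids) <= 2:
        return (label, tuple(kids))
    aux = label + AUX
    # Fold the children from the right: A(t1, A|(t2, A|(t3, t4))).
    right = (aux, (kids[-2], kids[-1]))
    for kid in reversed(kids[1:-2]):
        right = (aux, (kid, right))
    return (label, (kids[0], right))


def unbinarize(tree):
    """Invert binarize by splicing the children of auxiliary nodes into the parent."""
    label, children = tree
    aux = label if label.endswith(AUX) else label + AUX
    kids = []
    for child in children:
        child_label, child_kids = unbinarize(child)
        if child_label == aux:
            kids.extend(child_kids)  # auxiliary node: lift its children
        else:
            kids.append((child_label, child_kids))
    return (label, tuple(kids))
```

Because the encoding is invertible, `unbinarize(binarize(t)) == t` holds for every tree `t` whose labels do not end in the marker, so nothing is lost by working on the binary side.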