• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 36
  • 3
  • 2
  • 1
  • Tagged with
  • 43
  • 43
  • 17
  • 14
  • 13
  • 13
  • 12
  • 12
  • 12
  • 12
  • 11
  • 11
  • 11
  • 10
  • 8
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Bimorphism Machine Translation

Quernheim, Daniel 27 April 2017 (has links) (PDF)
The field of statistical machine translation has made tremendous progress due to the rise of statistical methods, making it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not even need any kind of linguistic annotation, and can infer translation rules from raw, unannotated data. However, most state-of-the art systems do linguistic structure little justice, and moreover many approaches that have been put forward use ad-hoc formalisms and algorithms. This inevitably leads to duplication of effort, and a separation between theoretical researchers and practitioners. In order to remedy the lack of motivation and rigor, the contributions of this dissertation are threefold: 1. After laying out the historical background and context, as well as the mathematical and linguistic foundations, a rigorous algebraic model of machine translation is put forward. We use regular tree grammars and bimorphisms as the backbone, introducing a modular architecture that allows different input and output formalisms. 2. The challenges of implementing this bimorphism-based model in a machine translation toolkit are then described, explaining in detail the algorithms used for the core components. 3. Finally, experiments where the toolkit is applied on real-world data and used for diagnostic purposes are described. We discuss how we use exact decoding to reason about search errors and model errors in a popular machine translation toolkit, and we compare output formalisms of different generative capacity.
32

Weighted Unranked Tree Automata over Tree Valuation Monoids

Götze, Doreen 16 March 2017 (has links) (PDF)
Quantitative aspects of systems, like the maximal consumption of resources, can be modeled by weighted automata. The usual approach is to weight transitions with elements of a semiring and to define the behavior of the weighted automaton by mul- tiplying the transition weights along a run. In this thesis, we define and investigate a new class of weighted automata over unranked trees which are defined over valuation monoids. By turning to valuation monoids we use a more general cost model: the weight of a run is now determined by a global valuation function. Besides the binary cost functions implementable via semirings, valuation functions enable us to cope with average and discounting. We first investigate the supports of weighted unranked tree automata over valuation monoids, i.e., the languages of all words which are evalu- ated to a non-zero value. We will furthermore consider the support of several other weighted automata models over different structures, like words and ranked trees. Next we prove a Nivat-like theorem for the new weighted unranked tree automata. More- over, we give a logical characterization for them. We show that weighted unranked tree automata are expressively equivalent to a weighted MSO logic for unranked trees. This solves an open problem posed by Droste and Vogler. Finally, we present a Kleene- type result for weighted ranked tree automata over valuation monoids.
33

Complexity and expressiveness for formal structures in Natural Language Processing

Ericson, Petter January 2017 (has links)
The formalized and algorithmic study of human language within the field of Natural Language Processing (NLP) has motivated much theoretical work in the related field of formal languages, in particular the subfields of grammar and automata theory. Motivated and informed by NLP, the papers in this thesis explore the connections between expressibility – that is, the ability for a formal system to define complex sets of objects – and algorithmic complexity – that is, the varying amount of effort required to analyse and utilise such systems. Our research studies formal systems working not just on strings, but on more complex structures such as trees and graphs, in particular syntax trees and semantic graphs. The field of mildly context-sensitive languages concerns attempts to find a useful class of formal languages between the context-free and context-sensitive. We study formalisms defining two candidates for this class; tree-adjoining languages and the languages defined by linear context-free rewriting systems. For the former, we specifically investigate the tree languages, and define a subclass and tree automaton with linear parsing complexity. For the latter, we use the framework of parameterized complexity theory to investigate more deeply the related parsing problems, as well as the connections between various formalisms defining the class. The field of semantic modelling aims towards formally and accurately modelling not only the syntax of natural language statements, but also the meaning. In particular, recent work in semantic graphs motivates our study of graph grammars and graph parsing. To the best of our knowledge, the formalism presented in Paper III of this thesis is the first graph grammar where the uniform parsing problem has polynomial parsing complexity, even for input graphs of unbounded node degree.
34

Dynamique symbolique des systèmes 2D et des arbres infinis / Symbolic dynamics on multidimensional systems and infinite trees

Aubrun, Nathalie 22 June 2011 (has links)
Cette thèse est consacrée à l'étude des décalages, ou encore systèmes dynamiques symboliques, définis sur certains monoïdes finiment présentés, $Z^d$ d'une part et les arbres d'autre part. Le principal résultat concernant les décalages multidimensionnels établit que tout décalage effectif de dimension d est obtenu par facteur et sous-action projective d'un décalage de type fini de dimension d+1. De ce résultat nous déduisons que les décalages S-adiques multidimensionnels donnés par une suite effective de substitutions sont sofiques. Sur les décalages d'arbres nous montrons un théorème de décomposition, qui permet d'écrire une conjugaison entre deux décalages d'arbres quelconques comme une suite finie d'opérations élémentaires, les fusions entrantes et les éclatements entrants. De ce théorème, associé à la commutation des fusions entrantes, nous déduisons la décidabilité du problème de conjugaison entre deux décalages d'arbres de type fini. Nous nous intéressons ensuite à la classe des décalages d'arbres sofiques, qui sont exactement ceux reconnus par des automates d'arbres montants dans lesquels tous les états sont à la fois initiaux et finaux. Nous montrons l'existence d'un unique automate d'arbres déterministe, réduit, irréductible et synchronisé qui reconnaît un décalage d'arbres sofique. Enfin nous montrons que l'appartenance à la sous-classe des décalages d'arbres AFT est décidable / This thesis is devoted to the study of subshifts, or symbolic dynamical systems, defined on some finitely presented monoids like $Z^d$ or the infinite binary tree. The main result concerning multidimensional subshifts establishes that any effective subshift of dimension d can be obtained by factor map and projective subaction of a subshift of finite type of dimension d+1. This result has many applications, and in particular we prove that multidimensional effective S-adic subshifts are sofic. On tree-shifts we prove a decompositiontheorem, which implies that the conjugacy problem between two tree-shifts of finite type is decidable. We then investigate the class of sofic tree-shifts that are exactly those recocognized by tree automata. We prove that any sofic tree-shift has a unique deterministic, reduced, irreducible and synchronized tree automaton that recognized it. Finally we prove that it is decidable wether a sofic tree-shift belong to the sub-class of AFT tree-shifts
35

Efektivní knihovna pro práci s konečnými stromovými automaty / An Efficient Finite Tree Automata Library

Lengál, Ondřej January 2010 (has links)
Numerous computer systems use dynamic control and data structures of unbounded size. These data structures have often the character of trees or they can be encoded as trees with some additional pointers. This is exploited by some currently intensively studied techniques of formal verification that represent an infinite number of states using a finite tree automaton. However, currently there is no tree automata library implementation that would provide an efficient and flexible support for such methods. Thus the aim of this Mas- ter's Thesis is to provide such a library. The present paper first describes the theoretical background of finite tree automata and regular tree languages. Then it surveys the cur- rent implementations of tree automata libraries and studies various verification techniques, outlining requirements for the library. Representation of a finite tree automaton and algo- rithms that perform standard language operations on this representation are proposed in the next part, which is followed by description of library implementation. Through a series of experiments it is shown that the library can compete with other available tree automata libraries, in certain areas being even significantly superior to them.
36

Weighted Unranked Tree Automata over Tree Valuation Monoids

Götze, Doreen 14 March 2017 (has links)
Quantitative aspects of systems, like the maximal consumption of resources, can be modeled by weighted automata. The usual approach is to weight transitions with elements of a semiring and to define the behavior of the weighted automaton by mul- tiplying the transition weights along a run. In this thesis, we define and investigate a new class of weighted automata over unranked trees which are defined over valuation monoids. By turning to valuation monoids we use a more general cost model: the weight of a run is now determined by a global valuation function. Besides the binary cost functions implementable via semirings, valuation functions enable us to cope with average and discounting. We first investigate the supports of weighted unranked tree automata over valuation monoids, i.e., the languages of all words which are evalu- ated to a non-zero value. We will furthermore consider the support of several other weighted automata models over different structures, like words and ranked trees. Next we prove a Nivat-like theorem for the new weighted unranked tree automata. More- over, we give a logical characterization for them. We show that weighted unranked tree automata are expressively equivalent to a weighted MSO logic for unranked trees. This solves an open problem posed by Droste and Vogler. Finally, we present a Kleene- type result for weighted ranked tree automata over valuation monoids.
37

Bimorphism Machine Translation

Quernheim, Daniel 10 April 2017 (has links)
The field of statistical machine translation has made tremendous progress due to the rise of statistical methods, making it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not even need any kind of linguistic annotation, and can infer translation rules from raw, unannotated data. However, most state-of-the art systems do linguistic structure little justice, and moreover many approaches that have been put forward use ad-hoc formalisms and algorithms. This inevitably leads to duplication of effort, and a separation between theoretical researchers and practitioners. In order to remedy the lack of motivation and rigor, the contributions of this dissertation are threefold: 1. After laying out the historical background and context, as well as the mathematical and linguistic foundations, a rigorous algebraic model of machine translation is put forward. We use regular tree grammars and bimorphisms as the backbone, introducing a modular architecture that allows different input and output formalisms. 2. The challenges of implementing this bimorphism-based model in a machine translation toolkit are then described, explaining in detail the algorithms used for the core components. 3. Finally, experiments where the toolkit is applied on real-world data and used for diagnostic purposes are described. We discuss how we use exact decoding to reason about search errors and model errors in a popular machine translation toolkit, and we compare output formalisms of different generative capacity.
38

Algebraic decoder specification: coupling formal-language theory and statistical machine translation: Algebraic decoder specification: coupling formal-language theory and statistical machine translation

Büchse, Matthias 18 December 2014 (has links)
The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework where a decoder is specified by an expression built from a fixed set of operations. As yet, this framework accommodates contemporary syntax-based decoders, it spans two levels of abstraction, and, primarily, it encourages mutual stimulation between the theory of weighted tree automata and the application.
39

Modèles d'automates d'arbres étendus pour la vérification de systèmes infinis

Jacquemard, Florent 10 November 2011 (has links) (PDF)
Ce document présente l'étude de plusieurs modèles de machines à états finis qui étendent tous le même formalisme: les automates d'arbres classiques, et leur application dans différentes tâches telles que l'analyse statique de programmes ou de systèmes, la typage, la vérification de la cohérence de spécifications, le model checking... Les arbres sont une structure naturelle de données, très répandue en informatique, par exemple pour la représentation des structures de données hiérarchiques ou imbriquées, pour des algorithmes spécifiques (arbres binaires de recherche, algorithmes distribués), comme modèle abstrait pour des données semi-structurées utilisées pour l'échange d'information dans le Web, pour une présentation algébrique de processus récursifs, comme les termes en logique... Lorsqu'il s'agit de raisonner sur des systèmes manipulant des arbres, ou modelisés par des arbres, il est crucial d'avoir une représentation finie d'ensembles infinis d'arbres. Les automates d'arbres sont des machines à états finis permettant une telle représentation. Ils ont fait la preuve de leur adéquation à des tâches de raisonnement: ils ont un modèle théorique bien établi, en étroite relation avec la logique, ils bénéficient de bonnes propriétés de composition et d'algorithmes de décision efficaces. En particulier, les automates d'arbres sont utilisées au coeur de systèmes de vérification formelle d'outils de déduction automatique. Toutefois, les automates d'arbres ont des limitations sévères en expressivité. Par exemple, ils sont incapables de faire du filtrage non-linéaire ou d'exprimer des contraintes d'intégrité tels que les clés dans les bases de données. Certaines extensions ont été proposées afin d'améliorer le modèle en essayant de conserver de bonnes propriétés. Nous présentons dans ce document de plusieurs de telles extensions, leurs propriétés et leur utilisation en vérification symbolique de systèmes et de programmes.
40

Simulations and Antichains for Efficient Handling of Finite Automata / Simulace a protiřetězce pro efektivní práci s konečnými automaty

Holík, Lukáš January 2011 (has links)
Cílem této práce je vývoj technik umožňujících praktické využití nedeterministických konečných automatů, zejména nedeterministických stromových automatů. Jde zvláště o techniky pro redukci velikosti a testování jazykové inkluze, jež hrají zásadní roli v mnoha oblastech aplikace konečných automatů. V oblasti redukce velikosti vycházíme z dobře známých metod pro slovní automaty které jsou založeny na relacích simulace.  Navrhli jsme efektivní algoritmy pro výpočet stromových variant simulačních relací a identifikovali jsme nový typ relace založený na kombinaci takzvaných horních a dolních simulací nad stromovými automaty. Tyto kombinované relace jsou zvláště vhodné pro redukci velikosti automatů slučováním stavů. Navržený princip kombinace relací simulace je relevantní i pro slovní automaty.  Náš přínos v oblasti testování jazykové inkluze je dvojí. Nejprve jsme zobecnili na stromové automaty takzvané protiřetězcové algoritmy, které byly původně navrženy pro slovními automaty. Dále se nám podařilo použitím simulačních relací výrazně zefektivnit protiřetězcové algoritmy pro testování jazykové inkluze jak pro slovní, tak pro stromové automaty. Relevanci našich technik pro praxi jsme demonstrovali jejich nasazením v rámci regulárního stromového model checkingu, což je verifikační metoda založená na stromových automatech. Použití našich algoritmů zde vedlo k výraznému zrychlení a zvětšení škálovatelnosti celé metody. Základní myšlenky našich algoritmů pro redukci velikosti automatů a testování jazykové inkluze jsou aplikovatelné i na jiné typy automatů. Příkladem jsou naše redukční techniky pro alternující Büchiho automaty prezentované v poslední části práce.

Page generated in 0.0553 seconds