1. Rule-based Machine Translation in Limited Domain for PDAs. Chiang, Shin-Chian, 10 September 2009 (has links)
In this thesis, we implement a rule-based machine translation (MT) system for Personal Digital Assistants (PDAs). A rule-based MT system generally has three modules: analysis, transfer and generation. The grammars used in our system are a lexicalized tree automata-based grammar (LTA) and a synchronous lexicalized tree adjoining grammar (SLTAG). LTA is used for analysis, and SLTAG is used for transfer and generation. We adapt a previously developed parser to PDAs to serve as the parser in the analysis module. The SLTAG parser in the transfer module searches the source parse tree for matches of the source side of SLTAG rules. The target parse tree is then grown, and each hypothesis is scored using a language model and rule probabilities. To avoid excessive computation, the generation step prunes hypotheses whose scores fall below a threshold. Compared with other rule-based MT systems, we can build rules automatically and design a flexible rule type; the SLTAG parser is coded specifically for this rule type. In our experiments, the Chinese-English BTEC corpus serves as training and test data, and we obtain a BLEU score of 17% on the test set.
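The scoring and pruning described above can be pictured with a small sketch. The snippet below is purely illustrative, not code from the thesis; the unigram language model, the field names and the threshold value are invented for the example. Each hypothesis combines a log language-model score with the log probability of the rule that produced it, and hypotheses scoring below the threshold are pruned.

import math

def score_hypothesis(target_words, rule_prob, lm):
    """Combine a log unigram language-model score with a log rule probability."""
    lm_score = sum(math.log(lm.get(w, 1e-6)) for w in target_words)
    return lm_score + math.log(rule_prob)

def prune(hypotheses, lm, threshold):
    """Keep only hypotheses whose combined score reaches the threshold."""
    return [h for h in hypotheses
            if score_hypothesis(h["words"], h["rule_prob"], lm) >= threshold]

# Hypothetical toy data: a tiny unigram "language model" and two hypotheses.
lm = {"the": 0.07, "meeting": 0.002, "is": 0.03, "tomorrow": 0.001}
hypotheses = [
    {"words": ["the", "meeting", "is", "tomorrow"], "rule_prob": 0.4},
    {"words": ["meeting", "the", "tomorrow", "is"], "rule_prob": 0.01},
]
print(prune(hypotheses, lm, threshold=-22.0))  # keeps only the first hypothesis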

2. Incremental Parsing with Adjoining Operation. MATSUBARA, Shigeki; KATO, Yoshihide, 01 December 2009 (has links)
No description available.

3. Evolutionary Developmental Evaluation: the Interplay between Evolution and Development. Hoang, Tuan-Hoa, Information Technology & Electrical Engineering, Australian Defence Force Academy, UNSW, January 2009 (has links)
This thesis was inspired by the difficulties of artificial evolutionary systems in finding elegant, well-structured, regular solutions. That is, the solutions found are usually highly disorganized, poorly structured and exhibit limited re-use, resulting in bloat and other problems. This is also true of previous developmental evolutionary systems, where structural regularity emerges only by chance. We hypothesise that these problems might be ameliorated by incorporating repeated evaluations on increasingly difficult problems in the course of a developmental process. This thesis introduces a new technique for learning complex problems from a family of structured, increasingly difficult problems: Evolutionary Developmental Evaluation (EDE). This approach appears to give more structured, scalable and regular solutions to such families of problems than previous methods. In addition, the thesis proposes some bio-inspired components that developmental evolutionary systems require to take full advantage of this approach. The key point is that the developmental process, in combination with a varying fitness function evaluated at multiple stages of development, generates selective pressure toward generalisation. This also means that parsimony in structure is selected for without any direct parsimony pressure, so the system encourages the emergence of modularity and structural regularity in solutions. In this thesis, a new genetic developmental system called Developmental Tree Adjoining Grammar Guided Genetic Programming (DTAG3P) is implemented, embodying the requirements above. It is tested on a range of benchmark problems. The results indicate that the method generates more regularly-structured solutions than competing methods and that the system is able to scale, at least on the problem classes tested, to very complex problem instances.
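A minimal sketch of the evaluation scheme described above, under the assumption that an individual is developed in stages and each stage is scored on a progressively harder member of a problem family; the developmental model and the toy problems here are invented for illustration, not taken from DTAG3P.

def staged_fitness(individual, develop, problems):
    """Sum the fitness of each developmental stage on its matching problem.

    develop(individual, stage) is assumed to return the phenotype expressed at
    that stage; problems[stage] scores a phenotype, and later problems are
    intended to be harder than earlier ones.
    """
    return sum(evaluate(develop(individual, stage))
               for stage, evaluate in enumerate(problems))

# Hypothetical toy setup: each stage expresses one more gene, and each stage's
# problem asks the expressed genes to sum to a larger target value.
def develop(individual, stage):
    return individual[: stage + 1]

problems = [lambda p, target=target: -abs(sum(p) - target)
            for target in (1.0, 3.0, 6.0)]

print(staged_fitness([1.0, 2.0, 3.0], develop, problems))  # 0.0: perfect at every stage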

4. Generative and Computational Power of Combinatory Categorial Grammar. Schiffer, Lena Katharina, 13 August 2024 (has links)
Combinatory categorial grammar (CCG) is a mildly context-sensitive formalism that is well-established in computational linguistics. At the basis of the grammar are a lexicon and a rule system: the lexicon assigns syntactic categories to the symbols of a given input string, and the rule system specifies how adjacent categories can be combined, yielding a derivation tree whose nodes are labeled by categories. In this thesis, we focus on composition rules, which are present in all variants of the grammar.
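As a rough illustration of how adjacent categories combine (an invented toy encoding, not the notation or code of the thesis): forward application combines X/Y with Y to give X, and first-degree forward composition combines X/Y with Y/Z to give X/Z.

def forward_application(left, right):
    # X/Y applied to Y yields X.
    if isinstance(left, tuple) and left[0] == '/' and left[2] == right:
        return left[1]
    return None

def forward_composition(left, right):
    # X/Y composed with Y/Z yields X/Z (composition of first degree).
    if (isinstance(left, tuple) and left[0] == '/'
            and isinstance(right, tuple) and right[0] == '/'
            and left[2] == right[1]):
        return ('/', left[1], right[2])
    return None

# Hypothetical lexicon entries: "likes" := (S\NP)/NP, "the" := NP/N.
likes = ('/', ('\\', 'S', 'NP'), 'NP')
the = ('/', 'NP', 'N')

print(forward_composition(likes, the))  # ('/', ('\\', 'S', 'NP'), 'N'), i.e. (S\NP)/N
print(forward_application(the, 'N'))    # 'NP'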
Vijay-Shanker and Weir famously show that CCG can generate the same class of string languages as tree-adjoining grammar, linear indexed grammar, and head grammar. Their equivalence proof relies on two particular features of the grammar: ε-entries, which are lexicon entries for the empty word, and rule restrictions, which allow the rule set to be restricted on a per-grammar basis. However, modern variants of CCG tend to avoid these features. This raises the question of how this changes the generative and computational power of CCG. Another important feature is the rule degree, which determines how complex a certain category involved in a rule application may be. The goal of this thesis is to shed light on the effects of changing these features.
When modeling natural language, one is not only interested in the acceptability of a sentence, but also in its underlying structure. Therefore, we study the sets of constituency trees that CCG can generate, which are obtained by relabeling sets of derivation trees. We first provide a new proof of an analogous result by Buszkowski, showing that when only application rules are allowed, a proper subset of regular tree languages can be generated by CCG. Then, we show that when composition of first degree is included, CCG can generate exactly the regular tree languages. On the other hand, pure CCG, which allows all rules up to some degree, is shown to not even generate all local tree languages. Our main result on the generative capacity of CCG is its strong equivalence to tree-adjoining grammar. This means that these formalisms can generate the same class of tree languages. This is even the case when only composition rules of second degree and no ε-entries are used, showing that a CCG with these properties already has its full expressive power. Our constructions also provide an effective procedure for the removal of ε-entries.
Regarding computational complexity, ε-entries and high rule degrees are in fact problematic. Kuhlmann, Satta, and Jonsson study the universal recognition problem for CCG, which asks whether a given string is generated by a given grammar, considering both as part of the input. They prove that this problem is EXPTIME-complete if ε-entries are allowed, and NP-complete otherwise. We refine this result and show that the runtime is exponential only in the maximum rule degree of the grammar. Hence, when the rule degree is bounded by a constant, parsing becomes polynomial in the grammar size. This also holds when substitution rules are included in the rule system.

5. Generating and simplifying sentences / Génération et simplification des phrases. Narayan, Shashi, 07 November 2014 (has links)
Depending on the input representation, this dissertation investigates issues from two classes: meaning representation (MR)-to-text generation and text-to-text generation. In the first class (MR-to-text generation, "Generating Sentences"), we investigate how to make symbolic grammar-based surface realisation robust and efficient. We propose an efficient approach to surface realisation using an FB-LTAG and taking shallow dependency trees as input. Our algorithm combines techniques and ideas from the head-driven and lexicalist approaches. In addition, the input structure is used to filter the initial search space using a concept called local polarity filtering, and to parallelise processes. To further improve robustness, we propose two error mining algorithms: first, an algorithm for mining dependency trees rather than sequential data, and second, an algorithm that structures the output of error mining into a tree so that errors are represented in a more meaningful way. We show that our realisers, together with these error mining algorithms, improve both efficiency and coverage by a wide margin. In the second class (text-to-text generation, "Simplifying Sentences"), we argue for using deep semantic representations (as opposed to syntax- or SMT-based approaches) to improve the sentence simplification task. We use Discourse Representation Structures for the deep semantic representation of the input. We propose two methods: a supervised approach (with state-of-the-art results) to hybrid simplification using deep semantics and SMT, and an unsupervised approach (with results competitive with state-of-the-art systems) to simplification using the comparable Wikipedia corpus.
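The basic error mining idea can be illustrated with a deliberately simplified sketch over flat bags of forms; the thesis works on dependency trees and structures the mining output as a tree, and the scoring below and the toy data are assumptions for illustration only. Forms that occur mostly in inputs the realiser fails on receive a high suspicion score and are inspected first.

from collections import Counter

def suspicion_scores(failed_inputs, passed_inputs):
    """Score each form by its relative frequency in failed versus all inputs."""
    fail = Counter(form for inp in failed_inputs for form in inp)
    ok = Counter(form for inp in passed_inputs for form in inp)
    return {form: fail[form] / (fail[form] + ok[form]) for form in fail}

# Hypothetical toy data: each input is the set of forms (here, words) it contains.
failed = [{"the", "cat", "purrs"}, {"the", "dog", "purrs"}]
passed = [{"the", "cat", "sleeps"}, {"a", "dog", "sleeps"}]

scores = suspicion_scores(failed, passed)
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # "purrs" is most suspicious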