Global ETD Search

11	Apprentissage non supervisé de dépendances à partir de textes / Unsupervised dependency parsing from texts Arcadias, Marie 02 October 2015 Dependency grammars make it possible to build a hierarchical syntactic organization of the words of a sentence. Building dependency trees by hand takes considerable time and requires expert knowledge, so much work has sought to automate it. Aiming for a lightweight and easily adaptable process, we focus on unsupervised dependency learning, which avoids the need for costly expertise. The state of the art in unsupervised dependency parsing, DMV, relies on very complex methods that are highly sensitive to their initial parameters, and training a DMV model is heavy and slow. In this thesis we present a new model that solves the dependency parsing problem in a simpler, faster and more adaptable way. From part-of-speech tags we learn a family of probabilistic context-free grammars (PCFG) restricted to fewer than 6 nonterminal symbols and fewer than 15 rules combining them. The PCFGs of this family, which we call DGdg (for DROITE GAUCHE droite gauche, i.e. RIGHT LEFT right left), need very little tuning and therefore adapt easily to the 12 languages we tested. Learning and parsing run at least twice as fast as DMV on the same data, and for some languages the quality of the DGdg parses is close to that of DMV. We also propose a first application of our dependency parsing method to information extraction: using conditional random fields (CRF), we learn a labeling into the functions "subject", "object" and "predicate" based on features extracted from the constructed trees.
Keywords: Unsupervised machine learning; Dependency grammar; Context-free grammar; CKY; Inside-Outside; CRF; Relation extraction; 006.35
12	Understanding Context-free Grammars through Data Visualization Hultin, Felix January 2016 Since the late 1950s, context-free grammars have played an important role in linguistics, have been part of introductory courses, and have spread to other fields of study. Meanwhile, data visualization in modern web development has made feature-rich visualization possible in the browser. This thesis unites these two developments in a browser-based app for writing context-free grammars, parsing sentences, and visualizing the output. A user experience study with usability tests and user interviews is conducted to investigate the possible benefits and disadvantages of this visualization when writing context-free grammars. The results show that participants made only limited use of the visualization: it helped them see whether a sentence was parsed and, if it was not, at which position parsing went wrong. Future improvements to the software and studies of them are proposed, as well as an expansion of data visualization within linguistics.
Keywords: Context-free grammar; data visualization; usability testing; user interview; D3.js; Backbone.js; JavaScript; General Language Studies and Linguistics
13	Classification et caractérisation de familles enzymatiques à l'aide de méthodes formelles / Classification and characterization of enzymatic families with formal methods Garet, Gaëlle 16 December 2014 This thesis proposes a new approach for discovering signatures of enzyme families (and superfamilies). First, given an aligned sample of sequences belonging to the same family, the approach infers context-free grammars that characterize that family. To this end, new generalization principles and new classes of languages are introduced, based on local substitutability. An algorithm was also developed that produces a reduced grammar of a substitutable language while preserving the structure of the examples. Second, the manuscript presents a method for classifying the sequences of a superfamily into families using formal concept analysis based on the sequence alignment; it allows new families to be detected and functional motifs to be discovered in order to improve the earlier signatures.
Keywords: Bioinformatics; Enzyme; Family; Grammatical inference; Context-free grammar; Substitutability; Formal concept analysis
14	Použití strukturální metody pro rozpoznávání objektů / Using structural method for objects recognition Valsa, Vít January 2015 This diploma thesis deals with the possibilities of using structural methods to recognize objects in an image. The first part describes methods for preparing the image before processing. The core of the thesis is chapter 3, which analyzes in detail the construction of deformation grammars for parsing and their use. The next part is devoted to a syntactic parser for the deformation grammar. The conclusion focuses on testing the proposed methods and on their results.
15	Paralelní syntaktická analýza / Parallel Syntax Analysis Otáhal, Jiří January 2012 This thesis focuses on modern methods of language description. It introduces several controlled grammars and describes tree controlled grammars in detail. The thesis builds on a relatively new technique of syntax analysis that uses tree controlled grammars. The analysis process is described in detail, followed by a design for processing it in parallel. We successfully implemented this design and sped up the syntax analysis, thereby achieving the main goal of the thesis.
16	Syntaktická analýza založená na multigenerování / Parsing Based on Multigeneration Kleiner, Milo January 2010 A multigenerative system is based on the cooperation of a finite number of context-free grammars. These grammars derive their individual sentential forms in parallel and synchronously. At each derivation step, the generated sentential forms are checked for correctness; this check can be carried out in different ways. The result is a so-called multistring (a vector of strings), by means of which the generated language is defined.
17	Hluboký syntaxí řízený překlad / Deep Syntax-Directed Translation Senko, Jozef January 2015 This thesis continues my bachelor's thesis, which was devoted to syntax analysis based on deep pushdown automata. The theoretical part defines everything fundamental to this work, such as deep syntax-directed translation, pushdown automata, deep pushdown automata, finite transducers and deep pushdown transducers. The second part is devoted to an educational program for students of IFJ. It defines the structure of the program and its parts. All parts of the program are analyzed from both a theoretical and a practical point of view.
18	Syntaktický analyzátor pro český jazyk / Syntactic Analyzer for Czech Language Beneš, Vojtěch January 2014 This master's thesis describes the theoretical basis, solution design, and implementation of a constituency (phrasal) parser for the Czech language, which is based on grouping parts of speech into phrases. The program works with a manually built and annotated sample Czech corpus and generates a probabilistic context-free grammar from it by machine learning at runtime. The parser implementation, based on an extended CKY algorithm, then decides for an input Czech sentence whether it can be generated by the created grammar and, if so, constructs the most probable derivation tree. This result is compared with the expected parse to evaluate the parser's success rate.
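The parsing step described in this abstract can be illustrated with a minimal sketch of plain probabilistic CKY (not the thesis's extended variant, whose details the abstract does not give). The toy grammar, its tag names and its probabilities are invented for illustration, and the grammar is assumed to be in Chomsky normal form over part-of-speech tags.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form over part-of-speech tags.
# All symbols and probabilities are invented for illustration only.
BINARY = {            # (B, C) -> list of (A, P(A -> B C))
    ("NP", "VP"): [("S", 1.0)],
    ("ADJ", "N"): [("NP", 0.4)],
    ("V", "NP"): [("VP", 1.0)],
}
LEXICAL = {           # tag -> list of (A, P(A -> tag))
    "adj": [("ADJ", 1.0)],
    "noun": [("N", 1.0), ("NP", 0.6)],
    "verb": [("V", 1.0)],
}


def cky_best_parse(tags, start="S"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(tags)
    best = defaultdict(dict)  # best[(i, j)][A] = (probability, backpointer)
    # Spans of width 1: apply the lexical rules.
    for i, tag in enumerate(tags):
        for lhs, p in LEXICAL.get(tag, []):
            best[(i, i + 1)][lhs] = (p, tag)
    # Wider spans: combine two adjacent sub-spans with a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for b, (pb, _) in best[(i, k)].items():
                    for c, (pc, _) in best[(k, j)].items():
                        for lhs, p in BINARY.get((b, c), []):
                            prob = p * pb * pc
                            if prob > best[(i, j)].get(lhs, (0.0, None))[0]:
                                best[(i, j)][lhs] = (prob, (k, b, c))
    if start not in best[(0, n)]:
        return None  # the grammar does not generate this tag sequence

    def build(i, j, sym):
        # Follow backpointers to rebuild the derivation tree as nested tuples.
        _, back = best[(i, j)][sym]
        if isinstance(back, str):
            return (sym, back)
        k, b, c = back
        return (sym, build(i, k, b), build(k, j, c))

    return best[(0, n)][start][0], build(0, n, start)
```

Calling `cky_best_parse(["noun", "verb", "adj", "noun"])` on this toy grammar returns a probability of about 0.24 together with a nested-tuple tree rooted in `S`, while a sequence the grammar cannot generate, such as `["verb", "verb"]`, yields `None`.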
19	Syntaktická analýza založená na multigenerování / Parsing Based on Multigeneration Kyjovská, Linda January 2008 This work deals with syntax analysis based on multi-generation. The basic idea is to create a computer program that transforms one input string into n-1 output strings. The input of the program is a plain text file, created by the user, that contains the rules of n grammars. Exactly one grammar in the input file is marked as the input grammar; the other n-1 grammars are output grammars. The program builds the list of input grammar rules used for an input string and applies the corresponding output grammar rules to create the n-1 output strings. The program is written in C++ and Bison.
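The idea in this abstract can be sketched as follows. This is a small Python illustration, not the thesis's C++ and Bison program: the three aligned grammars are hypothetical, the parser is a simple backtracking top-down parser that assumes the input grammar is not left-recursive, and rule i of every grammar is assumed to rewrite the same nonterminal.

```python
# Hypothetical aligned grammars: rule i of every grammar has the same
# left-hand side, so a rule index chosen while parsing with the input
# grammar can be replayed in each output grammar.
INPUT_GRAMMAR = [("S", ["a", "S", "b"]), ("S", ["c"])]
OUTPUT_GRAMMARS = [
    [("S", ["b", "S", "a"]), ("S", ["c"])],
    [("S", ["S", "x"]), ("S", ["y"])],
]
NONTERMINALS = {"S"}


def parse(symbols, text, pos=0):
    """Return the rule indices of a leftmost derivation of text, or None.

    `symbols` is the part of the sentential form still to be matched.
    """
    if not symbols:
        return [] if pos == len(text) else None
    head, rest = symbols[0], symbols[1:]
    if head not in NONTERMINALS:
        # Terminal symbol: it must match the text at the current position.
        if text.startswith(head, pos):
            return parse(rest, text, pos + len(head))
        return None
    # Nonterminal: try each input rule for it, backtracking on failure.
    for index, (lhs, rhs) in enumerate(INPUT_GRAMMAR):
        if lhs == head:
            result = parse(rhs + rest, text, pos)
            if result is not None:
                return [index] + result
    return None


def generate(grammar, rule_indices, start="S"):
    """Replay the rule indices as a leftmost derivation in `grammar`."""
    form = [start]
    for index in rule_indices:
        lhs, rhs = grammar[index]
        at = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[at] != lhs:
            raise ValueError("grammars are not aligned")
        form[at:at + 1] = rhs
    return "".join(form)


def multi_translate(text):
    """Translate one input string into one string per output grammar."""
    used = parse(["S"], text)
    if used is None:
        return None
    return [generate(grammar, used) for grammar in OUTPUT_GRAMMARS]
```

With n = 3 grammars, `multi_translate("aacbb")` first records the used input rules `[0, 0, 1]` and then replays them in the two output grammars, giving the n-1 = 2 output strings `"bbcaa"` and `"yxx"`.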
20	Syntaktická analýza založená na párových automatech / Syntactic Analysis Based on Coupled Finite Automata Zámečníková, Eva Unknown Date This master's thesis deals with translation based on the coupled finite automaton model. A coupled finite automaton consists of an input and an output automaton. The input automaton performs syntactic analysis of an input string. The rules used by the input automaton control the output automaton, which generates an output string. The thesis describes a way to determinize the input automaton without losing information about the rules used in the original automaton. The determinization is divided into two parts, for finite and for infinite translations specified by transducers. A new coupled automaton with increased computing power is then presented; the increase comes from replacing the input automaton, the output automaton, or just a part of an automaton with a context-free grammar.
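One possible reading of the coupled model described in this abstract is sketched below. The automata, state names and rule labels are hypothetical, and the input automaton is assumed to be deterministic already; the thesis's actual formal definition may differ.

```python
# Input automaton: (state, input symbol) -> (rule label, next state).
# All states, symbols and labels are invented for illustration.
INPUT_RULES = {
    ("q0", "a"): ("r1", "q0"),
    ("q0", "b"): ("r2", "q1"),
    ("q1", "b"): ("r3", "q1"),
}
INPUT_START, INPUT_FINAL = "q0", {"q1"}

# Output automaton: rule label -> (required state, emitted string, next state).
OUTPUT_RULES = {
    "r1": ("p0", "x", "p0"),
    "r2": ("p0", "", "p1"),
    "r3": ("p1", "yy", "p1"),
}
OUTPUT_START, OUTPUT_FINAL = "p0", {"p1"}


def coupled_translate(text):
    """Return the output string for text, or None if text is rejected."""
    # Step 1: the input automaton analyzes the string and records its rules.
    state, used = INPUT_START, []
    for symbol in text:
        step = INPUT_RULES.get((state, symbol))
        if step is None:
            return None
        label, state = step
        used.append(label)
    if state not in INPUT_FINAL:
        return None
    # Step 2: the recorded rule labels drive the output automaton.
    out_state, output = OUTPUT_START, []
    for label in used:
        source, emitted, target = OUTPUT_RULES[label]
        if source != out_state:
            return None
        output.append(emitted)
        out_state = target
    return "".join(output) if out_state in OUTPUT_FINAL else None
```

Here `coupled_translate("aabbb")` uses the rules r1, r1, r2, r3, r3 and produces `"xxyyyy"`, whereas `"aa"` is rejected because the input automaton does not end in a final state.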
