Global ETD Search

11	Recovering Chinese Nonlocal Dependencies with a Generalized Categorial Grammar Duan, Manjuan 03 September 2019 (has links) No description available. Linguistics
12	Evaluating Globally Normalized Transition Based Neural Networks for Multilingual Natural Language Understanding Azzarone, Andrea January 2017 (has links) We analyze globally normalized transition-based neural network models for dependency parsing on English, German, Spanish, and Catalan. We compare the results with FreeLing, an open source language analysis tool developed at the UPC natural language processing research group. Furthermore, we study how the mini-batch size, the number of units in the hidden layers, and the beam width affect the performance of the network. Finally, we propose a multilingual parser with parameter sharing and experiment with German and English, obtaining a significant accuracy improvement over the monolingual parsers. These multilingual parsers can be used for low-resource languages or for applications with low memory requirements, where having one model per language is intractable. nlp machine learning dependency parsing Engineering and Technology Teknik och teknologier
13	Tree Transformations in Inductive Dependency Parsing Nilsson, Jens January 2007 (has links) This licentiate thesis deals with automatic syntactic analysis, or parsing, of natural languages. A parser constructs the syntactic analysis, which it learns by looking at correctly analyzed sentences, known as training data. The general topic concerns manipulations of the training data in order to improve the parsing accuracy. Several studies using constituency-based theories for natural languages in such automatic and data-driven syntactic parsing have shown that training data, annotated according to a linguistic theory, often needs to be adapted in various ways in order to achieve an adequate, automatic analysis. A linguistically sound constituent structure is not necessarily well-suited for learning and parsing using existing data-driven methods. Modifications to the constituency-based trees in the training data, and corresponding modifications to the parser output, have successfully been applied to increase the parser accuracy. The topic of this thesis is to investigate whether similar modifications in the form of tree transformations to training data, annotated with dependency-based structures, can improve accuracy for data-driven dependency parsers. In order to do this, two types of tree transformations are in focus in this thesis. The first one concerns non-projectivity. The full potential of dependency parsing can only be realized if non-projective constructions are allowed, which pose a problem for projective dependency parsers. On the other hand, non-projective parsers tend, among other things, to be slower. In order to maintain the benefits of projective parsing, a tree transformation technique to recover non-projectivity while using a projective parser is presented here. The second type of transformation concerns linguistic phenomena that are possible but hard for a parser to learn, given a certain choice of dependency analysis. This study has concentrated on two such phenomena, coordination and verb groups, for which tree transformations are applied in order to improve parsing accuracy, in case the original structure does not coincide with a structure that is easy to learn. Empirical evaluations are performed using treebank data from various languages, and using more than one dependency parser. The results show that the benefit of these tree transformations used in preprocessing and postprocessing to a large extent is language, treebank and parser independent. Inductive Dependency Parsing Dependency Structure Tree Transformation Non-projectivity Coordination Verb Group Language technology Språkteknologi
15	Automatic post-editing of phrase-based machine translation outputs Rosa, Rudolf January 2013 (has links) We present Depfix, a system for automatic post-editing of phrase-based English-to-Czech machine translation outputs, based on linguistic knowledge. First, we analyzed the types of errors that a typical machine translation system makes. Then, we created a set of rules and a statistical component that correct errors that are common or serious and have the potential to be corrected by our approach. We use a range of natural language processing tools to provide us with analyses of the input sentences. Moreover, we reimplemented the dependency parser and adapted it in several ways to the parsing of statistical machine translation outputs. We performed both automatic and manual evaluations, which confirmed that our system improves the quality of the translations.
16	[en] TRANSITION-BASED DEPENDENCY PARSING APPLIED ON UNIVERSAL DEPENDENCIES / [pt] ANÁLISE DE DEPENDÊNCIA BASEADA EM TRANSIÇÃO APLICADA A UNIVERSAL DEPENDENCIES CESAR DE SOUZA BOUCAS 11 February 2019 (has links) [en] Dependency parsing is the task of transforming a sentence into a syntactic structure, usually a dependency tree, that represents relations between words. These representations are useful for several tasks that arise with the increasing volume of online textual information and the need for technologies that depend on NLP. They can be used, for example, to enable computers to infer the meaning of words in multiple natural languages. This work presents dependency parsing with a focus on one of its most popular formulations in machine learning: the transition-based method. A greedy implementation of this model with a simple neural-network classifier is used to perform experiments. Universal Dependencies treebanks are used to train and then test the system using the validation script published for the CoNLL-2017 shared task. The results empirically indicate the benefit of initializing the input layer of the network with word embeddings obtained through pre-training. The system reached 84.51 LAS on the Brazilian Portuguese test set and 75.19 LAS on the English test set, about 4 points behind the best results for transition-based parsers. MACHINE LEARNING DEPENDENCY PARSING NLP
17	Robust Dependency Parsing of Spontaneous Japanese Spoken Language Ohno, Tomohiro, Matsubara, Shigeki, Kawaguchi, Nobuo, Inagaki, Yasuyoshi 03 1900 (has links) No description available. dependency parsing stochastic parsing Japanese speech linguistic phenomena syntactically annotated corpus
18	Morphosyntactic Corpora and Tools for Persian Seraji, Mojgan January 2015 (has links) This thesis presents open source resources in the form of annotated corpora and modules for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources consist of an improved part-of-speech tagged corpus and a dependency treebank, as well as tools for text normalization, sentence segmentation, tokenization, part-of-speech tagging, and dependency parsing for Persian. In developing these resources and tools, two key requirements are observed: compatibility and reuse. The compatibility requirement encompasses two parts. First, the tools in the pipeline should be compatible with each other in such a way that the output of one tool is compatible with the input requirements of the next. Second, the tools should be compatible with the annotated corpora and deliver the same analysis that is found in these. The reuse requirement means that all the components in the pipeline are developed by reusing resources, standard methods, and open source state-of-the-art tools. This is necessary to make the project feasible. Given these requirements, the thesis investigates two main research questions. The first is how can we develop morphologically and syntactically annotated corpora and tools while satisfying the requirements of compatibility and reuse? The approach taken is to accept the tokenization variations in the corpora to achieve robustness. The tokenization variations in Persian texts are related to the orthographic variations of writing fixed expressions, as well as various types of affixes and clitics. Since these variations are inherent properties of Persian texts, it is important that the tools in the pipeline can handle them. Therefore, they should not be trained on idealized data. 
The second question concerns how accurately we can perform morphological and syntactic analysis for Persian by adapting and applying existing tools to the annotated corpora. The experimental evaluation of the tools shows that the sentence segmenter and tokenizer achieve an F-score close to 100%, the tagger has an accuracy of nearly 97.5%, and the parser achieves a best labeled accuracy of over 82% (with unlabeled accuracy close to 87%). Persian language technology corpus treebank preprocessing segmentation part-of-speech tagging dependency parsing
19	Syntaktická analýza textů se střídáním kódů Ravishankar, Vinit January 2018 (has links) (English) Vinit Ravishankar July 2018 The aim of this thesis is twofold; first, we attempt to dependency parse existing code-switched corpora, solely by training on monolingual dependency treebanks. In an attempt to do so, we design a dependency parser and experiment with a variety of methods to improve upon the baseline established by raw training on monolingual treebanks: these methods range from treebank modification to network modification. On this task, we obtain state-of-the-art results for most evaluation criteria on the task for our evaluation language pairs: Hindi/English and Komi/Russian. We beat our own baselines by a significant margin, whilst simultaneously beating most scores on similar tasks in the literature. The second part of the thesis involves introducing the relatively understudied task of predicting code-switching points in a monolingual utterance; we provide several architectures that attempt to do so, and provide one of them as our baseline, in the hopes that it should continue as a state-of-the-art in future tasks.
20	A text-mining based approach to capturing the NHS patient experience Bahja, Mohammed January 2017 (has links) An important issue for healthcare service providers is to achieve high levels of patient satisfaction. Collecting patient feedback about their experience in hospital enables providers to analyse their performance in terms of the levels of satisfaction and to identify the strengths and limitations of their service delivery. A common method of collecting patient feedback is via online portals and the forums of the service provider, where the patients can rate and comment about the service received. A challenge in analysing patient experience collected via online portals is that the amount of data can be huge and hence, prohibitive to analyse manually. In this thesis, an automated approach to patient experience analysis via Sentiment Analysis, Topic Modelling, and Dependency Parsing methods is presented. The patient experience data collected from the National Health Service (NHS) online portal in the United Kingdom is analysed in the study to understand this experience. The study was carried out in three iterations: (1) In the first, the Sentiment Analysis method was applied, which identified whether a given patient feedback item was positive or negative. (2) The second iteration involved applying Topic Modelling methods to identify automatically themes and topics from the patient feedback. Further, the outcomes of the Sentiment Analysis study from the first iteration were utilised to identify the patient sentiment regarding the topic being discussed in a given comment. (3) In the third iteration of the study, Dependency Parsing methods were employed for each patient feedback item and the topics identified. A method was devised to summarise the reason for a particular sentiment about each of the identified topics. 
The outcomes of the study demonstrate that text-mining methods can be effectively utilised to identify patients’ sentiment in their feedback as well as to identify the themes and topics discussed in it. The approach presented in the study was proven capable of effectively automatically analysing the NHS patient feedback database. Specifically, it can provide an overview of the positive and negative sentiment rate, identify the frequently discussed topics and summarise individual patient feedback items. Moreover, an API visualisation tool is introduced to make the outcomes more accessible to the health care providers. 362.1
