11

Exploring Higher Order Dependency Parsers

Madhyastha, Pranava Swaroop January 2011
Most recent efficient algorithms for dependency parsing work by factoring the dependency trees, and in most of these approaches the parser loses much of the contextual information in the process of factorization. Higher-order dependency parsers have been built to recover some of this context: second-order [Carreras 2007] and third-order [Koo and Collins 2010]. This thesis develops the approach of Koo and Collins further in one or more ways. Possible directions include, but are not limited to: extending the approach to non-projective parsing; integrating labeled parsing; and incorporating word senses during the parsing phase [Eisner 2000].
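To make the factorization contrast concrete, here is a minimal sketch, not from the thesis, in which the arc and sibling score functions are assumed to come from some trained model:

```python
# Minimal sketch (not from the thesis) of first- vs. second-order
# factorization. A dependency tree is encoded as a map from each
# modifier index to its head index (0 = artificial root).
from typing import Callable, Dict, List

Tree = Dict[int, int]  # modifier -> head

def first_order_score(tree: Tree,
                      arc: Callable[[int, int], float]) -> float:
    """Arc-factored (first-order): the tree score is a sum of scores of
    individual head-modifier arcs, with no surrounding context."""
    return sum(arc(head, mod) for mod, head in tree.items())

def second_order_score(tree: Tree,
                       arc: Callable[[int, int], float],
                       sib: Callable[[int, int, int], float]) -> float:
    """Second-order (in the spirit of Carreras 2007): additionally score
    adjacent sibling pairs attached to the same head, on the same side,
    recovering some of the context lost by pure arc factorization."""
    total = first_order_score(tree, arc)
    by_head: Dict[int, List[int]] = {}
    for mod in sorted(tree):
        by_head.setdefault(tree[mod], []).append(mod)
    for head, mods in by_head.items():
        left = [m for m in mods if m < head]
        right = [m for m in mods if m > head]
        for side in (left, right):
            for prev, cur in zip(side, side[1:]):
                total += sib(head, cur, prev)
    return total
```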
12

Methods for Parallelizing Search Paths in Phrasing

Marcken, Carl de 01 January 1994
Many search problems are commonly solved with combinatoric algorithms that unnecessarily duplicate and serialize work at considerable computational expense. There are techniques available that can eliminate redundant computations and perform remaining operations concurrently, effectively reducing the branching factors of these algorithms. This thesis applies these techniques to the problem of parsing natural language. The result is an efficient programming language that can reduce some of the expense associated with principle-based parsing and other search problems. The language is used to implement various natural language parsers, and the improvements are compared to those that result from implementing more deterministic theories of language processing.
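The abstract's central point, that redundant search paths can be shared rather than recomputed, can be illustrated with a toy memoized chart parser (an illustration of the general technique only, not de Marcken's language; the grammar is an invented example):

```python
from functools import lru_cache

# Toy grammar in Chomsky normal form: a pair of child categories maps to
# the parent categories it can form.
RULES = {("NP", "VP"): ["S"], ("Det", "N"): ["NP"], ("V", "NP"): ["VP"]}
LEXICON = {"the": ["Det"], "dog": ["N"], "saw": ["V"], "cat": ["N"]}

def parse_categories(words):
    @lru_cache(maxsize=None)
    def cats(i, j):
        # Categories spanning words[i:j]. Each (i, j) subproblem is
        # computed once, no matter how many search paths would otherwise
        # re-derive it -- the redundancy elimination at issue.
        if j == i + 1:
            return frozenset(LEXICON.get(words[i], []))
        found = set()
        for k in range(i + 1, j):
            for left in cats(i, k):
                for right in cats(k, j):
                    found.update(RULES.get((left, right), []))
        return frozenset(found)
    return cats(0, len(words))

print(parse_categories(("the", "dog", "saw", "the", "cat")))  # frozenset({'S'})
```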
13

節境界に基づく独話文係り受け解析の効率化 / Efficient Dependency Parsing of Spoken Monologue Sentences Based on Clause Boundaries

大野, 誠寛 (Ohno, Tomohiro); 松原, 茂樹 (Matsubara, Shigeki); 丸山, 岳彦 (Maruyama, Takehiko); 柏岡, 秀紀 (Kashioka, Hideki); 田中, 英輝 (Tanaka, Hideki); 稲垣, 康善 (Inagaki, Yasuyoshi) 07 1900
No description available.
14

The role of statistics in human sentence processing

Corley, Martin Michael Bruce January 1995
No description available.
15

Sémantický parsing nezávislý na uspořádání vrcholů / Permutation-Invariant Semantic Parsing

Samuel, David January 2021
Deep learning has been successfully applied to semantic graph parsing in recent years. However, to the best of our knowledge, all graph-based parsers depend on a strong assumption about the ordering of graph nodes. This work explores a permutation-invariant approach to sentence-to-graph semantic parsing. We present a versatile, cross-framework, and language-independent architecture for universal modeling of semantic structures. To empirically validate our method, we participated in the CoNLL 2020 shared task, Cross-Framework Meaning Representation Parsing (MRP 2020), which evaluated the competing systems on five different frameworks (AMR, DRG, EDS, PTG, and UCCA) across four languages. Our parsing system, called PERIN, was one of the winners of this shared task. Thus, we believe that permutation invariance is a promising new direction in the field of semantic parsing.
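A minimal sketch of the permutation-invariance idea (not PERIN's actual implementation; the array shapes and the use of SciPy's Hungarian solver are assumptions): score predicted nodes against gold nodes under the best bipartite matching, so the loss is independent of the order in which gold nodes are listed.

```python
# Sketch of a permutation-invariant node loss: match predicted nodes to
# gold nodes with the Hungarian algorithm so the loss does not depend on
# the ordering of the gold graph's nodes.
import numpy as np
from scipy.optimize import linear_sum_assignment

def permutation_invariant_loss(pred_log_probs: np.ndarray,
                               gold_labels: np.ndarray) -> float:
    """pred_log_probs: (num_pred, num_labels) per-node label log-probs.
    gold_labels: (num_gold,) gold label ids, num_gold <= num_pred."""
    # cost[i, j] = negative log-probability that predicted slot i
    # carries gold node j's label.
    cost = -pred_log_probs[:, gold_labels]       # (num_pred, num_gold)
    rows, cols = linear_sum_assignment(cost)     # optimal bipartite match
    return float(cost[rows, cols].sum())
```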
16

A Computer Language Transformation System Capable of Generalized Context-Dependent Parsing

Thurston, Adrian 16 December 2008
Source transformation systems are special-purpose programming languages, or in some cases suites of languages, that are designed for the analysis and transformation of computer languages. They enable rapid prototyping of programming languages, source code renovation, language-to-language translation, design recovery, and other custom analysis techniques. With the emergence of these systems a serious problem has become evident: expressing a parser for common computer languages is sometimes very difficult. Source transformation systems employ generalized parsing algorithms, and while these are well suited to the kind of agile parsing techniques in use by transformation practitioners, they are not well suited to parsing languages that are context-dependent. Traditional deterministic parser generators do not stumble in this area, but they sacrifice the generalized parsing abilities that transformation systems depend on. When it is hard to get the input into the system as a correct and accurate parse tree, the utility of the unified transformation environment is degraded and more ad hoc approaches become attractive for processing input.

This thesis is about the design of a new computer language transformation system with a focus on enhancing the parsing system to support generalized context-dependent parsing. We argue for the use of backtracking LR as the generalized parsing algorithm. We present an enhancement to backtracking LR that allows us to control the parsing of an ambiguous grammar by ordering the productions of the grammar definitions. We add a grammar-dependent lexical solution and integrate it with our ordered-choice parsing strategy. We design a transformation language that is closer to general-purpose programming languages, yet enables common transformation techniques. We add semantic actions to our backtracking LR parsing engine and encourage the modification of global state in support of context-dependent parsing. We introduce semantic undo actions for reverting changes to global state during backtracking, thereby enabling generalized context-dependent parsing. Finally, we free the user from having to write undo actions by employing automatic reverse execution. The resulting system allows a wider variety of computer languages to be analyzed. By focusing on improving parsing abilities and moving to a transformation language that resembles general-purpose languages, we aim to extend the transformation paradigm to allow greater use by practitioners who face an immediate need to parse, analyze and transform computer languages. / Thesis (Ph.D., Computing) -- Queen's University, 2008
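The semantic-undo idea can be sketched in a few lines (a toy illustration only; per the abstract, the real system derives these reverse operations automatically via reverse execution):

```python
# Toy sketch of semantic undo actions for a backtracking parser.

class GlobalState:
    def __init__(self):
        self.symbol_table = {}      # e.g. names declared as types so far
        self.undo_log = []          # stack of inverse operations

    def define(self, name, kind):
        """Semantic action: mutate global state and log its inverse."""
        old = self.symbol_table.get(name)
        if old is None:
            self.undo_log.append(lambda: self.symbol_table.pop(name, None))
        else:
            self.undo_log.append(
                lambda: self.symbol_table.__setitem__(name, old))
        self.symbol_table[name] = kind

    def mark(self):
        """Checkpoint before the parser tries one ordered-choice branch."""
        return len(self.undo_log)

    def rewind(self, checkpoint):
        """On backtrack, replay inverses back to the checkpoint."""
        while len(self.undo_log) > checkpoint:
            self.undo_log.pop()()

# Usage while speculatively parsing one alternative of an ambiguity:
state = GlobalState()
cp = state.mark()
state.define("T", "typedef")        # semantic action inside the branch
# ... this branch fails to parse; the engine backtracks ...
state.rewind(cp)
assert "T" not in state.symbol_table   # global state fully restored
```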
17

Unknown word sequences in HPSG

Mielens, Jason David 06 October 2014
This work consists of an investigation into the properties of unknown words in HPSG, and in particular into the phenomenon of multi-word unknown expressions consisting of multiple unknown words in a sequence. The work presented consists first of a study determining the relative frequency of multi-word unknown expressions, and then a survey of the efficacy of a variety of techniques for handling these expressions. The techniques presented consist of modified versions of techniques from the existing unknown-word prediction literature as well as novel techniques, and they are evaluated with a specific concern for how they fare in the context of sentences with many unknown words and long unknown sequences.
18

Probabilistic grammar induction from sentences and structured meanings

Kwiatkowski, Thomas Mieczyslaw January 2012
The meanings of natural language sentences may be represented as compositional logical-forms. Each word or lexicalised multi-word element has an associated logical-form representing its meaning. Full sentential logical-forms are then composed from these word logical-forms via a syntactic parse of the sentence. This thesis develops two computational systems that learn both the word-meanings and the parsing model required to map sentences onto logical-forms from an example corpus of (sentence, logical-form) pairs. One of these systems is designed to provide a general-purpose method of inducing semantic parsers for multiple languages and logical meaning representations. Semantic parsers map sentences onto logical representations of their meanings and may form an important part of any computational task that needs to interpret the meanings of sentences. The other system is designed to model the way in which a child learns the semantics and syntax of their first language. Here, logical-forms are used to represent the potentially ambiguous context in which child-directed utterances are spoken, and a psycholinguistically plausible training algorithm learns a probabilistic grammar that describes the target language. This computational modelling task is important as it can provide evidence for or against competing theories of how children learn their first language.

Both of the systems presented here are based upon two working hypotheses. First, that the correct parse of any sentence in any language is contained in a set of possible parses defined in terms of the sentence itself, the sentence's logical-form, and a small set of combinatory rule schemata. The second working hypothesis is that, given a corpus of (sentence, logical-form) pairs that each support a large number of possible parses according to the schemata mentioned above, it is possible to learn a probabilistic parsing model that accurately describes the target language.

The algorithm for semantic parser induction learns Combinatory Categorial Grammar (CCG) lexicons and discriminative probabilistic parsing models from corpora of (sentence, logical-form) pairs. This system is shown to achieve at or near state-of-the-art performance across multiple languages, logical meaning representations, and domains. As the approach is not tied to any single natural or logical language, this system represents an important step towards widely applicable black-box methods for semantic parser induction. This thesis also develops an efficient representation of the CCG lexicon that separately stores language-specific syntactic regularities and domain-specific semantic knowledge. This factorised lexical representation improves the performance of CCG-based semantic parsers in sparse domains and also provides a potential basis for lexical expansion and domain adaptation for semantic parsers.

The algorithm for modelling child language acquisition learns a generative probabilistic model of CCG parses from sentences paired with a context set of potential logical-forms containing one correct entry and a number of distractors. The online learning algorithm used is intended to be psycholinguistically plausible and to assume as little information specific to the task of language learning as possible. It is shown that this algorithm learns an accurate parsing model despite making very few initial assumptions. It is also shown that the manner in which both word-meanings and syntactic rules are learnt is in accordance with observations of both of these learning tasks in children, supporting a theory of language acquisition that builds upon the two working hypotheses stated above.
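The compositional mapping the abstract describes, word logical-forms combining via a syntactic parse, can be sketched with the two CCG application schemata (standard CCG machinery, not the thesis's code; the lexicon entries below are invented examples):

```python
# Sketch of CCG functional application composing logical forms.
# A category is either a basic string ("NP", "S") or a
# (result, slash, argument) tuple; meanings are curried lambdas
# standing in for logical-form terms.

def forward_apply(fn, arg):
    """X/Y  Y  =>  X  (function meaning applied to argument meaning)."""
    (result, slash, argcat), meaning = fn
    assert slash == "/" and argcat == arg[0]
    return (result, meaning(arg[1]))

def backward_apply(arg, fn):
    """Y  X\\Y  =>  X."""
    (result, slash, argcat), meaning = fn
    assert slash == "\\" and argcat == arg[0]
    return (result, meaning(arg[1]))

# Lexicon: word -> (category, logical form). A transitive verb is
# (S\NP)/NP with a curried meaning.
likes = (
    (("S", "\\", "NP"), "/", "NP"),
    lambda obj: lambda subj: f"likes({subj},{obj})",
)
kim = ("NP", "kim")
sandy = ("NP", "sandy")

vp = forward_apply(likes, sandy)   # S\NP with meaning likes(_, sandy)
s = backward_apply(kim, vp)        # S    with meaning likes(kim, sandy)
print(s)                           # ('S', 'likes(kim,sandy)')
```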
19

Iterative parameter mixing for distributed large-margin training of structured predictors for natural language processing

Coppola, Gregory Francis January 2015
The development of distributed training strategies for statistical prediction functions is important for applications of machine learning generally, and the development of distributed structured prediction training strategies is important for natural language processing (NLP) in particular. With ever-growing data sets this is, first, because it is easier to increase computational capacity by adding more processor nodes than it is to increase the power of individual processor nodes, and, second, because data sets are often collected and stored in different locations. Iterative parameter mixing (IPM) is a distributed training strategy in which each node in a network of processors optimizes a regularized average loss objective on its own subset of the total available training data, making stochastic (per-example) updates to its own estimate of the optimal weight vector and communicating with the other nodes by periodically averaging estimates of the optimal vector across the network. This algorithm has been contrasted with a close relative, called here the single-mixture optimization algorithm, in which each node stochastically optimizes an average loss objective on its own subset of the training data, operating in isolation until convergence, at which point the average of the independently created estimates is returned. Recent empirical results have suggested that the IPM strategy produces better models than the single-mixture algorithm, and the results of this thesis add to this picture.

The contributions of this thesis are as follows. The first contribution is to produce and analyze an algorithm for decentralized stochastic optimization of regularized average loss objective functions. This algorithm, which we call the distributed regularized dual averaging algorithm, improves over prior work on distributed dual averaging by providing a simpler algorithm (used in the rest of the thesis), better convergence bounds for the case of regularized average loss functions, and certain technical results that are used in the sequel. The central contribution of this thesis is to give an optimization-theoretic justification for the IPM algorithm. While past work has focused primarily on its empirical test-time performance, we give a novel perspective on this algorithm by showing that, in the context of the distributed dual averaging algorithm, IPM constitutes a convergent optimization algorithm for arbitrary convex functions, while the single-mixture algorithm does not. Experiments indeed confirm that the superior test-time performance of models trained using IPM, compared to single-mixture, correlates with better optimization of the objective value on the training set, a fact not previously reported. Furthermore, our analysis of general non-smooth functions justifies the use of distributed large-margin (support vector machine [SVM]) training of structured predictors, which we show yields better test performance than the IPM perceptron algorithm, the only version of IPM to have previously been given a theoretical justification. Our results confirm that IPM training can reach the same level of test performance as a sequentially trained model and can reach better accuracies when one has a fixed budget of training time. Finally, we use the reduction in training time that distributed training allows to experiment with adding higher-order dependency features to a state-of-the-art phrase-structure parsing model.
We demonstrate that adding these features improves out-of-domain parsing results of even the strongest phrase-structure parsing models, yielding a new state-of-the-art for the popular train-test pairs considered. In addition, we show that a feature-bagging strategy, in which component models are trained separately and later combined, is sometimes necessary to avoid feature under-training and get the best performance out of large feature sets.
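The IPM loop itself is compact; here is a sketch under the assumption of a generic per-example update supplied by the caller (for example a perceptron step), rather than the thesis's dual-averaging machinery:

```python
# Sketch of the iterative parameter mixing (IPM) loop described above.
import numpy as np

def iterative_parameter_mixing(shards, update, dim, epochs=10):
    """shards: one list of training examples per node;
    update: (w, example) -> new w, a stochastic per-example step."""
    w = np.zeros(dim)
    for _ in range(epochs):
        estimates = []
        for shard in shards:              # in practice, run in parallel
            w_local = w.copy()
            for example in shard:
                w_local = update(w_local, example)
            estimates.append(w_local)
        # Mix: average the nodes' estimates and redistribute. The
        # single-mixture baseline would instead average only once,
        # after each node has trained to convergence in isolation.
        w = np.mean(estimates, axis=0)
    return w

# Example per-example update: a perceptron step for (x, y), y in {-1,+1}.
def perceptron_step(w, example):
    x, y = example
    if y * (w @ x) <= 0:
        w = w + y * x
    return w
```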
20

Syntactic and semantic interplay during Chinese text processing.

January 1996
by Tang Siu-Lam.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 48-54). Appendix in Chinese.
Contents: Acknowledgements (p.I); Abstract (p.II); Table of Contents (p.III); Appendix (p.IV); Introduction (p.1); Parsing Models (p.3); Possible Causes for the Discrepancies Observed in Past Studies (p.7); Language Specific Properties and Parsing (p.13); The Present Study (p.15); Experiment 1 (p.19); Method (p.22); Results and Discussion (p.25); Experiment 2 (p.28); Method (p.30); Results and Discussion (p.30); Experiment 3 (p.35); Method (p.38); Results and Discussion (p.38); General Discussion (p.45); References (p.43); Appendix (p.55); Instructions used in Experiments 1, 2, and 3 (p.55).
