1

Robust practical parsing of English with an automatically generated grammar

Fang, Chengyu January 2006 (has links)
No description available.
2

Approaches to automatic biographical sentence classification : an empirical study

Conway, Michael Ambrose January 2007 (has links)
No description available.
3

Detection and characterization of figurative language use in WordNet

Peters, Wim January 2004 (has links)
No description available.
4

LEEP : learning event extraction patterns

Uribe, Diego January 2004 (has links)
No description available.
5

Game semantics for region analysis

Greenland, William Edward January 2005 (has links)
No description available.
6

Comparative evaluation of modular automatic summarisation systems using CAST

Orăsan, Constantin January 2006 (has links)
No description available.
7

Investigating domain information as dynamic support for the learner during spoken conversations

Rudman, Paul Douglas January 2004 (has links)
No description available.
8

Post-grammatical processing for discourse segmentation

Mapleson, Daniel Lee January 2006 (has links)
No description available.
9

Using automatic speech recognition to evaluate Arabic to English transliteration

Khalil, G. January 2013 (has links)
Increased travel and international communication have led to an increased need for transliteration of Arabic proper names for people, places, technical terms and organisations. There are a variety of available Arabic-to-English transliteration systems, such as Unicode, the Buckwalter Arabic transliteration, and ArabTeX. Transliteration tables have been developed and used by researchers for many years, but there have been only limited attempts to evaluate and compare different transliteration systems. This thesis investigates whether speech recognition technology can be used to evaluate different Arabic-English transliteration systems. There were five main objectives: firstly, to investigate the possibility of using English speech recognition engines to recognise Arabic words; secondly, to establish the possibility of automatic transliteration of diacritised Arabic words for the purpose of creating a vocabulary for the speech recognition engine; thirdly, to explore the possibility of automatically generating transliterations of non-diacritised Arabic words; fourthly, to construct a general method to compare and evaluate different transliterations; and finally, to test the system and use it to experiment with new transliteration ideas.

A novel testing method was devised to evaluate transliteration rules, and an automatic application system was developed. This method was used to compare five existing transliteration tables: the UN, Qalam, Buckwalter, ArabTeX and Alghamdi tables. From the results of these comparisons, new rules were developed to improve transliteration performance; these rules achieved a transliteration performance score of 37.9%, higher than the 19.1% achieved using Alghamdi's table, the best performing of the existing tables tested. Most of the improvement was obtained by changing letter(s)-for-letter(s) transliterations; further improvements were made by more sophisticated rules based on combinations of letters and diacritics. Speech recognition performance is not a direct test of transliteration acceptability, but it correlates well with human judgement and offers consistency and repeatability. The issues surrounding the use of English ASR for this application are discussed, as are proposals to further improve transliteration systems.
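To make the rule-application idea concrete, here is a minimal sketch of longest-match rule-based transliteration in the spirit the abstract describes: simple letter-for-letter mappings, refined by longer rules over letter-plus-diacritic combinations. The `RULES` mapping and the `transliterate` helper are illustrative assumptions, not any of the published tables or the thesis's actual system.

```python
# Illustrative rules only; not the UN, Qalam, Buckwalter, ArabTeX or
# Alghamdi tables. Rules are tried longest-first, so a letter+diacritic
# combination takes precedence over the single-letter fallback.
RULES = {
    "\u0634": "sh",          # sheen -> "sh" (letter-for-letter rule)
    "\u0628\u064E": "ba",    # baa + fatha -> "ba" (combination rule)
    "\u0628": "b",           # baa alone -> "b"
    "\u0627": "a",           # alif -> "a"
}

def transliterate(word: str) -> str:
    """Apply the longest matching rule at each position of the word."""
    out, i = [], 0
    keys = sorted(RULES, key=len, reverse=True)  # longest match first
    while i < len(word):
        for k in keys:
            if word.startswith(k, i):
                out.append(RULES[k])
                i += len(k)
                break
        else:
            out.append(word[i])  # pass through anything unmapped
            i += 1
    return "".join(out)

# baa+fatha, alif, sheen -> the combination rule fires before the fallback
print(transliterate("\u0628\u064E\u0627\u0634"))  # -> "baash"
```

In the evaluation method the abstract describes, the output of such a rule set would populate the vocabulary of an English ASR engine, and the recognition rate on spoken Arabic names would serve as the comparison score between tables.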
10

Data and models for statistical parsing with combinatory categorial grammar

Hockenmaier, Julia January 2003 (has links)
This dissertation is concerned with the creation of training data and the development of probability models for statistical parsing of English with Combinatory Categorial Grammar (CCG). Parsing, or syntactic analysis, is a prerequisite for semantic interpretation, and therefore forms an integral part of any system which requires natural language understanding. Since almost all naturally occurring sentences are ambiguous, it is not sufficient (and often impossible) to generate all possible syntactic analyses. Instead, the parser needs to rank competing analyses and select only the most likely ones. A statistical parser uses a probability model to perform this task. I propose a number of ways in which such probability models can be defined for CCG. The kinds of models developed in this dissertation, generative models over normal-form derivation trees, are particularly simple, and have the further property of restricting the set of syntactic analyses to those corresponding to a canonical derivation structure. This is important to guarantee that parsing can be done efficiently.

In order to achieve high parsing accuracy, a large corpus of annotated data is required to estimate the parameters of the probability models. Most existing wide-coverage statistical parsers use models of phrase-structure trees estimated from the Penn Treebank, a 1-million-word corpus of manually annotated sentences from the Wall Street Journal. This dissertation presents an algorithm which translates the phrase-structure analyses of the Penn Treebank to CCG derivations. The resulting corpus, CCGbank, is used to train and test the models proposed in this dissertation. Experimental results indicate that parsing accuracy (when evaluated according to a comparable metric, the recovery of unlabelled word-word dependency relations) is as high as that of standard Penn Treebank parsers which use similar modelling techniques.

Most existing wide-coverage statistical parsers use simple phrase-structure grammars whose syntactic analyses fail to capture long-range dependencies, and therefore do not correspond to directly interpretable semantic representations. By contrast, CCG is a grammar formalism in which semantic representations that include long-range dependencies can be built directly during the derivation of syntactic structure. These dependencies define the predicate-argument structure of a sentence, and are used for two purposes in this dissertation. First, the performance of the parser can be evaluated according to how well it recovers these dependencies; in contrast to purely syntactic evaluations, this yields a direct measure of how accurate the semantic interpretations returned by the parser are. Second, I propose a generative model that captures the local and non-local dependencies in the predicate-argument structure, and investigate the impact of modelling non-local in addition to local dependencies.
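As a concrete illustration of the CCG machinery the abstract builds on, here is a minimal sketch of the two basic combinatory rules (forward and backward application) and a toy generative score for a derivation. The category strings, rule probabilities and helper functions are assumptions for illustration only, not CCGbank data or the dissertation's actual models; real CCG categories also require bracket-aware parsing rather than the naive string split used here.

```python
# Toy CCG combinators; categories are plain strings like "S\\NP".
def forward_apply(left: str, right: str):
    """X/Y combined with Y yields X (forward application, >)."""
    if "/" in left:
        x, y = left.rsplit("/", 1)   # naive split; no bracket handling
        if y == right:
            return x
    return None

def backward_apply(left: str, right: str):
    """Y combined with X\\Y yields X (backward application, <)."""
    if "\\" in right:
        x, y = right.rsplit("\\", 1)
        if y == left:
            return x
    return None

# Derive "John sleeps": NP combines with S\NP to give S.
print(backward_apply("NP", "S\\NP"))  # -> "S"

# A generative model over a normal-form derivation tree assigns it the
# product of its rule and lexical probabilities; toy numbers here, where
# in practice the parameters are estimated from an annotated corpus.
p_rules = {("S", "NP", "S\\NP"): 0.9}
p_words = {("NP", "John"): 0.01, ("S\\NP", "sleeps"): 0.005}
p = (p_rules[("S", "NP", "S\\NP")]
     * p_words[("NP", "John")]
     * p_words[("S\\NP", "sleeps")])
print(f"toy derivation probability: {p:.2e}")  # -> 4.50e-05
```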
