Global ETD Search

171	Grundtonsstrategier vid tonlösa segment [Fundamental frequency strategies at voiceless segments] von Kartaschew, Filip January 2007 (has links) Prosody models, which can be used in speech synthesis among other things, are often based on analyses of speech consisting solely of voiced segments. Before a voiceless consonant, the fundamental frequency (F0) contours of vowel segments lack a possible continuation and also become shorter. This is usually adjusted for by truncating the F0 contour. Earlier studies have shown, in brief, that differences other than truncation can arise in a vowel's F0 contour depending on whether the following segment is voiced or voiceless. Taking these studies as a starting point, this degree project examines the F0 contour in Swedish sentences. The results of this study also show that different strategies are used in the F0 contour, and that truncation is not enough to explain what happens to the F0 contour in these contexts. In general, the results indicate that it appears important for the subjects to retain the information the F0 contour provides in the form of maximum and minimum values, and that falls and rises are preserved as far as possible. prosody modelling language technology computational linguistics fundamental frequency prosody speech technology phonetics Computational linguistics
172	Word-recognition computer program. January 1966 (has links) Contract no. DA36-039-AMC-03200(E). TK7855.M41 R43 no.452 Computational linguistics Word recognition
173	Multilingual information retrieval on the world wide web : the development of a Cantonese-Dagaare-English trilingual electronic lexicon Mok, Yuen-kwan, Sally. January 2006 (has links) Thesis (M. Phil.)--University of Hong Kong, 2006. / Title proper from title frame. Also available in printed format.
174	The phraseology of administrative French : a corpus-based study / Anderson, Wendy J. January 2006 (has links) Thesis (doctoral)--University of St. Andrews, 2002. / Description based on print version record. Includes bibliographical references (p. [207]-224) and index.
175	Towards the Development of an Automatic Diacritizer for the Persian Orthography based on the Xerox Finite State Transducer Nojoumian, Peyman 12 August 2011 (has links) Due to the lack of short vowels or diacritics in Persian orthography, many Natural Language Processing applications for this language, including information retrieval, machine translation, text-to-speech, and automatic speech recognition systems need to disambiguate the input first, in order to be able to do further processing. In machine translation, for example, the whole text should be correctly diacritized first so that the correct words, parts of speech and meanings are matched and retrieved from the lexicon. This is primarily because of Persian’s ambiguous orthography. In fact, the core engine of any Persian language processor should utilize a diacritizer and a lexical disambiguator. This dissertation describes the design and implementation of an automatic diacritizer for Persian based on the state-of-the-art Finite State Transducer technology developed at Xerox by Beesley & Karttunen (2003). The result of morphological analysis and generation on a test corpus is shown, including the insertion of diacritics. This study will also look at issues that are raised by phonological and semantic ambiguities as a result of short vowels in Persian being absent in the writing system. It suggests a hybrid model (rule-based & inductive) that is inspired by psycholinguistic experiments on the human mental lexicon for the disambiguation of heterophonic homographs in Persian using frequency and collocation information. A syntactic parser can be developed based on the proposed model to discover Ezafe (the linking short vowel /e/ within a noun phrase) or disambiguate homographs, but its implementation is left for future work. Persian Persian computational linguistics diacritizer morphological analyzer heterophonic homograph disambiguation
176	Hierarchical Bayesian Models of Verb Learning in Children Parisien, Christopher 11 January 2012 (has links) The productivity of language lies in the ability to generalize linguistic knowledge to new situations. To understand how children can learn to use language in novel, productive ways, we must investigate how children can find the right abstractions over their input, and how these abstractions can actually guide generalization. In this thesis, I present a series of hierarchical Bayesian models that provide an explicit computational account of how children can acquire and generalize highly abstract knowledge of the verb lexicon from the language around them. By applying the models to large, naturalistic corpora of child-directed speech, I show that these models capture key behaviours in child language development. These models offer the power to investigate developmental phenomena with a degree of breadth and realism unavailable in existing computational accounts of verb learning. By most accounts, children rely on strong regularities between form and meaning to help them acquire abstract verb knowledge. Using a token-level clustering model, I show that by attending to simple syntactic features of potential verb arguments in the input, children can acquire abstract representations of verb argument structure that can reasonably distinguish the senses of a highly polysemous verb. I develop a novel hierarchical model that acquires probabilistic representations of verb argument structure, while also acquiring classes of verbs with similar overall patterns of usage. In a simulation of verb learning within a broad, naturalistic context, I show how this abstract, probabilistic knowledge of alternations can be generalized to new verbs to support learning. 
I augment this verb class model to acquire associations between form and meaning in verb argument structure, and to generalize this knowledge appropriately via the syntactic and semantic aspects of verb alternations. The model captures children's ability to use the alternation pattern of a novel verb to infer aspects of the verb's meaning, and to use the meaning of a novel verb to predict the range of syntactic forms in which the verb may participate. These simulations also provide new predictions of children's linguistic development, emphasizing the value of this model as a useful framework to investigate verb learning in a complex linguistic environment. Computational linguistics Language acquisition Bayesian models 0800 0984
179	Lexical Affinities and Language Applications Terra, Egidio January 2004 (has links) Understanding interactions among words is fundamental for natural language applications. However, many statistical NLP methods still ignore this important characteristic of language. For example, information retrieval models still assume word independence. This work focuses on the creation of lexical affinity models and their applications to natural language problems. The thesis develops two approaches for computing lexical affinity. In the first, the co-occurrence frequency is calculated by point estimation. The second uses parametric models for co-occurrence distances. For the point estimation approach, we study several alternative methods for computing the degree of affinity by making use of point estimates for co-occurrence frequency. We propose two new point estimators for co-occurrence and evaluate the measures and the estimation procedures with synonym questions. In our evaluation, synonyms are checked directly by their co-occurrence and also by comparing them indirectly, using other lexical units as supporting evidence. For the parametric approach, we address the creation of lexical affinity models by using two parametric models for distance co-occurrence: an independence model and an affinity model. The independence model is based on the geometric distribution; the affinity model is based on the gamma distribution. Both fit the data by maximizing likelihood. Two measures of affinity are derived from these parametric models and applied to the synonym questions, resulting in the best absolute performance on these questions by a method not trained to the task. We also explore the use of lexical affinity in information retrieval tasks. A new method to score missing terms by using lexical affinities is proposed. In particular, we adapt two probabilistic scoring functions for information retrieval to allow all query terms to be scored. 
One is a document retrieval method and the other is a passage retrieval method. Our new method, using replacement terms, shows significant improvement over the original methods. Computer Science lexical affinity information retrieval computational linguistics
180	Using Rhetorical Figures and Shallow Attributes as a Metric of Intent in Text Strommer, Claus Walter January 2011 (has links) In this thesis we propose a novel metric of document intent evaluation based on the detection and classification of rhetorical figures. In doing so we dispel the notion that rhetoric lacks the structure and consistency necessary to be relevant to computational linguistics. We show how the combination of document attributes available through shallow parsing and rules extracted from the definitions of rhetorical figures produces a metric which can be used to reliably classify the intent of texts. This metric works as well on portions of a document as on entire documents. rhetoric classification natural language processing computational linguistics epanaphora Computer Science
