11 |
Speech in parts : understanding and modelling the semantic differences between words
Paradis, Michel January 2012
This thesis is about the problem of differences in lexical semantics, with a special emphasis on antonymy. It explores part-of-speech as a means to formalize semantic differences computationally, enhance the performance of computational linguistic tasks and aid in the understanding of lexical semantics more broadly. The thesis begins with an overview of how antonymy has been studied within experimental psychology and the major schools of theoretical linguistics, as well as a review of the semantic foundations of part-of-speech. It then turns to computational experiments that use part-of-speech as a primitive organizing principle, including a source categorization task and four automatic antonym identification experiments, which, with few exceptions, show results that either meet or exceed human performance. The final chapter presents a computational analysis of semantic markedness and the sequence preferences that antonyms often demonstrate when they co-occur. The theoretical accounts for these observations are evaluated on the basis of corpus statistics, and the thesis concludes with some general observations about the usefulness of computational linguistics in the analysis of semantic theories.
|
12 |
Linguistic and statistical extensions of data oriented parsing
Linardaki, Evita January 2006
No description available.
|
13 |
Replicating corpus linguistics : a corpus-driven investigation of lexical networks in text
Doyle, Paul G. January 2003
No description available.
|
14 |
Automatic conversion of natural language to 3D animation
Ma, Minhua January 2006
No description available.
|
15 |
Automatic acquisition of sense examples for word sense disambiguation : a bilingual approach
Wang, Xinglong January 2006
No description available.
|
16 |
Evaluating parallel corpora and translation quality for Chinese and English
Liu, Wei January 2016
Parallel bilingual corpora are important basic resources for statistical machine translation. Accurate alignment of textual elements (e.g. documents, paragraphs, sentences) in a parallel bilingual corpus is a crucial step for statistical machine translation. Rather than using sentence length, word co-occurrence, cognates, dictionaries or parts of speech, this thesis uses compression code lengths based on the Prediction by Partial Matching (PPM) compression algorithm to measure whether two sentences are aligned in parallel Chinese-English corpora. PPM has been found to be effective at estimating the entropy of text, which makes it a useful measure of whether the information conveyed by two texts is similar. Evaluation of the quality of sentence alignment is a way to measure the quality of a corpus. Evaluating parallel bilingual corpora is also an important process and usually the last step in parallel bilingual corpus creation. However, most statistics of parallel bilingual corpora are based on counts of characters, words, tokens, sentences or files. As there is a lack of advanced parallel bilingual corpus evaluation methods, this thesis adopts a new PPM-based method for parallel bilingual corpus evaluation. The method has been used to evaluate the quality of three existing parallel bilingual corpora: the DC Corpus, the Hong Kong Yearbook Corpus and the UN Corpus. The compression-based method has also been applied to the problem of the automatic creation of new parallel corpora. The quality of sentence alignment for automatically created parallel bilingual corpora is always lower than that of manually checked corpora. This thesis processed the Corpus of United Nations using the PPM-based metric and sought the best code length threshold value for automatically determining satisfactory or unsatisfactory sentence alignment in terms of translation quality in the corpus.
The thesis also collected bilingual textual elements from the web and improved their quality using a threshold code length ratio of 1.5. The approach has also been adapted for translation system evaluation by comparing the compression code lengths of back translations at the sentence level. Compared to Bilingual Evaluation Understudy (BLEU) scores, the back translation-based evaluation method was able to present differences at the sentence level between original sentences and their back translations more accurately when used to evaluate some common Chinese-English translation systems.
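The code length ratio idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: PPM is replaced here by a much simpler adaptive order-0 character model, and the function names, the alphabet sizes and the definition of the ratio (longer code length divided by shorter) are all assumptions made for illustration. Only the threshold value of 1.5 comes from the abstract.

```python
import math
from collections import Counter


def code_length_bits(text: str, alphabet_size: int) -> float:
    """Bits needed to encode `text` under an adaptive order-0 character model.

    Stand-in for a PPM model (assumption): each character is predicted from
    the counts seen so far, with add-one smoothing over `alphabet_size` symbols.
    """
    counts = Counter()
    bits = 0.0
    for seen, ch in enumerate(text):
        probability = (counts[ch] + 1) / (seen + alphabet_size)
        bits -= math.log2(probability)
        counts[ch] += 1
    return bits


def code_length_ratio(bits_a: float, bits_b: float) -> float:
    """Ratio of the longer code length to the shorter one (assumed definition)."""
    longer, shorter = max(bits_a, bits_b), min(bits_a, bits_b)
    return longer / shorter if shorter > 0 else math.inf


def is_satisfactory(zh: str, en: str, threshold: float = 1.5,
                    zh_alphabet: int = 8000, en_alphabet: int = 128) -> bool:
    """Accept a sentence pair when its code length ratio is within the threshold.

    The alphabet sizes are hypothetical placeholders, not values from the thesis.
    """
    ratio = code_length_ratio(code_length_bits(zh, zh_alphabet),
                              code_length_bits(en, en_alphabet))
    return ratio <= threshold
```

The intuition is that two sentences carrying the same information should need a similar number of bits under a good model of each language, so a pair whose ratio is far from 1 is a candidate misalignment.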
|
17 |
Determining the types of temporal relations in discourse
Derczynski, Leon January 2013
The ability to describe the order of events is crucial for effective communication. It is used to describe causality, to plan and to relay stories. Event ordering can be thought of in terms of binary temporal relations which hold between event pairs or pairs of times and events. Complex event structures can be modelled as compositions of these pairs. Temporal relations can be expressed linguistically in a variety of ways. For example, one may use tense to describe the relation between the time of speaking and other events, or use a temporal conjunction to temporally situate an event relative to a time. In the area of automatic temporal information extraction, determining the type of temporal relation between a given pair of times or events is currently the hardest task. Very sophisticated approaches have yielded only small improvements over initial attempts. Rather than develop a generic approach to event ordering through relation typing, a failure analysis informs grouping and segmentation of these temporal relations according to the type of information that can be used to temporally relate them. Two major sources of information are identified that provide typing information for two segments: relations explicitly described by a signal word, and relations involving a shift of tense and aspect. Following this, we investigate automatic temporal relation typing in both these segments, presenting results, introducing new methods and generating a set of new language resources.
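The notion of binary temporal relations that compose into larger event structures can be sketched as follows. This is an illustrative toy, not the thesis's method: the three-relation inventory, the composition rule and the names `Rel`, `compose` and `closure` are assumptions, and real annotation schemes use richer relation sets.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Rel(Enum):
    """A deliberately small, assumed inventory of temporal relation types."""
    BEFORE = "before"
    AFTER = "after"
    SIMULTANEOUS = "simultaneous"


def compose(r1: Rel, r2: Rel) -> Optional[Rel]:
    """Relation between A and C given A r1 B and B r2 C, or None if undetermined."""
    if r1 is Rel.SIMULTANEOUS:
        return r2
    if r2 is Rel.SIMULTANEOUS:
        return r1
    # before+before gives before, after+after gives after; mixed pairs are undetermined.
    return r1 if r1 is r2 else None


def closure(relations: Dict[Tuple[str, str], Rel]) -> Dict[Tuple[str, str], Rel]:
    """Repeatedly compose known pairs until no new relation can be inferred."""
    known = dict(relations)
    changed = True
    while changed:
        changed = False
        for (a, b), r1 in list(known.items()):
            for (c, d), r2 in list(known.items()):
                if b == c and a != d and (a, d) not in known:
                    inferred = compose(r1, r2)
                    if inferred is not None:
                        known[(a, d)] = inferred
                        changed = True
    return known
```

For example, given that a hypothetical event `e1` is before `e2` and `e2` is before `e3`, `closure` adds the pair `("e1", "e3")` with the relation `Rel.BEFORE`.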
|
18 |
Lexically specified derivational control in combinatory categorial grammar
Baldridge, Jason January 2002
This dissertation elaborates several refinements to the Combinatory Categorial Grammar (CCG) framework which are motivated by phenomena in parametrically diverse languages such as English, Dutch, Tagalog, Toba Batak and Turkish. I present Multi-Modal Combinatory Categorial Grammar, a formulation of CCG which incorporates devices and category constructors from related categorial frameworks, and demonstrate the effectiveness of these modifications both for providing parsimonious linguistic analyses and for improving the representation of the lexicon and computational processing. Altogether, this dissertation provides many formal, linguistic, and computational justifications for its central thesis: that an explanatory theory of natural language grammar can be based on a categorial grammar formalism which allows cross-linguistic variation only in the lexicon and has computationally attractive properties.
|
19 |
Modelling the transition to complex, culturally transmitted communication
Ritchie, Graham R. S. January 2009
Human language is undoubtedly one of the most complex and powerful communication systems to have evolved on Earth. Study of the evolution of this behaviour is made difficult by the lack of comparable communication systems elsewhere in the animal kingdom, and by the fact that language leaves little trace in the fossil record. The human language faculty can, however, be decomposed into several component abilities, and a proposed evolutionary explanation of the whole must address (at least) the evolution of each of these components. Some of these features may also be found in other species, and thus permit use of the powerful comparative method. This thesis addresses the evolution of two such component features of human language: complex vocal signalling and the cultural transmission of these vocal signals. I argue that these features make a significant contribution to the nature of human language as we observe it today, and so a better understanding of the evolutionary processes that gave rise to them will contribute to the study of the evolution of language. This thesis addresses the evolution of these features firstly by identifying other communication systems found in nature that display them, focusing in particular on the song of the oscine passerines (songbirds). Bird song is chosen as a model system because of the wealth of empirical data on nearly all aspects of the behaviour and the variety of song behaviour found in this group. There also appear to be some striking similarities in the development of language and song. I argue that a better understanding of the evolution of complex signalling and cultural transmission in songbirds and other species will provide useful insight into the evolution of these features in language. This thesis presents a series of related formal models that investigate several issues in the evolution of these features.
I firstly present a simple formal model of bird song acquisition and use this in a computational model of evolution to investigate some ecological conditions under which vocal behaviour can become more or less reliant on cultural transmission. I then present a pertinent case study of two closely related songbird sub-species and develop a computational model that demonstrates that domestication, or a similar shift in the fitness landscape, may play a surprising role in the evolution of signal complexity (in some sense) and increased vocal plasticity. Finally, I present several models that investigate the plausibility and consistency of the ‘developmental stress hypothesis’, an important hypothesis drawn from the biological literature that proposes that song learning and song complexity may serve as a sexually selected mate quality indicator mechanism. These models provide the first theoretical support for this important but complex hypothesis and identify a number of relevant parameters that may affect the evolution of such a system.
|
20 |
Statistical recognition of text genre and author in unrestricted Modern Greek texts
Σταματάτος, Ευστάθιος 22 September 2009
- / -
|