Global ETD Search

41	Biak morphosyntax Mofu, Suriel Semuel January 2008 This thesis is a general description of the morphology and syntax of the Biak language. The Biak language belongs to the West New Guinea subgroup of the Austronesian language family and is spoken by around 50,000 to 70,000 speakers in West Papua in the northern part of the Geelvink Bay. The thesis consists of 7 main chapters that cover demographic and ethnographic information of the language, morphology, grammatical categories, basic constituent order, noun compounding and denominalization, relative clauses, and predicate nominal constructions. The main findings of the thesis are: • The Biak language is predominantly a head-initial language. • The Biak language has morphological variation from monomorphemic to polymorphemic with the polymorphemic being the dominant pattern in the language. • Inflectional patterns on verbal and prepositional predicates, demonstratives, and possessive pronouns are divided into two patterns: the consonantal pattern and the vocal pattern. • Biak has alienable and inalienable nouns. Alienability in Biak is a syntactic distinction, not exactly corresponding to the semantic distinction. • The basic constituent order is SVO or AVP. Variations occur with predicate nominal (OV) and internally headed relative clause which uses SOV pattern. • Three types of relative clauses were identified: (i) Postnominal relative clause; (ii) Headless relative clause; and (iii) Internally headed relative clause. The Biak language allows stacked and nested relative clauses. Two kinds of predicate nominal constructions were identified: (i) copular clitics (clitic -ri, -s-, and free pronoun clitics) and (ii) copular verbs -iri and iso. The two kinds of predicate nominal constructions can be distinguished syntactically. 410
42	Standardisation and variation in Latin orthography and morphology (100 BC - AD 100) Nikitina, Veronika January 2015 The period 100 BC – AD 100 is often seen by scholars as the time when the 'standard' form of educated Latin was established. Standardisation, according to some, was the defining process for the fixing of written language and written norms. Once established, these written norms, we are led to believe, remained unchanged for the rest of Antiquity. This study addresses this alleged standardisation of Latin in 100 BC – AD 100 by studying variations in spelling and morphology. Elimination of variation is a central part of establishing a standard language, while continuing variation characterises lack of standardisation. By studying variation in a diachronic perspective, therefore, we are able to assess the evidence for standardisation or lack thereof. Complete standardisation can be achieved mainly in spelling: therefore, the study of spelling is central for determining the existence of any standardisation movement. The first part of the thesis is dedicated to studying spelling variation in high-register formal inscriptions, where standardisation ought to be most evident. We discuss variation of the type maximus/maxumus, variant spellings ei and i for /ī/ and variation between assimilated and non-assimilated spelling of prefixes. A separate chapter addresses the spelling reform of Claudius. The second part of the thesis focuses on cases of morphological variation in literary and non-literary texts (variation between quis and quibus in the dat./abl. pl. and variation between active and deponent forms of verbs). The study of these cases of variation should add to our knowledge of language development in this period and provide a basis on which to begin a reassessment of standardisation in Latin.
Language attitudes of literary authors and authors of non-literary texts, which are relevant for the question of standardisation, will also be considered. My overall conclusion is that it is easy to exaggerate the importance of any standardising, and that it is important not to mix up uncontrolled linguistic change, which is a phenomenon of any language, and change, or fixing, that is the result of the conscious and deliberate efforts of language purists.
43	The phonetics and phonology of late bilingual prosodic acquisition : a cross-linguistic investigation Graham, Calbert Rechardo January 2014 No description available. 410
44	Evaluating distributional models of compositional semantics Batchkarov, Miroslav Manov January 2016 Distributional models (DMs) are a family of unsupervised algorithms that represent the meaning of words as vectors. They have been shown to capture interesting aspects of semantics. Recent work has sought to compose word vectors in order to model phrases and sentences. The most commonly used measure of a compositional DM's performance to date has been the degree to which it agrees with human-provided phrase similarity scores. The contributions of this thesis are three-fold. First, I argue that existing intrinsic evaluations are unreliable as they make use of small and subjective gold-standard data sets and assume a notion of similarity that is independent of a particular application. Therefore, they do not necessarily measure how well a model performs in practice. I study four commonly used intrinsic datasets and demonstrate that all of them exhibit undesirable properties. Second, I propose a novel framework within which to compare word- or phrase-level DMs in terms of their ability to support document classification. My approach couples a classifier to a DM and provides a setting where classification performance is sensitive to the quality of the DM. Third, I present an empirical evaluation of several methods for building word representations and composing them within my framework. I find that the determining factor in building word representations is data quality rather than quantity; in some cases only a small amount of unlabelled data is required to reach peak performance. Neural algorithms for building single-word representations perform better than counting-based ones regardless of what composition is used, but simple composition algorithms can outperform more sophisticated competitors.
Finally, I introduce a new algorithm for improving the quality of distributional thesauri using information from repeated runs of the same non-deterministic algorithm. 006.3
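The idea of composing word vectors into a phrase vector and comparing phrases by similarity can be illustrated with a minimal sketch. It assumes additive composition, a simple and widely used method; the abstract does not say which composition methods the thesis evaluates. The three-dimensional vectors are invented toy values, and the names `compose_add` and `cosine` are hypothetical.

```python
import math


def compose_add(vectors):
    """Compose word vectors into one phrase vector by element-wise addition."""
    return [sum(dims) for dims in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Invented toy vectors; a real distributional model would learn these from text.
toy_vectors = {
    "red": [1.0, 0.0, 0.0],
    "car": [0.0, 1.0, 0.0],
    "crimson": [0.8, 0.1, 0.1],
    "automobile": [0.1, 0.8, 0.1],
    "banana": [0.0, 0.0, 1.0],
}

red_car = compose_add([toy_vectors["red"], toy_vectors["car"]])
crimson_automobile = compose_add(
    [toy_vectors["crimson"], toy_vectors["automobile"]]
)

# The two near-synonymous phrases should be closer to each other
# than either is to an unrelated word.
print(cosine(red_car, crimson_automobile))
print(cosine(red_car, toy_vectors["banana"]))
```

An intrinsic evaluation of the kind the abstract criticises would compare such similarity scores with human judgements; the framework proposed in the thesis instead feeds the representations to a document classifier.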
45	Graph-based approaches to word sense induction Hope, David Richard January 2015 This thesis is a study of Word Sense Induction (WSI), the Natural Language Processing (NLP) task of automatically discovering word meanings from text. WSI is an open problem in NLP whose solution would be of considerable benefit to many other NLP tasks. It has, however, been studied by relatively few NLP researchers and often in set ways. Scope therefore exists to apply novel methods to the problem, methods that may improve upon those previously applied. This thesis applies a graph-theoretic approach to WSI. In this approach, word senses are identified by finding particular types of subgraphs in word co-occurrence graphs. A number of original methods for constructing, analysing, and partitioning graphs are introduced, with these methods then incorporated into graph-based WSI systems. These systems are then shown, in a variety of evaluation scenarios, to return results that are comparable to those of the current best performing WSI systems. The main contributions of the thesis are a novel parameter-free soft clustering algorithm that runs in time linear in the number of edges in the input graph, and novel generalisations of the clustering coefficient (a measure of vertex cohesion in graphs) to the weighted case. Further contributions of the thesis include: a review of graph-based WSI systems that have been proposed in the literature; analysis of the methodologies applied in these systems; analysis of the metrics used to evaluate WSI systems; and empirical evidence to verify the usefulness of each novel method introduced in the thesis for inducing word senses. 004
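The clustering coefficient mentioned in the abstract can be made concrete with a short sketch. This computes the standard unweighted local clustering coefficient, not the weighted generalisations the thesis contributes, which the abstract does not specify. The co-occurrence graph is an invented example, and the function name is hypothetical.

```python
from itertools import combinations


def local_clustering_coefficient(graph, vertex):
    """Fraction of pairs of a vertex's neighbours that are themselves linked.

    `graph` maps each vertex to the set of its neighbours (undirected).
    """
    neighbours = graph[vertex]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to connect
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


# Invented co-occurrence graph around the ambiguous word "bank".
cooccurrence = {
    "bank": {"money", "loan", "river", "water"},
    "money": {"bank", "loan"},
    "loan": {"bank", "money"},
    "river": {"bank", "water"},
    "water": {"bank", "river"},
}

for word in sorted(cooccurrence):
    print(word, local_clustering_coefficient(cooccurrence, word))
```

In this toy graph the neighbours of "bank" split into two tightly linked groups (financial and geographical), so "bank" has a low coefficient while its neighbours have high ones; graph-based WSI systems look for this kind of structure when separating senses.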
46	Paraphrase identification using knowledge-lean techniques Eyecioglu Ozmutlu, Asli January 2016 This research addresses the problem of identification of sentential paraphrases; that is, the ability of an estimator to predict well whether two sentential text fragments are paraphrases. The paraphrase identification task has practical importance in the Natural Language Processing (NLP) community because of the need to deal with the pervasive problem of linguistic variation. Accurate methods for identifying paraphrases should help to improve the performance of NLP systems that require language understanding. This includes key applications such as machine translation, information retrieval and question answering amongst others. Over the course of the last decade, a growing body of research has been conducted on paraphrase identification and it has become a distinct area of NLP. Our objective is to investigate whether techniques concentrating on automated understanding of text that require fewer resources may achieve results comparable to methods employing more sophisticated NLP processing tools and other resources. These techniques, which we call “knowledge-lean”, range from simple, shallow overlap methods based on lexical items or n-grams through to more sophisticated methods that employ automatically generated distributional thesauri. The work begins by focusing on techniques that exploit lexical overlap and text-based statistical techniques that are much less in need of NLP tools. We investigate the question “To what extent can these methods be used for the purpose of a paraphrase identification task?” On the two gold-standard datasets, we obtained competitive results on the Microsoft Research Paraphrase Corpus (MSRPC) and reached state-of-the-art results on the Twitter Paraphrase Corpus, using only n-gram overlap features in conjunction with support vector machines (SVMs).
These techniques do not require any language-specific tools or external resources and appear to perform well without the need to normalise colloquial language such as that found on Twitter. It was natural to extend the scope of the research and to consider experimenting on another language that is poor in resources. The scarcity of available paraphrase data led us to construct our own corpus; we have constructed a paraphrase corpus in Turkish. This corpus is relatively small but provides a representative collection, including a variety of texts. While there is still debate as to whether a binary or fine-grained judgement satisfies a paraphrase corpus, we chose to provide data for a sentential textual similarity task by agreeing on fine-grained scoring, knowing that this could be converted to binary scoring, but not the other way around. The correlation between the results from different corpora is promising. Therefore, it can be surmised that languages poor in resources can benefit from knowledge-lean techniques. The investigation of knowledge-lean techniques was then extended to techniques that use distributional statistical features of text by representing each word as a vector (word2vec). While recent research focuses on larger fragments of text with word2vec, such as phrases, sentences and even paragraphs, a new approach is presented by introducing vectors of character n-grams that carry the same attributes as word vectors. The proposed method has the ability to capture syntactic relations as well as semantic relations without semantic knowledge. This is shown to be competitive on Twitter compared to more sophisticated methods. 006.3
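The knowledge-lean n-gram overlap features described in the abstract might be computed along the following lines. This is a sketch under stated assumptions: it uses character n-grams with the Jaccard coefficient as the overlap measure, whereas the abstract does not name the exact measures or n-gram sizes used. All function names and example sentences are hypothetical, and the SVM classification step is omitted.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lower-cased, whitespace-normalised string."""
    normalised = " ".join(text.lower().split())
    return {normalised[i:i + n] for i in range(len(normalised) - n + 1)}


def jaccard_overlap(a, b, n=3):
    """Jaccard coefficient between the character n-gram sets of two sentences."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0  # both strings shorter than n characters
    return len(grams_a & grams_b) / len(union)


def overlap_features(a, b, sizes=(2, 3, 4)):
    """One overlap score per n-gram size; a feature vector for a classifier."""
    return [jaccard_overlap(a, b, n) for n in sizes]


# Invented sentence pairs for illustration.
print(overlap_features("The cat sat on the mat", "A cat sat on the mat"))
print(overlap_features("The cat sat on the mat", "Stock prices fell sharply"))
```

Features of this kind need no tokeniser, parser or lexicon, which is why they transfer to noisy text and to languages with few NLP resources; in a full system each pair's feature vector would be passed to a classifier such as an SVM.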
47	Aspects of language shift in a Hong Kong Chiu Chow family Cheung, Y. Y., Vivian. January 2006 Thesis (M.A.)--University of Hong Kong, 2006. / Title proper from title frame. Also available in printed format.
48	Acquisition of wh-movement in German / Bursey, Angelina. January 2004 Thesis (M.A.)--Memorial University of Newfoundland, 2005. / Bibliography: leaves 91-95.
49	Le Français régional : mythe ou réalité? le cas de la Provence / Ferrat, Virginie. January 2000 Thesis (doctoral)--Nice, 1999.
50	An examination of the tone system of Fur and its function in grammar Noel, Georgianna. January 2008 Thesis (M.A.)--University of Texas at Arlington, 2008.
