Clitic Combinations in Spanish: Syntax, Processing and Acquisition
Alba de la Fuente, Anahi 21 August 2012
Clitic clusters, and the restrictions that surface when two or more clitics are combined, have long intrigued linguists and, as such, clitic phenomena are at the core of an ever-growing body of research in linguistic theory. However, three aspects remain largely unexplored when it comes to clitic cluster constraints, namely the evolution of these restrictions through time, the perception and processing of different clitic combinations, both acceptable and unacceptable, by native speakers, and the acquisition of such combinations by non-native speakers. This dissertation, which focuses on 1st and 2nd person clitic clusters in Spanish, aims to shed new light on clitic phenomena with a new analysis and new data from all these perspectives. Specifically, I study the effects that case and marked features have on Spanish clitic combinations, both synchronically and diachronically. In addition, I explore the effects of clitic combination restrictions in language processing and analyze the learnability issues derived from such restrictions in three groups of speakers of Spanish as a second language whose L1s are English, French and Romanian, respectively. At a particular level, this dissertation is a study of clitic cluster constraints from different perspectives, both traditional and new, namely linguistic theory, diachrony, language processing and language acquisition. At a general level, it constitutes an attempt to explore the ways in which linguistic theory can guide applied research and, conversely, the ways in which experimental data may contribute to linguistic theory.
Framework to manage labels for e-assessment of diagrams
Jayal, Ambikesh January 2010
Automatic marking of coursework has many advantages in terms of resource benefits and consistency. Diagrams are quite common in many domains, including computer science, but marking them automatically is a challenging task. There has been previous research to accomplish this, but results to date have been limited. Much of the meaning of a diagram is contained in the labels, and in order to mark the diagrams automatically the labels need to be understood. However, the choice of labels used by students in a diagram is largely unrestricted, and the diversity of labels can be a problem when matching. This thesis has measured the extent of the diagram label matching problem and proposed and evaluated a configurable, extensible framework to solve it. A new hybrid syntax matching algorithm has also been proposed and evaluated. This hybrid approach is based on multiple existing syntax algorithms. Experiments were conducted on a corpus of coursework which was large scale, realistic and representative of UK HEI students. The results show that diagram label matching is a substantial problem and cannot be easily avoided for the e-assessment of diagrams. The results also show that the hybrid approach was better than the three existing syntax algorithms, and that the framework has been effective but only to a limited extent and needs to be further refined for the semantic stage. The framework proposed in this thesis is configurable and extensible. It can be extended to include other algorithms and sets of parameters. The framework uses configuration XML, dynamic loading of classes and two design patterns, namely the strategy design pattern and the facade design pattern. A software prototype implementation of the framework has been developed in order to evaluate it. Finally, this thesis also contributes the corpus of coursework and an open source software implementation of the proposed framework.
Since the framework is configurable and extensible, its software implementation can be extended and used by the research community.
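As an illustration only, the combination of a strategy pattern, a facade and a hybrid syntactic score described in this abstract might be sketched as follows. Everything here is an assumption made for the sake of the example: the abstract does not name the three syntax algorithms, so the strategies (`exact`, `edit_similarity`, `token_overlap`), the rule that the hybrid score is the maximum of the individual scores, the `LabelMatcher` class and the 0.8 threshold are all hypothetical. A plain list of strategy names stands in for the XML configuration, and a dictionary lookup stands in for dynamic class loading.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Sequence

# A strategy is any function that scores two labels in the range [0, 1].
Strategy = Callable[[str, str], float]


def normalise(label: str) -> str:
    """Lower-case a label and turn underscores/hyphens into single spaces."""
    return " ".join(label.replace("_", " ").replace("-", " ").lower().split())


def exact(a: str, b: str) -> float:
    """1.0 if the normalised labels are identical, otherwise 0.0."""
    return 1.0 if normalise(a) == normalise(b) else 0.0


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity, tolerant of small spelling slips."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the word sets, tolerant of word reordering."""
    ta, tb = set(normalise(a).split()), set(normalise(b).split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Hypothetical registry: new algorithms are added here without touching
# the facade below (a stand-in for dynamically loaded classes).
STRATEGIES: Dict[str, Strategy] = {
    "exact": exact,
    "edit_similarity": edit_similarity,
    "token_overlap": token_overlap,
}


class LabelMatcher:
    """Facade: callers see one matches() call, not the individual strategies."""

    def __init__(self, strategy_names: Sequence[str], threshold: float = 0.8):
        if not strategy_names:
            raise ValueError("at least one strategy is required")
        self.strategies = [STRATEGIES[name] for name in strategy_names]
        self.threshold = threshold

    def score(self, student_label: str, model_label: str) -> float:
        # Assumed hybrid rule: take the best score any strategy gives.
        return max(s(student_label, model_label) for s in self.strategies)

    def matches(self, student_label: str, model_label: str) -> bool:
        return self.score(student_label, model_label) >= self.threshold


matcher = LabelMatcher(["exact", "edit_similarity", "token_overlap"])
print(matcher.matches("Customer", "Custmer"))   # misspelled student label
print(matcher.matches("Customer", "Invoice"))   # unrelated labels
```

Taking the maximum lets each strategy cover a different kind of label variation (spelling slips versus reordered words); a weighted combination would be an equally plausible design, and nothing in the abstract says which one the thesis uses.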
Syntactic relations in San Martin Quechua
Howkins, Angela January 1977
Linguistic description has been described as "the application of a particular linguistic theory to a selected field of linguistic phenomena". The thesis presented here offers a partial application of Axiomatic Functionalism (partial because its concern is with syntax only) to data collected on the San Martín dialect of Quechua. Proportionate to the whole body of Quechua studies, little has been produced on the syntax of any Quechua dialect. Most syntactic studies, like the large majority of phonological and morphological studies, use American methodology, whether based on Bloomfieldian linguistics or on that of Chomsky. The present methodology stands diametrically opposed to both schools of American linguistics cited above, and as a result introduces a fresh approach to the study of the syntactic aspect of Quechua. With Axiomatic Functionalism, a new way of looking at Quechua grammar is presented, and thus much of what is accepted as "fact" is reappraised. For this reason, while the concern of the thesis is with producing a description of syntactic relations in San Martín Quechua under the terms of Axiomatic Functionalism, reference is made to descriptions of other Quechua dialects, most notably where the application of Axiomatic Functionalism produces statements on certain phenomena which are quite different from statements made on equivalent phenomena in other dialects using a different linguistic theory. Moreover, Axiomatic Functionalism is a deductive theory, and so statements regarding the data contained in the description are not statements of "fact", but are hypotheses which may stand as valid hypotheses regarding the data unless they can be refuted.
Given that the theoretical base on which the description rests is different from that used in other descriptions of Quechua dialects, and so that the hypotheses made regarding syntactic relations in San Martín Quechua may be tested, Part I of the thesis is given over to the theoretical side of the work: to explaining the relation between theory and description in Chapter I, to giving brief explications of those notions in the theory which have particular relevance for a syntactic description in Chapter II, and to noting some of the limits set to the selection of the data for description in Chapter III. The axioms and definitions of the theory are given in Appendix A. Part II of the thesis, which is in six chapters, deals with the description proper. Structures which may stand as sentences are established and analysed into their constituent structures, the relations between each constituent being ascertained. Analysis is carried through to the stage where no constituents analysable in syntactic terms are left.
Wide-coverage parsing for Turkish
Çakici, Ruket January 2009
Wide-coverage parsing is an area that attracts much attention in natural language processing research. This is because it is the first step to many other applications in natural language understanding, such as question answering. Supervised learning using human-labelled data is currently the best performing method. Therefore, there is great demand for annotated data. However, human annotation is very expensive, and the amount of annotated data is always much less than is needed to train well-performing parsers. This is the motivation behind making the best use of the data available. Turkish presents a challenge both because the amount of syntactically annotated Turkish data is relatively small and because Turkish is highly agglutinative, hence unusually sparse at the whole-word level. The METU-Sabancı Treebank is a dependency treebank of 5620 sentences with surface dependency relations and morphological analyses for words. We show that including even the crudest forms of morphological information extracted from the data boosts the performance of both generative and discriminative parsers, contrary to received opinion concerning English. We induce word-based and morpheme-based CCG grammars from the Turkish dependency treebank. We use these grammars to train a state-of-the-art CCG parser that predicts long-distance dependencies in addition to the ones that other parsers are capable of predicting. We also use the correct CCG categories as simple features in a graph-based dependency parser and show that this improves the parsing results. We show that a morpheme-based CCG lexicon for Turkish is able to solve many problems, such as conflicts of semantic scope, recovering long-range dependencies, and obtaining smoother statistics from the models. CCG handles linguistic phenomena, i.e. local and long-range dependencies, more naturally and effectively than other linguistic theories while potentially supporting semantic interpretation in parallel.
Using morphological information and a morpheme-cluster based lexicon improves the performance both quantitatively and qualitatively for Turkish. We also provide an improved version of the treebank which will be released by kind permission of METU and Sabancı.
Evidence-based spatial intervention for regeneration of deteriorating urban areas : a case of study from Tehran, Iran
Rismanchian, Omid January 2012
Throughout the urban development process over the last seven decades in Tehran, the capital city of Iran, many self-generated neighbourhoods have developed in which the majority of the residents are low-income families. On the one hand, the main spatial attribute of these deprived neighbourhoods is spatial isolation from the surrounding, more affluent areas, which is accompanied by inadequate urban infrastructure and a lack of accessibility and permeability. On the other hand, the Tehran City Revitalisation Organisation - the governmental sector which is in charge of the deprived areas - is incapable of conducting urban regenerations without investment from the private sector, and is seeking methods to create ‘socio-economic stimulant zones’ to attract private sector participation in regeneration programmes. In this regard, this research investigates the notion of ‘spatial isolation’, which in turn causes socio-economic isolation, as highlighted in the literature. The research suggests that in order to develop feasible regeneration programmes, which can meet the interests of both people and government and release the deprived area from isolation both spatially and socio-economically, the regeneration plans should focus on public open space developments as ‘socio-economic stimulant zones’. With regard to this idea, the research highlights the street as a ‘social arena’ (not arteries or thoroughfares) as the type of public open space whose development could not only release the deprived areas from spatial isolation, but could also direct more pedestrian movement to and through the deprived neighbourhoods, making more opportunities for the creation of socio-economic interactions. In this respect, the theory of ‘natural movement’ and theories and literature of ‘integrated public open spaces’ form the theoretical framework of the research to support this idea.
For further investigation, two case studies, one as the deprived area and one as the control area, have been chosen, and the spatial pattern of the city and the two cases have been analysed in regard to the notion of ‘spatial isolation’ through Space Syntax using Depthmap software and GIS. Also, the correlation between the distribution pattern of commercial land uses and syntactic measures across the city of Tehran is investigated to identify the potential streets in which to create commercial opportunities. Afterwards, in order to study the street life and the variety of activities the streets can afford, a few locally integrated streets in the deprived case have been chosen. At this stage, nineteen behaviours have been observed and classified into five major classes (necessary, social, optional, hazardous, and occasional activities), and their correlation with syntactic measures is studied. Moreover, the methods of developing a route filtering system and a transformability index for identifying the most suitable streets for the creation of a pedestrian friendly network are discussed, using an example of a deprived area, integrating it with the surrounding urban fabric to create the ‘socio-economic stimulant zones’. The results show that by identifying the underlying spatial pattern of the urban fabric, it is possible to release the deprived areas from their spatial isolation through developing a street network without causing urban fragmentation. This approach could also form a cost-effective basis for developing a pedestrian friendly street network as one of the ‘socio-economic stimulant zones’ which the Tehran City Revitalisation Organisation is looking for: the type of streets that not only support the necessary activities and transportation, but could also facilitate socio-economic interaction.
Anaphoric preferences of null and overt subjects in Italian and Spanish : a cross-linguistic comparison
Filiaci, Francesca January 2011
This thesis focuses on the cross-linguistic differences between Italian and Spanish regarding the pragmatic restrictions on the resolution of null and overt subject pronouns (NS and OSP). It also tries to identify possible links between such cross-linguistic differences and morpho-syntactic differences at the level of the verbal morphology of the two languages. Spanish and Italian are typologically related and morpho-syntactically similar and have been assumed to instantiate the same setting of the NS parameter with respect to not only its syntactic licensing conditions, but also the pragmatic constraints determining the distribution of null and overt subject pronouns, and this assumption has had important implications for cross-linguistic research. The first aim of this study was to test directly, for the first time, the assumption about the equivalence of Italian and Spanish; in order to do so, I ran a series of self-paced reading experiments using the same materials translated into each language, so that the results were directly comparable. The experiments were based on Carminati’s (2002) study on antecedent preferences for Italian NSs and OSPs in intra-sentential anaphora, testing the Position of Antecedent Strategy. The results suggest that while in Italian there is a strict division of labour between NS and OSP (confirming Carminati’s findings), this division is not as clear-cut in Spanish. More precisely, while Italian personal pronouns unambiguously signal a switch in subject reference, the association between OSPs and switch reference seems to be much weaker in Spanish. These results, which are interpreted in terms of Cardinaletti and Starke’s (1999) cross-linguistic typology of deficient pronouns, highlight an asymmetry between the strength of NS and OSP biases in Spanish that could not have emerged through the traditional methodology, based on corpus analysis, used by the numerous variationist studies on the subject.
A subsequent pair of experiments tested the hypothesis that the cross-linguistic differences attested might be related to the relative syncretism of Spanish verbal morphology compared to that of Italian with regard to the unambiguous expression of person features on the verbal head. The results provided only weak support for the hypothesis, although they did confirm the presence of the cross-linguistic differences in the processing and resolution of anaphoric NS and OSP dependencies revealed by the previous experiments.
The Design and Implementation of a Prolog Parser Using Javacc
Gupta, Pankaj 08 1900
Operatorless Prolog text is LL(1) in nature and any standard LL parser generator tool can be used to parse it. However, Prolog text that conforms to the ISO Prolog standard allows the definition of dynamic operators. Since Prolog operators can be defined at run-time, operator symbols are not present in the grammar rules of the language. Unless the parser generator allows for some flexibility in the specification of the grammar rules, it is very difficult to generate a parser for such text. In this thesis we discuss the existing parsing methods and their modified versions to parse languages with dynamic operator capabilities. Implementation details of a parser that uses Javacc as a parser generator tool to parse standard Prolog text are provided. The output of the parser is an “Abstract Syntax Tree” that reflects the correct precedence and associativity rules among the various operators (static and dynamic) of the language. Empirical results are provided that show that a Prolog parser generated by a parser generator like Javacc is comparable in efficiency to a hand-coded parser.
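To make the difficulty with dynamic operators concrete, here is a minimal sketch of a precedence-driven parser whose operator table can change at run-time. It is not the Javacc-based method of the thesis, which the abstract does not detail; it is a generic hand-written technique shown under simplifying assumptions: only infix operators (types `xfx`, `xfy`, `yfx`) are handled, tokens must be separated by whitespace, and the class name `OpParser` and its methods are hypothetical. The priorities used in the usage lines (500 for `+` and `-`, 400 for `*`) follow the standard Prolog operator table.

```python
class OpParser:
    """Infix-only operator parser with a run-time operator table."""

    def __init__(self):
        self.ops = {}  # symbol -> (priority, type)

    def add_op(self, priority, typ, symbol):
        """Rough analogue of Prolog's op/3 for infix operators."""
        if typ not in ("xfx", "xfy", "yfx"):
            raise ValueError("only infix operator types are supported")
        if not 1 <= priority <= 1200:
            raise ValueError("priority must be between 1 and 1200")
        self.ops[symbol] = (priority, typ)

    def parse(self, text):
        """Return a nested (operator, left, right) tuple for the text."""
        self.toks, self.pos = text.split(), 0
        term, _ = self._expr(1200)
        if self.pos != len(self.toks):
            raise SyntaxError(f"unexpected token {self.toks[self.pos]!r}")
        return term

    def _peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def _primary(self):
        tok = self._peek()
        if tok is None or tok == ")" or tok in self.ops:
            raise SyntaxError(f"operand expected, got {tok!r}")
        self.pos += 1
        if tok == "(":
            term, _ = self._expr(1200)
            if self._peek() != ")":
                raise SyntaxError("missing ')'")
            self.pos += 1
            return term  # a bracketed term counts as priority 0
        return tok

    def _expr(self, max_prec):
        left, left_prec = self._primary(), 0
        while True:
            tok = self._peek()
            if tok not in self.ops:
                break
            prec, typ = self.ops[tok]
            # 'y' arguments may have the operator's own priority,
            # 'x' arguments must have a strictly lower one.
            left_max = prec if typ == "yfx" else prec - 1
            right_max = prec if typ == "xfy" else prec - 1
            if prec > max_prec or left_prec > left_max:
                break
            self.pos += 1
            right, _ = self._expr(right_max)
            left, left_prec = (tok, left, right), prec
        return left, left_prec


parser = OpParser()
parser.add_op(500, "yfx", "+")
parser.add_op(500, "yfx", "-")
parser.add_op(400, "yfx", "*")
print(parser.parse("1 - 2 - 3"))   # ('-', ('-', '1', '2'), '3')
print(parser.parse("1 + 2 * 3"))   # ('+', '1', ('*', '2', '3'))

# An operator defined after the parser was built, as in op(1100, xfx, ==>).
parser.add_op(1100, "xfx", "==>")
print(parser.parse("a ==> b"))     # ('==>', 'a', 'b')
```

Because the operator symbols live in a table consulted during parsing rather than in fixed grammar rules, defining a new operator needs no change to the parser itself, which is the kind of flexibility the abstract says a generated parser must also provide.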
Izražavanje koncesivnosti u francuskom, italijanskom i srpskom jeziku / Expressing Concessionality in French, Italian and Serbian
Seder Ružica 20 September 2016
This research deals with the category of concessionality in French, Italian and Serbian. This category is approached from the point of view of syntax and semantics: the research establishes the inventory of formal means and syntactic procedures by which concessionality is formalized in the languages being analyzed, while the semantic content of these structures is also analyzed in the process. The goal of this study is first to identify the various procedures for expressing concessionality at all syntactic levels and then, in accordance with the contrastive approach, to determine and systematize the structural congruences and incongruences in French, Italian and Serbian, as well as to determine the degree of semantic equivalence among them. At the theoretical level, the results of this study bring together the existing linguistic knowledge on this issue, while at the practical level they can be applied in teaching French and Italian as foreign languages, as well as in translation practice.
The corpus for this research was compiled from works written in French and from their published translations into Italian and Serbian.
The first part of the thesis provides an overview of theoretical approaches to the category of concessionality by French, Italian and Serbian linguists, as well as of its relationship with other semantic categories, in particular with the category of causality. The second part enumerates the inventory of constructions and lexical means by which concessionality is expressed in the three languages being analyzed. In doing so, a particular focus is placed on the use of verbal mood in subordinate clauses of concession. The central part of the thesis is the one in which the identified inventory is analyzed on examples from the corpus. In this part, special attention is given to those means for which the corpus records uses that have not so far been noted in the literature. Concluding remarks systematize the results and point to possible directions for further research in this field.
Attityd, interferens, genitivsyntax : Studier i nutida Överkalixmål / Attitudes, interference, genitive syntax : Studies in the present-day dialect of Överkalix
Källskog, Margareta January 1992
The dissertation deals with the Överkalix dialect in three respects. Överkalix is the northernmost community of the country where a Swedish dialect is spoken. It is surrounded on the east and the north by Finnish, and on the west by Finnish and Saami. The first section of the thesis is based on a questionnaire survey among all junior high school students (14-16 years old) in Överkalix and among their parents. It discusses the present-day position of the Överkalix dialect and the attitudes of the people of Överkalix toward the dialect. The results indicate that the people who consider themselves to be speakers of the local dialect have access to two language codes: local dialect and standard Swedish. Personal relationship is the deciding factor in language code choice. None of the parents considers himself/herself to be dialectally monolingual: 11% speak only standard Swedish, 75% keep the varieties apart and are thus bidialectal. 77% of the dialect-speaking students and 69% of those who do not speak dialect have a positive attitude toward the dialect, boys to a greater extent than girls among the dialect-speaking, and girls to a greater extent than boys among non-dialect speakers. The second section examines interference from the surrounding languages, Finnish and Saami, in the Överkalix dialect in general and in the Överkalix dialect of multilingual informants in particular. These informants speak standard Swedish, dialect, Finnish and/or Saami. The main data of this section originate from recorded interviews conducted as informal conversations. The author discusses some characteristic phonetic features in the dialect which seem to be the result of influence from Saami and/or Finnish. The material also shows a number of influences on the syntactic level. The third section describes how the genitive is expressed in the dialect of Överkalix. The author gives several examples of how the -s genitive is paraphrased, most commonly with a preposition.
Reprint of the doctoral dissertation presented at Uppsala University in 1990.
Syntaktische Strukturen gesprochener Sprache in Videomaterial für DaF. Eine korpusbasierte Untersuchung
Tiegelkamp, Vera 05 September 2016
The thesis deals with syntactic structures of contemporary spoken German. Taking as its starting point a corpus of spontaneous speech, drawn among other sources from talk shows and reality TV programmes, it analyses video material from textbooks for German as a foreign language with regard to syntactic structures typical of spoken language. The aim is to investigate the extent to which teaching materials adequately reflect linguistic reality.
