  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Swedish problems with English prepositions

Blom, Liane January 2007
English prepositions cause problems for learners of English. The way prepositions are taught has an impact on how students learn. Using corpora in teaching makes it possible for teachers and pupils to explore language together and is a good alternative to filling in missing prepositions on worksheets. Sometimes linguistic errors are caused by mother tongue interference. Little research has previously been done with a Swedish contrastive approach to prepositions, but a great deal of literature concerns language transfer and mother tongue interference. This essay is written on the assumption that Swedish as a first language interferes with English and causes prepositional mistakes. Two classes of ninth graders participated in my investigation. I wanted to find out whether students performed better when they were given answers to choose from or when they had to produce the preposition themselves. My study showed that pupils had a better receptive than productive knowledge of prepositions. It also showed that learners resorted to Swedish when they did not know the correct answer. Many learners fail to recognise prepositions as parts of multiword expressions. By teaching students how to notice grammatical collocations and lexical chunks we can help them to achieve acceptable levels of language proficiency and accuracy.
92

The use of the general nouns people and thing by L2 learners of English : A corpus-based study

Gerdin, Göran January 2006
With the advent of corpora documenting learner English, a new and interesting field of research has become available. Learner corpora provide a new type of data which can inform thinking both in second language acquisition research and in foreign language teaching research. Analyses of learner corpora normally report on features which are typically ‘overused’ and ‘underused’ when contrasted with comparable native speaker corpora, in addition to those which are ‘misused’ by the learners. Ringbom (1998) conducted a study in which he identified one common aspect of non-native speaker corpora: the high frequency of general nouns, such as people and thing. The aim of this paper was to test Ringbom’s findings and to identify how the usage of these particular nouns by learners of English as a second language differs, in written production, from that of native speakers, by comparing comparable learner and native speaker corpora. The results of this study clearly support Ringbom’s findings. Additionally, the learners’ written production appears vaguer and ‘non-native like’ not merely because they overuse the general nouns people and thing: the learners also seem to use these nouns in a more restricted range of meanings, whereas the natives’ usage is more diversified. Moreover, this study has identified some of the issues that teachers of English as a second language should be aware of when helping their students to avoid using the general nouns people and thing in a non-native like manner.
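The overuse comparison described in this abstract can be illustrated with a minimal sketch. It assumes a simple regex tokenizer, normalised frequencies, and two tiny invented texts; the function names, the `targets` mapping and the sample sentences are hypothetical and are not taken from the study or from any real learner corpus.

```python
import re
from collections import Counter


def relative_frequency(text, targets, per=1000):
    """Occurrences of each target noun per `per` tokens of `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no tokens")
    counts = Counter(tokens)
    return {
        lemma: sum(counts[form] for form in forms) * per / len(tokens)
        for lemma, forms in targets.items()
    }


def overuse_ratio(learner_text, native_text, targets, per=1000):
    """Learner frequency divided by native frequency; above 1 suggests overuse."""
    learner = relative_frequency(learner_text, targets, per)
    native = relative_frequency(native_text, targets, per)
    return {
        lemma: (learner[lemma] / native[lemma]) if native[lemma] else float("inf")
        for lemma in targets
    }


# Hypothetical word forms grouped under each general noun.
targets = {"people": ["people"], "thing": ["thing", "things"]}

# Invented toy "corpora", for illustration only.
learner_text = "People say many things. People think that thing is good."
native_text = (
    "Critics argue the proposal is flawed. "
    "People disagree about one thing only."
)

ratios = overuse_ratio(learner_text, native_text, targets)
```

A ratio alone says nothing about the range of meanings in which the nouns are used; that part of the analysis would need manual or semantic annotation of the concordance lines.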
93

Acquisition de relations entre entités nommées à partir de corpus / Corpus-based recognition of relations between named entities

Ezzat, Mani 06 May 2014
Named entities were the subject of much research during the 1990s. Their detection in texts has reached a high level of performance, at least for the main categories (person, organization and location), which makes it possible to go further, toward the recognition of relations between entities. For instance, knowing that a text contains the words “Google” and “Youtube” can be relevant, but being able to link them and detect an acquisition relation is more interesting (Google bought Youtube in 2006). Our work focuses on two aspects: defining more precisely what a relation between named entities is, particularly from a linguistic standpoint, and exploring techniques that draw on linguists to build systems for the automatic extraction of relations between named entities.
94

Analyse distributionnelle appliquée aux textes de spécialité : réduction de la dispersion des données par abstraction des contextes / Distributional analysis applied to specialised corpora : reduction of data sparsity through context abstraction

Périnet, Amandine 17 March 2015
In specialised domains, applications such as information retrieval or machine translation rely on terminological resources to take into account terms, semantic relations between terms, or groupings of terms. To offset the cost of building these resources, automatic methods have been proposed. Among those methods, distributional analysis uses the information repeated in the contexts of terms to detect a relation between these terms. While this hypothesis is usually implemented with vector space models, those models suffer from a high number of dimensions and from data sparsity in the matrix of context vectors. In specialised corpora, this contextual information is even sparser and less frequent because the corpora are much smaller. Likewise, complex terms are usually ignored because of their very low number of occurrences. In this thesis, we tackle the problem of data sparsity in specialised corpora. We propose a method that makes the context matrix denser by performing an abstraction of distributional contexts. Semantic relations acquired from corpora are used to generalise and normalise those contexts. We evaluated the robustness of the method on four corpora of different sizes, languages and domains. The analysis of the results shows that, while allowing complex terms to be taken into account in distributional analysis, the abstraction of distributional contexts yields semantic clusters of better quality that are also more consistent and more homogeneous.
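The idea of densifying a context matrix by abstracting contexts can be sketched as follows. This is only an illustration under stated assumptions: the window size, the hand-written `abstraction` mapping, the class labels and the two toy sentences are all hypothetical, whereas the thesis acquires its semantic relations from corpora.

```python
from collections import defaultdict


def context_matrix(sentences, targets, window=2, abstraction=None):
    """Count context words around each target term.

    If `abstraction` maps a context word to a class label, the label is
    counted instead, so related contexts fall into the same column.
    """
    abstraction = abstraction or {}
    matrix = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token not in targets:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    context = abstraction.get(tokens[j], tokens[j])
                    matrix[token][context] += 1
    return matrix


def density(matrix):
    """Share of non-empty cells in the term-by-context matrix."""
    columns = {context for row in matrix.values() for context in row}
    if not matrix or not columns:
        return 0.0
    filled = sum(len(row) for row in matrix.values())
    return filled / (len(matrix) * len(columns))


# Invented toy data, for illustration only.
sentences = [
    ["chronic", "hepatitis", "damages", "liver"],
    ["acute", "nephritis", "harms", "kidney"],
]
targets = {"hepatitis", "nephritis"}

# Hypothetical semantic classes standing in for corpus-acquired relations.
abstraction = {
    "chronic": "SEVERITY", "acute": "SEVERITY",
    "damages": "AFFECT", "harms": "AFFECT",
    "liver": "ORGAN", "kidney": "ORGAN",
}

raw = context_matrix(sentences, targets)
abstracted = context_matrix(sentences, targets, abstraction=abstraction)
```

In this toy case the two terms share no raw context word, but after abstraction they share every context class, which is what makes a distributional comparison between them possible.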
95

Translating the Environment: A Comparative Analysis of Monolingual Corpora and Corpus-Based Resources, their Usability and their Effectiveness in Improving Translation Students’ Comprehension and Usage of Specialized Terminology in the Field of the Environment

Stentaford, Allison January 2017
Corpora and corpus-based resources have received much attention with regard to translator training, terminology, and specialized resource development. With a specialized monolingual corpus and a specialized online dictionary, the DiCoEnviro, we sought to provide insight into the usability and effectiveness of both types of resources in improving translation students’ comprehension and usage of specialized terminology in the field of the environment. We assessed a specialized corpus and the DiCoEnviro through three lenses adapted from the usability framework proposed by Nielsen (2001): effectiveness, efficiency, and satisfaction. We used data (screen recordings, questionnaires, translation exercises) collected from six translation students enrolled in undergraduate and graduate programs at the University of Ottawa School of Translation and Interpretation (UO-STI). Through quantitative and qualitative data analysis, we provide insight into the usability of both types of resources and into the prospective application of these findings in translator training programs and the development of specialized resources.
96

Constitution d'un corpus oral de FLE : enjeux théoriques et méthodologiques / Constitution of an oral corpus of FLE : theoretical and methodological stakes

Arbach, Najib 06 February 2015
The need for linguistic corpora to support research in linguistics has prompted numerous studies of approaches and good practices for building written corpora. Fewer studies are available when it comes to spoken data, and those that concern the spoken interlanguage of learners are rarer still. The CIL project (Corpus Inter Langue), nearing completion at the University of Rennes 2 under the supervision of the research team LIDILE (EA 3874), aims at building a corpus of written and spoken productions by learners of English and French as foreign languages. This PhD dissertation concerns the spoken French as a Foreign Language (FLE) corpus of the overall project (CIL-FLE). Starting from the observation that linguists' interest in spoken language has consistently lagged behind their interest in written language, the first chapter studies orality in different areas of linguistics from both a historical and an epistemological perspective. The second chapter addresses corpus linguistics in general and the corpus as a linguistic object in particular. Regarding corpus linguistics, we review the methods that linguists use to consult data: introspection, elicitation, or consultation of authentic data. The concept of corpus is then analysed according to a set of defining criteria, which we examine closely in order to propose a definition of the linguistic corpus. The third and last chapter applies these theoretical findings to the building of the CIL-FLE corpus: we detail the corpus constituents and the collection and archiving protocols. We are particularly interested in the transcription protocol, and we stress the difficulties encountered when transcribing learners' interlanguage. Finally, the CIL-FLE corpus, which contains approximately 105,000 words and is the outcome of this work, is described in detail.
97

Longitudinal Comparison of Word Associations in Shallow Word Embeddings

Geetanjali Bihani (8815607) 08 May 2020
Word embeddings are utilized in various natural language processing tasks. Although effective in helping computers learn linguistic patterns employed in natural language, word embeddings also tend to learn unwanted word associations. This affects the performance of NLP tasks, as unwanted word associations propagate and amplify biases. Current word association evaluation methods for word embeddings do not account for changes in word embedding models and training corpora when creating the rubric for word association evaluation. Current literature also lacks a consistent training and evaluation protocol for comparison of word associations across varying word embedding models and varying training corpora. In order to address this gap in prior literature, this research aims to evaluate different types of word associations, not limited to gender, racial or religious attributes, incorporating and evaluating the diachronic and variable nature of words over text data collected over a period of 200 years. This thesis introduces a framework to track changes in word associations between neutral words (proper nouns) and attributes (adjectives), across different word embedding models, over a temporal dimension, by evaluating clustering tendencies between neutral words (proper nouns) and attributive words (adjectives) over five different word embedding frameworks: Word2vec (CBOW), Word2vec (Skip-gram), GloVe, fastText (CBOW) and fastText (Skip-gram) and 20 decades of text data from 1810s to 2000s. Finally, various cluster level and corpus level measurements will be compared across aforementioned word embedding frameworks, to find how word associations evolve with changes in the embedding model and the training corpus.
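As a rough illustration of tracking one word association across embedding spaces, the sketch below compares a proper noun with an adjective in two spaces using cosine similarity. Cosine similarity is a simpler stand-in for the clustering measurements named in the abstract, and the decade labels, words and two-dimensional vectors are invented; none of them come from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / norm


def association_by_model(models, word, attribute):
    """Similarity of `word` and `attribute` in each embedding space."""
    return {
        label: cosine(vectors[word], vectors[attribute])
        for label, vectors in models.items()
    }


# Invented toy embeddings keyed by a hypothetical decade label.
models = {
    "1810s": {"london": [1.0, 0.0], "foggy": [1.0, 0.0]},
    "2000s": {"london": [1.0, 0.0], "foggy": [0.0, 1.0]},
}

scores = association_by_model(models, "london", "foggy")
```

With real models, vectors trained separately are not directly comparable across spaces, which is one reason to compare within-space similarities or cluster memberships rather than raw coordinates.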
98

Korpus francouzštiny českých studentů: možnosti jeho vytváření a využívání / A Corpus of Czech Learners' French: Possibilities of its Building and Exploitation

Stehlíková, Karolína January 2016 (has links)
This work deals with the possibilities of building and exploiting a corpus of Czech learners' French. The first part presents learner corpora in the context of corpus linguistics, on the basis of literature and scientific articles relevant to the subject, with special attention to corpora of L2 French. The work then focuses on the methodological principles of building a learner corpus (e.g. the specific metadata about the respondents and the material) and on the error taxonomy. The next section is devoted to the analysis of a sample of learners' written productions, with a critical description of the error taxonomy.
99

Pragmatické užití deminutiv ve francouzštině / Pragmatic meaning of diminutives in French

Jandeková, Kateřina January 2019 (has links)
Key words: diminution, apostrophe, pragmatics, speech situation, linguistic corpora
Abstract: This master's thesis deals with the pragmatic use of diminutives in French. The theoretical part defines the category of diminutives and the possibilities for classifying them, in particular with regard to their pragmatic use. Following Dressler and Merlini Barbaresi's model, the thesis emphasizes the emotional component of diminutives alongside the traditional interpretation of dimensional meaning, as employed in child-centered, pet-centered and lover-centered speech situations. The thesis focuses on affective apostrophes (forms of address). Among these hypocorisms in French one also finds, besides other forms, diminutive forms. The language material for the empirical analysis consists of subcorpora of film subtitles and fiction in the French-Czech aligned parallel corpus InterCorp. The alignment with Czech serves, on the one hand, to make the search for diminutives in pragmatic speech situations in French more efficient and, on the other hand, to support contrastive analysis. We also used the monolingual corpora FRANTEXT and SYN2015 to demonstrate the frequency of diminutive forms in apostrophe in Czech and French. In the empirical part we analyze which diminutives are employed in those speech situations...
100

Using Language Corpora in Teaching

McGarry, Theresa, Mwinyelle, J. 01 July 2011
No description available.
