  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Mémorisation et reconnaissance de séquences multimots chez l'enfant et l'adulte : effets de la fréquence et de la variabilité interne / Multiword sequence storage and recognition in children and adults : frequency and internal variability effects

Bellanger, Cindy 11 December 2017
Les modèles de la perception du langage écrit et du langage oral mettent au premier plan l’importance du lexique mental. En effet, parmi les nombreux indices hiérarchisés et guidant la segmentation du flux continu de parole chez l’adulte et l’enfant, les indices lexicaux ont une place prépondérante. Tout au long de ce travail, nous nous intéressons aux spécificités du stockage des séquences multimots dans le lexique mental et à l’hypothèse d’une mémorisation de ces séquences en une seule unité. Ce travail se divise en deux parties, chacune composée d’une série d’expériences. La première partie interroge en premier lieu les indices impliqués dans les effets facilitateurs de la reconnaissance des noms au sein du groupe nominal. Pour cela, sont mis en perspective l’effet du genre grammatical porté par les déterminants et l’effet de fréquence de co-occurrence des séquences déterminant-nom sur le traitement du nom. C’est ensuite l’effet de la cohésion des séquences multimots sur leur reconnaissance qui est examiné. La seconde partie aborde l’influence de la variabilité interne des combinaisons déterminant-nom dans l’acquisition de la structure du groupe nominal chez l’enfant de deux ans à deux ans et demi. Au travers d’une étude longitudinale, nous opposons deux grandes conceptions de l’acquisition du langage chez le jeune enfant: la Grammaire Universelle et les approches Basées sur l’Usage. / The mental lexicon is usually assumed to be the main foundation of written and spoken-language perception. Numerous hierarchically organized cues drive speech segmentation in adults and infants, but lexical cues appear to be overriding. Throughout this work, we examine the specifics of multiword-sequence storage in the mental lexicon and the hypothesis that such sequences are memorized as a single unit. This work is split into two parts, each composed of a set of experiments. The first one assesses the cues involved in the facilitated recognition of nouns in noun phrases. For that purpose, we disentangled grammatical-gender effects and co-occurrence frequency effects on the processing of determiner-noun sequences. Then, we tested the effect of cohesiveness on the recognition of three-word sequences. The second set of experiments concerns the influence of the internal variability of determiner-noun sequences on the acquisition of noun-phrase structure in 2- to 2.5-year-old children. In a three-month longitudinal study, we contrast two main conceptions of first-language acquisition: Universal Grammar and Usage-Based approaches.
22

Un environnement générique et ouvert pour le traitement des expressions polylexicales : de l'acquisition aux applications / A generic and open framework for multiword expressions treatment : from acquisition to applications

Ramisch, Carlos Eduardo 11 September 2012
Cette thèse présente un environnement ouvert et souple pour l'acquisition automatique d'expressions multimots (MWE) à partir de corpus textuels monolingues. Cette recherche est motivée par l'importance des MWE pour les applications du TALN. Après avoir brièvement présenté les modules de l'environnement, le mémoire présente des résultats d'évaluation intrinsèque en utilisant deux applications: la lexicographie assistée par ordinateur et la traduction automatique statistique. Ces deux applications peuvent bénéficier de l'acquisition automatique de MWE, et les expressions acquises automatiquement à partir de corpus peuvent à la fois les accélérer et améliorer leur qualité. Les résultats prometteurs de nos expériences nous encouragent à mener des recherches ultérieures sur la façon optimale d'intégrer le traitement des MWE dans ces applications et dans bien d'autres / This thesis presents an open and flexible methodological framework for the automatic acquisition of multiword expressions (MWEs) from monolingual textual corpora. This research is motivated by the importance of MWEs for NLP applications. After briefly presenting the modules of the framework, the work reports extrinsic evaluation results considering two applications: computer-aided lexicography and statistical machine translation. Both applications can benefit from automatic MWE acquisition and the expressions acquired automatically from corpora can both speed up and improve their quality. The promising results of our experiments encourage further investigation about the optimal way to integrate MWE treatment into these and many other applications.
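As a rough illustration of what automatic MWE acquisition from a monolingual corpus can involve, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI). This is not the framework described in the abstract: the function name `pmi_bigrams`, the `min_count` threshold, the choice of PMI as the score, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper: one simple association score, assumed here
    for illustration only.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs too rare to score reliably
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))
    # Highest-scoring candidates first
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy corpus, invented for this example
corpus = ("machine translation needs data . "
          "statistical machine translation needs more data . "
          "the machine broke .").split()
candidates = pmi_bigrams(corpus)
for pair, score in candidates:
    print(pair, round(score, 2))
```

A score of this kind is at best a first filter: on a corpus this small it also surfaces pairs that are not expressions at all, which is one reason candidate lists are typically evaluated further, for instance inside an application.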
23

Názvy současných profesí ve zdravotnictví / Names of contemporary professions in healthcare

Hubková, Helena January 2016
The presented thesis describes the typology of current names of professions in healthcare in terms of onomasiology and structural-morphology. The starting point of the thesis is the existing professional linguistic description of naming structures implemented as one-word names and nominal collocations. The language material consists of the names of professions in healthcare that are currently used. The names are classified in terms of word-formation, lexical semantics and the origin of words.
24

Lexical levels and formulaic language : an exploration of undergraduate students' vocabulary and written production of delexical multiword units

Scheepers, Ruth Angela 11 1900
This study investigates undergraduate students’ vocabulary size, and their use of formulaic language. Using the Vocabulary Levels Test (Laufer and Nation 1995), it measures the vocabulary size of native and non-native speakers of English and explores relationships between this and course of study, gender, age and home language, and their academic performance. A corpus linguistic approach is then applied to compare student writers’ uses of three high-frequency verbs (have, make and take) relative to expert writers. Multiword units (MWUs) featuring these verbs are identified and analysed, focusing on delexical MWUs as one very specific aspect of depth of vocabulary knowledge. Student and expert use of these MWUs is compared. Grammatically and semantically deviant MWUs are also analysed. Finally, relationships between the size and depth of students’ vocabulary knowledge, and between the latter and academic performance, are explored. Findings reveal that Literature students had larger vocabularies than Law students, females knew more words than males, and older students knew more than younger ones. Importantly, results indicated a relationship between vocabulary size and academic performance. Literature students produced more correct MWUs and fewer errors than Law students. Correlations suggest that the smaller students’ vocabulary, the poorer the depth of their vocabulary is likely to be. Although no robust relationship between vocabulary depth and academic performance emerged, there was evidence of an indirect link between academic performance and correct use of MWUs. In bringing together traditional methods of measuring vocabulary size with an investigation of depth of vocabulary knowledge using corpus analysis methods, this study provides further evidence of the importance of vocabulary knowledge to academic performance. 
It contributes to debates on the value of a sound knowledge of high-frequency vocabulary and a developing knowledge of at least 5000 words to academic performance, and the analysis and quantification of errors in MWUs adds to our understanding of novice writers’ difficulties with these combinations. The study also explores new ways of investigating relationships between size and depth of vocabulary knowledge, and between depth of vocabulary knowledge and academic performance. / Linguistics and Modern Languages / D. Litt. et Phil. (Linguistics)
25

Použití hlubokých kontextualizovaných slovních reprezentací založených na znacích pro neuronové sekvenční značkování / Deep contextualized word embeddings from character language models for neural sequence labeling

Lief, Eric January 2019
A family of Natural Language Processing (NLP) tasks such as part-of-speech (PoS) tagging, Named Entity Recognition (NER), and Multiword Expression (MWE) identification all involve assigning labels to sequences of words in text (sequence labeling). Most modern machine learning approaches to sequence labeling utilize word embeddings, learned representations of text, in which words with similar meanings have similar representations. Quite recently, contextualized word embeddings have garnered much attention because, unlike pretrained context-insensitive embeddings such as word2vec, they are able to capture word meaning in context. In this thesis, I evaluate the performance of different embedding setups (context-sensitive, context-insensitive word, as well as task-specific word, character, lemma, and PoS) on the three abovementioned sequence labeling tasks using a deep learning model (BiLSTM) and Portuguese datasets.
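To make "assigning labels to sequences of words" concrete, here is a minimal sketch of how annotated spans can be turned into one label per token. The BIO (begin/inside/outside) scheme, the helper name `bio_tags`, and the English sentence are assumptions for illustration; the abstract does not state which tagging scheme the thesis uses, and its datasets are Portuguese.

```python
def bio_tags(tokens, spans):
    """Convert (start, end, label) spans, end exclusive, into one BIO tag per token.

    Hypothetical helper, assuming non-overlapping spans.
    """
    tags = ["O"] * len(tokens)          # "O" marks tokens outside any span
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the span
    return tags

# Invented example sentence with one MWE span and one named-entity span
tokens = ["She", "kicked", "the", "bucket", "in", "New", "York"]
spans = [(1, 4, "MWE"), (5, 7, "LOC")]
tags = bio_tags(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

MWE and named-entity labels are combined in a single sequence here only for brevity; they would normally be annotated as separate tasks. A sequence labeling model is then trained to predict such a tag sequence from the token sequence.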
27

Elaboration de ressources électroniques pour les noms composés de type N (E+DET=G) N=G du grec moderne / The N (E + DET=G) N=G compound nouns in Modern Greek

Kyriakopoulou, Anthoula 25 March 2011
L'objectif de cette recherche est la construction manuelle de ressources lexicales pour les noms composés grecs qui sont définis par la structure morphosyntaxique : Nom (E+Déterminant au génitif) Nom au génitif, notés N (E+DET:G) N:G (e.g. ζώνη ασφαλείας/ceinture de sécurité). Les ressources élaborées peuvent être utilisées pour leur reconnaissance lexicale automatique dans les textes écrits et dans d'autres applications du TAL. Notre travail s'inscrit dans la perspective de l'élaboration du lexique-grammaire général du grec moderne en vue de l'analyse automatique des textes écrits. Le cadre théorique et méthodologique de cette étude est celui du lexique-grammaire (M. Gross 1975, 1977), qui s'appuie sur la grammaire transformationnelle harrissienne. Notre travail s'organise en cinq parties. Dans la première partie, nous délimitons l'objet de notre travail tout en essayant de définir la notion fondamentale qui régit notre étude, à savoir celle de figement. Dans la deuxième partie, nous présentons la méthodologie utilisée pour le recensement de nos données lexicales et nous étudions les phénomènes de variation observés au sein des noms composés de type N (E+DET:G) N:G. La troisième partie est consacrée à la présentation des différentes sous-catégories des N (E+DET:G) N:G identifiées lors de l'étape du recensement et à l'étude de leur structure lexicale interne. La quatrième partie porte sur l'étude syntaxico-sémantique des N (E+DET:G) N:G. Enfin, dans la cinquième partie, nous présentons les différentes méthodes de représentation formalisée que nous proposons pour nos données lexicales en vue de leur reconnaissance lexicale automatique dans les textes écrits. Des échantillons représentatifs des ressources élaborées sont présentés en Annexe. / The object of this research is the manual construction of lexical resources for the Greek compound nouns defined by the following morphosyntactic structure: Noun (E+Determiner in genitive) Noun in genitive, written N (E+DET:G) N:G (e.g. ζώνη ασφαλείας/safety belt). The elaborated resources may be used for the automatic recognition of these nouns in written texts and in other NLP applications. Our study is part of the development of the general lexicon-grammar of Modern Greek, with a view to the automatic processing of written texts. Our theoretical and methodological framework is that of lexicon-grammar (M. Gross 1975, 1977), based on the Transformational Grammar principles defined by Z. S. Harris. Our study is organised into five parts. In the first part, we delimit the object of our work and give an overview of the core notion governing our research: the notion of (fixed) multiword expression (MWE). In the second part, we present the methodology used to collect our lexical data and we study the variation phenomena observed within the N (E+DET:G) N:G compound nouns. The third part is dedicated to the presentation of the different N (E+DET:G) N:G categories identified in the listing phase and to the study of their lexical composition. The fourth concerns the syntactic and semantic study of the N (E+DET:G) N:G. Finally, the fifth part deals with the formal representation methods we propose for our lexical data with a view to their lexical recognition in Greek written texts. Representative samples of the elaborated resources are given in the Appendix.
28

Le traitement des locutions en génération automatique de texte multilingue

Dubé, Michaelle 08 1900
La locution est peu étudiée en génération automatique de texte (GAT). Syntaxiquement, elle forme un syntagme, alors que sémantiquement, elle ne constitue qu’une seule unité. Le présent mémoire propose un traitement des locutions en GAT multilingue qui permet d’isoler les constituants de la locution tout en conservant le sens global de celle-ci. Pour ce faire, nous avons élaboré une solution flexible à base de patrons universels d’arbres de dépendances syntaxiques vers lesquels pointent des patrons de locutions propres au français (Pausé, 2017). Notre traitement a été effectué dans le réalisateur de texte profond multilingue GenDR à l’aide des données du Réseau lexical du français (RL-fr). Ce travail a abouti à la création de 36 règles de lexicalisation par patron (indépendantes de la langue) et à un dictionnaire lexical pour les locutions du français. Notre implémentation couvre 2 846 locutions du RL-fr (soit 97,5 %), avec une précision de 97,7 %. Le mémoire se divise en cinq chapitres, qui décrivent : 1) l’architecture classique en GAT et le traitement des locutions par différents systèmes symboliques ; 2) l’architecture de GenDR (principalement sa grammaire, ses dictionnaires, son interface sémantique-syntaxe et ses stratégies de lexicalisations) ; 3) la place des locutions dans la phraséologie selon la théorie Sens-Texte, ainsi que le RL-fr et ses patrons syntaxiques linéarisés ; 4) notre implémentation de la lexicalisation par patron des locutions dans GenDR, et 5) notre évaluation de la couverture et de la précision de notre implémentation. / Idioms are rarely studied in natural language generation (NLG). Syntactically, they form a phrase, while semantically, they correspond to a single unit. In this master’s thesis, we propose a treatment of idioms in multilingual NLG that enables us to isolate their constituents while preserving their global meaning. 
To do so, we developed a flexible solution based on universal templates of syntactic dependency trees, onto which we map French-specific idiom patterns (Pausé, 2017). Our work was implemented in Generic Deep Realizer (GenDR) using data from the Réseau lexical du français (RL-fr). This resulted in the creation of 36 template-based lexicalization rules (independent of language) and of a lexical dictionary for French idioms. Our implementation covers 2846 idioms of the RL-fr (i.e., 97.5%), with an accuracy of 97.7%. We divided our analysis into five chapters, which describe: 1) the classical NLG architecture and the handling of idioms by different symbolic systems; 2) the architecture of GenDR (mainly its grammar, its dictionaries, its semantic-syntactic interface, and its lexicalization strategies); 3) the place of idioms in phraseology according to Meaning-Text Theory (théorie Sens-Texte), the RL-fr and its linearized syntactic patterns; 4) our implementation of the template lexicalization of idioms in GenDR; and 5) our evaluation of the coverage and the precision of our implementation.
