11 |
Interaction of Task-Induced Involvement Components in Lexical Acquisition
McGarry, Theresa; Michieka, M. 01 September 2011
This study aims to ascertain the relative importance of certain aspects of task design for the acquisition of vocabulary by high-proficiency second language speakers. Previous research suggests that vocabulary learning is most effective when "task-induced involvement" is high and that involvement can be measured as a sum of three components: need for the vocabulary in the task, searching for a meaning of a word or a word to fit a meaning, and evaluation of how the word meaning compares to that of other words or how the word fits in with surrounding words. We attempt to corroborate those premises and, further, to examine the hypothetical components of involvement separately and investigate whether they interact with each other and/or differ in impact on the learning process. To address this question, we administered a different task to each of four groups of high-proficiency learners of English, varying the search and evaluation components among the tasks. We measured the vocabulary gains on immediate and delayed tests and compared the results among the four groups. The results on the immediate test accord with earlier studies in showing greater gains for the groups with more task-induced involvement than for the control group. Concerning search and evaluation, the two components appear to compensate for each other, suggesting that the presence of either one is as effective as the presence of both. The delayed post-test results followed the same patterns, although the results are not statistically significant.
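The involvement measure described in the abstract above, a sum of need, search and evaluation, can be sketched as follows. The 0-2 scale per component follows the usual formulation of the Involvement Load Hypothesis and is an assumption here, as are the task names and the component levels assigned to the four groups; the abstract does not report them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A vocabulary task scored on the three involvement components.

    Each component is scored 0 (absent), 1 (moderate) or 2 (strong).
    This scale is an assumption, not taken from the abstract.
    """
    name: str
    need: int
    search: int
    evaluation: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0-2 scale.
        for label in ("need", "search", "evaluation"):
            level = getattr(self, label)
            if level not in (0, 1, 2):
                raise ValueError(f"{label} must be 0, 1 or 2, got {level}")

    def involvement(self) -> int:
        """Involvement index: the plain sum of the three components."""
        return self.need + self.search + self.evaluation


# Hypothetical design: need held constant, search and evaluation varied.
tasks = [
    Task("neither", need=1, search=0, evaluation=0),
    Task("search only", need=1, search=1, evaluation=0),
    Task("evaluation only", need=1, search=0, evaluation=1),
    Task("search and evaluation", need=1, search=1, evaluation=1),
]

for task in tasks:
    print(f"{task.name}: involvement index {task.involvement()}")
```

Under a purely additive index the last task should outperform the two single-component tasks. The study's finding that either component alone is as effective as both is therefore a result the simple sum does not predict.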
|
12 |
Learning with Laura: investigating the effects of a pedagogical agent on Spanish lexical acquisition
Theodoridou, Katerina Demetre 22 October 2009
The purpose of this study was to investigate the effects of an animated pedagogical agent on Spanish vocabulary learning. Furthermore, the study examined learners’ reactions and attitudes towards the presence of the pedagogical agent in a web-based environment, as well as how learners used the conversational component of the pedagogical agent in their learning process. A total of 47 university students enrolled in two fourth-semester Spanish classes participated in this study. Both the Control group and the Experimental group used a web-based environment that presented new vocabulary (in audio and text), along with activities for practicing the vocabulary in context. The difference between the two groups was that an animated pedagogical agent (Laura) was present in the environment used by the Experimental group. In addition, a conversational component was added at a second phase to the environment used by the Experimental group, which the learners used to chat with the pedagogical agent about the material presented. The data were analyzed through quantitative and qualitative methods. The quantitative data were derived from a demographic information questionnaire, a vocabulary pre-test and two vocabulary post-tests (an immediate post-test and a delayed post-test), as well as from attitudes scales completed prior to the learners’ exposure to the web-based environments and after completing the learning sessions. The qualitative data were derived from a learning experience questionnaire completed by all learners at the end of the learning sessions, as well as from the scripts of the chat sessions between the learners in the Experimental group and the pedagogical agent, and a chatting experience questionnaire completed by the same group. Analysis of the quantitative data did not yield significant differences between the Control and the Experimental groups with respect to vocabulary learning outcomes and affective outcomes. 
Analysis of the qualitative data revealed learners’ preferences with respect to features embedded in the web-based language learning environments. In addition, it showed how learners used the conversational aspect of the pedagogical agent, and indicated what kind of information the agent’s knowledge base should include for the agent to benefit learners’ progress.
|
13 |
O vocabulário básico do português no processo de aquisição da língua materna / The basic vocabulary of Portuguese in the process of mother-tongue acquisition
Souza, Vanzorico Carlos de [UNESP] 11 August 2005
Seed - Secretaria de Educação do Estado de São Paulo / Teachers of all school subjects agree that one of the greatest obstacles to learning is the limited vocabulary of students. We know that solving the vocabulary problem alone is not enough for students to understand texts without major difficulty. On the one hand, it is undeniable that command of a varied vocabulary helps the learning process. Lexicology can therefore contribute substantially to mother-tongue teaching and to the development of a Basic Vocabulary of Fundamental Portuguese. On the other hand, we recognize that an individual's lexical competence is in fact a social product, that is, the result of his or her interactions in the society in which he or she lives. That society is not homogeneous but divided into social classes, so social groups within the same society can be expected to use different vocabularies in their everyday interactions. The lexicon is consequently organized differently across social strata. This work presents the results of a study in which we tested the lexical competence of eighth-grade students at two schools representing different social groups. We examined whether socioeconomic condition affects the acquisition of the Basic Vocabulary of Brazilian Portuguese, whose lexical units were distributed across several semantic fields. We also examined how the textbook, an important pedagogical tool that works mainly on text interpretation, treats the lexicon, and whether that treatment contributes to the acquisition and expansion of students' vocabulary.
|
14 |
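Estudo sobre o uso de dicionários escolares nas salas de 4º e 5º anos da rede municipal de Catalão-GO / A study of the use of school dictionaries in fourth- and fifth-year classrooms of the Catalão-GO municipal school system
Ribeiro, Cacildo Galdino 25 March 2014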
Fundação de Amparo à Pesquisa do Estado de Goiás - FAPEG / Studies conducted in Brazil have shown that the school dictionary is a teaching resource that can contribute significantly to students' lexical acquisition, yet it is little used in schools. The lexicographic proposals of school dictionaries must therefore match the profile of their intended audience, with regard to both microstructure and macrostructure. For this reason, the Ministry of Education (MEC), through the National Textbook Plan (PNLD/2006), supplied public schools with lexicographic collections for use by students in the first segment of elementary school (Ensino Fundamental). The collections offered by the government were organized according to the evaluation and selection of dictionaries carried out by a commission of scholars and researchers established by the MEC. Against this background, this work examines whether and how the school dictionaries offered by the MEC are used in the municipal schools of Catalão-GO, particularly in fourth- and fifth-year classrooms. To that end, the research presents brief considerations on related topics in Lexicography and Lexicology, such as concepts of the word and types of dictionaries, in order to ground the analysis of a corpus collected through questionnaires given to teachers of those grades and through informal conversations with staff of the Catalão Municipal Secretariat of Education and with coordinators and principals of the participating schools. We believe that the limited use of dictionaries is linked to the lack of teacher training courses on the lexicon; the results of this research can therefore contribute to the design of extension or specialization courses for teachers in the municipal school system.
|
15 |
Der Gebrauch lexikalischer Erwerbsbeschränkungen bei Kindern mit Williams-Beuren-Syndrom / Lexical constraints in German children with Williams syndrome
Siegmüller, Julia January 2008
This thesis presents a study of the mental lexicon in children with Williams-Beuren syndrome (WBS). The lexicon of young children with WBS develops with a delay (Mervis & Robinson, 2000). Nevertheless, the lexicon of adolescents with WBS is considered elaborate compared with that of individuals with other syndromes (Wang et al. 1995). This could point to late-developing language abilities. It is assumed that changes take place from the age of 11 onwards through which the typical WBS profile first emerges (Rossen et al. 1996). The aim of this thesis is to approach this catch-up phase by examining lexical abilities before the critical age. To this end, two lexical constraints that Markman (1989) postulates for unimpaired lexical acquisition are examined. Whole object constraint (WOC): the child assumes that an unfamiliar word refers to a whole object. Mutual exclusivity constraint (MEC): the child assumes a mutually exclusive relation between word form and referent. For WBS there is only a single study of the constraints (Stevens & Karmiloff-Smith 1997). Its WBS participants are too old (7;5 to 31;5) to allow conclusions about language abilities during the period of the spurt.
Markman postulates the constraints as part of children's universal knowledge. Accordingly, the hypothesis is that the constraints are also active in children with WBS and are applied in experimental situations. Central to the hypothesis is the study of preschool children. Five children with WBS (3;2-7;0) are tested, together with chronologically matched control children: 98 for the WOC and 97 for the MEC. One experiment each is carried out on the WOC (n=9) and on the MEC (n=12).
In the WOC experiment, both the WBS children and the control children most frequently choose the target item. The WBS children frequently choose the part-distractor picture. In the single-case comparison, 4 of the 5 WBS children deviate from their control group. In the MEC experiment, the unimpaired children point to the picture with the phonological distractor significantly more often than the WBS children do. In the single-case analysis, 4 of the 5 WBS children are above the mean of their control group in selecting the target item.
Overall, the behaviour of the WBS children in the experiments points to deficient perceptual influences on the application of the lexical constraints rather than to the absence of the constraints. The detail preference hypothesis is postulated as the cause of the WBS children's behaviour. The hypothesis of Majerus et al. (2003) is extended to visual processing. This processing takes place locally and can build up generic concepts only to a limited extent. The overspecified word forms are matched with partial representations. The resulting semantic representations are oriented towards concrete experiences and remain in an overspecified form.
With the hypothesis of a general detail preference, a single root is proposed for the first time for the behaviour of preschool-age WBS children across different psychological faculties.
Majerus, S., Van der Linden, M., Mulder, L., Meulemans, T., & Peters, F. (2003). Verbal short-term memory reflects the sublexical organization of the phonological language network: evidence from an incidental phonotactic learning paradigm. Journal of Memory and Language, 51, 297-306.
Markman, E. (1989). Categorization and naming in children. Cambridge MA: MIT Press.
Mervis, C. B. & Robinson, B. F. (2000). Expressive vocabulary ability of toddlers with Williams syndrome or Down syndrome: a comparison. Developmental Neuropsychology, 17, 11-126.
Rossen, M., Klima, E., Bellugi, U., Bihrle, A., & Jones, W. (1996). Interaction between language and cognition: evidence from Williams syndrome. In J. H. Beitchman, N. Cohen, M. Konstantareas, & R. Tannock (Eds.), Language, learning and behavior disorders: developmental, biological, and clinical perspectives. (367-392). New York: Cambridge University Press.
Stevens, T. & Karmiloff-Smith, A. (1997). Word learning in a special population: do individuals with Williams syndrome obey lexical constraints? Journal of Child Language, 24, 737-765.
Wang, P. P., Doherty, S., Rourke, S. B., & Bellugi, U. (1995). Unique profile of visuo-perceptual skills in a genetic syndrome. Brain and Cognition, 29, 54-65.
This thesis presents a study of two lexical constraints in German children with Williams syndrome (WS). Lexical development is known to be delayed in WS, yet in adults the lexicon is said to be elaborate (Wang et al. 1995). This may indicate late-developing language competences. Rossen et al. (1996) report a growth in fluency performance in WS children older than 11 years. The aim of the current study is to examine lexical learning mechanisms in WS children of kindergarten age.
Five WS children are matched to 97 unimpaired children on chronological age. Two experiments (whole object constraint, mutual exclusivity constraint) are designed following the argumentation of Markman (1989). The results show that both lexical constraints are active in WS children but operate on different input information than in other children. In the discussion, the detail preference hypothesis is developed, which postulates for the first time a unique perceptual deficit that influences language acquisition without also implying a primary language disorder.
|
16 |
Facilitating Lexical Acquisition in Beginner Learners of Italian through Popular Song
Natale Rukholm, Vanessa 31 August 2011
This study examines the effects of Song and Involvement Load on the acquisition and retention of lexical items by beginner learners of Italian. Lexical acquisition is investigated via an incidental learning experiment that is based on the premise that growth in L2 vocabulary results from rehearsal and repeated exposure to lexical items in a variety of contexts. More specifically, the study hypothesizes that Song contributes to subvocal rehearsal, a mechanism that facilitates the retention of phonological information. In addition, the study hypothesizes that Involvement Load, as posited by Laufer and Hulstijn (2001), contributes to retention through “elaborate processing” (Craik & Tulving, 1975) of lexical items.
In order to evaluate participants' lexical acquisition, an experiment with a pretest/posttest design was carried out. Participants were assigned to one of five groups: a Control Group and four treatment groups. Treatment groups were exposed to a Song either in a sung condition or read as a poem (i.e. without music), while the Control Group completed only the pretest and posttests. Treatment groups also completed lexical tasks designed with either low or high levels of Involvement Load. The pretest and posttests (administered at four and eight weeks respectively after the pretest) were based on Paribakht and Wesche's (1996) Vocabulary Knowledge Scale. It was hypothesized that in the case of both short-term acquisition (four weeks after the pretest) and retention (eight weeks thereafter) (i) participants exposed to Song would obtain higher scores than participants only exposed to the lyrics; (ii) participants completing High Involvement tasks would score higher than participants completing Low Involvement tasks; and (iii) the effects of Song would be greater than the effects of Involvement Load on test scores.
Results indicated that at both posttests, participants exposed to Song obtained higher scores than participants only exposed to lyrics (p=0.004). Additionally, participants carrying out High Involvement tasks scored higher than participants carrying out Low Involvement tasks (p=0.017). However, a comparison of the strength of the effects of Song and Involvement Load on acquisition and retention of target items yielded inconclusive results (p=0.383).
The validation of many of the hypotheses suggests that song and involvement load are effective in the acquisition and retention of L2 lexical items and should be implemented in the L2 curriculum.
|
18 |
A generic and open framework for multiword expressions treatment: from acquisition to applications
Ramisch, Carlos Eduardo January 2012
The treatment of multiword expressions (MWEs), like take off, bus stop and big deal, is a challenge for NLP applications. This kind of linguistic construction is not only arbitrary but also much more frequent than one would initially guess. This thesis investigates the behaviour of MWEs across different languages, domains and construction types, proposing and evaluating an integrated methodological framework for their acquisition. There have been many theoretical proposals to define, characterise and classify MWEs. We adopt a generic definition stating that MWEs are word combinations which must be treated as a unit at some level of linguistic processing. They present a variable degree of institutionalisation, arbitrariness, heterogeneity and limited syntactic and semantic variability. There has been much research on automatic MWE acquisition in recent decades, and the state of the art covers a large number of techniques and languages. Other tasks involving MWEs, namely disambiguation, interpretation, representation and applications, have received less emphasis in the field. The first main contribution of this thesis is the proposal of an original methodological framework for automatic MWE acquisition from monolingual corpora. This framework is generic, language independent and integrated, and it comes with a freely available implementation, the mwetoolkit. It is composed of independent modules which may themselves use multiple techniques to solve a specific sub-task in MWE acquisition. The evaluation of MWE acquisition is modelled using four independent axes. We underline that the evaluation results depend on parameters of the acquisition context, e.g., nature and size of corpora, language and type of MWE, analysis depth, and existing resources. The second main contribution of this thesis is the application-oriented evaluation of our methodology proposal in two applications: computer-assisted lexicography and statistical machine translation.
For the former, we evaluate the usefulness of automatic MWE acquisition with the mwetoolkit for creating three lexicons: Greek nominal expressions, Portuguese complex predicates and Portuguese sentiment expressions. For the latter, we test several integration strategies in order to improve the treatment given to English phrasal verbs when translated by a standard statistical MT system into Portuguese. Both applications can benefit from automatic MWE acquisition, as the expressions acquired automatically from corpora can both speed up and improve the quality of the results. The promising results of previous and ongoing experiments encourage further investigation into the optimal way to integrate MWE treatment into other applications. Thus, we conclude the thesis with an overview of past, ongoing and future work.
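To make the idea of automatic MWE acquisition from a monolingual corpus concrete, here is a minimal sketch of one common approach: collect adjacent word pairs as candidates and rank them by pointwise mutual information (PMI). This is not the mwetoolkit API, and the abstract does not say which association measures the framework uses. The function name `rank_bigrams`, the `min_count` threshold and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def rank_bigrams(sentences, min_count=2):
    """Rank adjacent word pairs by PMI (illustrative only, not mwetoolkit).

    sentences: iterable of token lists.
    min_count: candidates seen fewer times than this are discarded.
    Returns a list of ((w1, w2), pmi) sorted by descending PMI.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_unigrams = sum(unigrams.values())
    n_bigrams = sum(bigrams.values())

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency filter: rare pairs give unreliable PMI
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))

    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Toy corpus, invented for this example.
corpus = [
    ["the", "bus", "stop", "is", "near", "the", "bus", "stop", "sign"],
    ["we", "take", "off", "at", "noon", "and", "take", "off", "again"],
    ["the", "plane", "is", "near", "the", "gate"],
]

for pair, pmi in rank_bigrams(corpus):
    print(pair, round(pmi, 2))
```

On a corpus this small the ranking means little. The point is only the split into candidate extraction, filtering and scoring, which is the kind of sub-task division the abstract attributes to the framework's modules.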
|