  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Distributional models of multiword expression compositionality prediction / Modelos distribucionais para a predição de composicionalidade de expressões multipalavras

Cordeiro, Silvio Ricardo January 2018
Natural language processing systems often rely on the idea that language is compositional, that is, the meaning of a linguistic entity can be inferred from the meaning of its parts. This expectation fails in the case of multiword expressions (MWEs). For example, a person who is a sitting duck is neither a duck nor necessarily sitting. Modern computational techniques for inferring word meaning based on the distribution of words in the text have been quite successful at multiple tasks, especially since the rise of word embedding approaches. However, the representation of MWEs remains an open problem in the field. In particular, it is unclear how one could predict from corpora whether a given MWE should be treated as an indivisible unit (e.g. nut case) or as some combination of the meaning of its parts (e.g. engine room). This thesis proposes a framework of MWE compositionality prediction based on representations of distributional semantics, which we instantiate under a variety of parameters. We present a thorough evaluation of the impact of these parameters on three new datasets of MWE compositionality, encompassing English, French and Portuguese MWEs. Finally, we present an extrinsic evaluation of the predicted levels of MWE compositionality on the task of MWE identification. Our results suggest that the proper choice of distributional model and corpus parameters can produce compositionality predictions that are comparable to the state of the art.
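The abstract does not say how a compositionality score is computed, so the following is only a minimal sketch of one common approach, not necessarily the thesis's method: compare the vector of the whole expression with the sum of the vectors of its parts using cosine similarity. The toy vectors and the names `cosine` and `compositionality_score` are hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compositionality_score(mwe_vec, part_vecs):
    """Similarity between the MWE's own vector and the sum of its parts.

    Additive composition is an assumption made for this sketch.
    """
    composed = [sum(dims) for dims in zip(*part_vecs)]
    return cosine(mwe_vec, composed)


# Hypothetical toy vectors, not taken from any real corpus or model.
vectors = {
    "engine": [0.9, 0.1, 0.0],
    "room": [0.1, 0.9, 0.0],
    "engine_room": [0.5, 0.5, 0.1],
    "nut": [0.8, 0.0, 0.1],
    "case": [0.0, 0.8, 0.1],
    "nut_case": [0.0, 0.1, 0.9],
}

engine_room_score = compositionality_score(
    vectors["engine_room"], [vectors["engine"], vectors["room"]]
)
nut_case_score = compositionality_score(
    vectors["nut_case"], [vectors["nut"], vectors["case"]]
)

print(f"engine room: {engine_room_score:.2f}")
print(f"nut case:    {nut_case_score:.2f}")
```

With these made-up vectors, the compositional example (engine room) scores higher than the idiomatic one (nut case), which is the behaviour such a score is meant to capture.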
12

The Difference Between Bucket-Kicking and Kicking the Bucket: Understanding Idiom Flexibility

Schildmier Stone, Megan Ann January 2016
The question of how to integrate idioms into standard theories of grammar has been a matter of investigation since at least the beginning of generative grammar. Idioms are uniquely positioned at the interface between the lexicon and the syntax, demonstrating properties of both words and phrases. On the one hand, idioms behave like stored units, arbitrary correspondences between sound and meaning that must simply be memorized by speakers of the language. In this way, they are similar to words, which have long been recognized as arbitrary sound-meaning pairs (cf. Saussure (1986)'s arbitrariness of the sign). On the other hand, idioms in the traditional sense are multiword units, often with some degree of syntactic flexibility, ranging from tense inflection (e.g. Eli kicked the bucket yesterday vs. I'm pretty sure Eli's going to kick the bucket tomorrow) to passivizability (e.g. Lisa spilled the beans vs. The beans were spilled (by Lisa)), and beyond. This places idioms in the purview of the syntax, where the combination and manipulation of multiword units is typically assumed to take place. Idioms, then, bridge the gap between the lexicon and the syntax, challenging traditional assumptions about grammar. This dissertation provides a proposal for dealing with just such issues. I provide an account of idiomatic representations that is consistent with theoretical and empirical research in the field. I explore what kinds of structures are licensed to have special idiomatic interpretations, and I present novel experimental and corpus results that bear on the issue of how idioms are represented. Ultimately, I argue that the structural requirement model alone is able to sufficiently account for the data.
13

A Dynamic Account of the Structure of Concepts

Blouw, Peter January 2011
Concepts are widely agreed to be the basic constituents of thought. Amongst philosophers and psychologists, however, the question of how concepts are structured has been a longstanding problem and a locus of disagreement. I draw on recent work describing how representational content is ascribed to populations of neurons to develop a novel solution to this problem. Because disputes over the structure of concepts often reflect divergent explanatory goals, I begin by arguing for a set of six criteria that a good theory ought to accommodate. These criteria address philosophical concerns related to content, reference, scope, publicity, and compositionality, and psychological concerns related to categorization phenomena and neural plausibility. Next, I evaluate a number of existing theoretical approaches in relation to these six criteria. I consider classical views that identify concepts with definitions, similarity-based views that identify concepts with prototypes or exemplars, theory-based views that identify concepts with explanatory schemas, and atomistic views that identify concepts with unstructured mental symbols that enter into law-like relations with their referents. I conclude that none of these accounts can satisfactorily accommodate all of the criteria. I then describe the theory of representational content that I employ to motivate a novel account of concept structure. I briefly defend this theory against competitors, and I describe how it can be scaled from the level of basic perceptual representations to the level of highly complex conceptual representations. On the basis of this description, I contend that concepts are structured dynamically through sets of transformations of a single source representation, and that the content of a given concept specifies the set of potential transformations it can enter into. I conclude by demonstrating the ability of this account to meet all of the criteria introduced beforehand. I consider objections to my views throughout.
14

Idiomų vertimas O.Vaildo "Doriano Grėjaus portretas" ir A.Merdok "Jūra, jūra..." romanuose / Translation of Idioms in O.Wilde's "The Picture of Dorian Gray" and I.Murdoch's "The Sea, the Sea"

Baltramaitytė, Sigita 31 May 2006
The object of this master's thesis is the structure and semantics of English idioms in English fiction and their translation into Lithuanian. The aim of the work is therefore to investigate how the English idioms in O. Wilde's "The Picture of Dorian Gray" and I. Murdoch's "The Sea, the Sea" are translated into Lithuanian and how their stylistics and semantics change. The main research methods are contrastive analysis, statistical analysis, and analysis of the scholarly literature. The theoretical part presents the concepts of word-combination formation, idiomaticity, and idiom as formulated by various foreign and Lithuanian authors; distinguishes three main aspects of an idiom (meaning, structure, and function); presents 14 types of idioms in English; and sets out the main methods of idiom translation, the difficulties of translating idioms, the translator's aim, and the importance of translation quality. The practical part presents the translation of English idioms in the two novels and offers recommendations on how they could be better translated into Lithuanian. The results of the study confirm the hypothesis raised in the thesis that marked changes occur at the levels of idiom semantics, stylistics, and structure during the translation process.
15

Extraction and coordination in phrase structure grammar and categorial grammar

Morrill, Glyn Verden January 1989
A large proportion of computationally-oriented theories of grammar operate within the confines of monostratality (i.e. there is only one level of syntactic analysis), compositionality (i.e. the meaning of an expression is determined by the meanings of its syntactic parts, plus their manner of combination), and adjacency (i.e. the only operation on terminal strings is concatenation). This thesis looks at two major approaches falling within these bounds: that based on phrase structure grammar (e.g. Gazdar), and that based on categorial grammar (e.g. Steedman). The theories are examined with reference to extraction and coordination constructions; crucially a range of 'compound' extraction and coordination phenomena are brought to bear. It is argued that the early phrase structure grammar metarules can characterise operations generating compound phenomena, but in so doing require a categorial-like category system. It is also argued that while categorial grammar contains an adequate category apparatus, Steedman's primitives such as composition do not extend to cover the full range of data. A theory is therefore presented integrating the approaches of Gazdar and Steedman. The central issue as regards processing is derivational equivalence: the grammars under consideration typically generate many semantically equivalent derivations of an expression. This problem is addressed by showing how to axiomatise derivational equivalence, and a parser is presented which employs the axiomatisation to avoid following equivalent paths.
17

Distributional models of multiword expression compositionality prediction / Modèles distributionnels pour la prédiction de compositionnalité d’expressions polylexicales

Cordeiro, Silvio Ricardo 18 December 2017
Natural language processing systems often rely on the idea that language is compositional, that is, the meaning of a linguistic entity can be inferred from the meaning of its parts. This expectation fails in the case of multiword expressions (MWEs). For example, a person who is a "sitting duck" is neither a duck nor necessarily sitting. Modern computational techniques for inferring word meaning based on the distribution of words in the text have been quite successful at multiple tasks, especially since the rise of word embedding approaches. However, the representation of MWEs remains an open problem in the field. In particular, it is unclear how one could predict from corpora whether a given MWE should be treated as an indivisible unit (e.g. "nut case") or as some combination of the meaning of its parts (e.g. "engine room"). This thesis proposes a framework of MWE compositionality prediction based on representations of distributional semantics, which we instantiate under a variety of parameters. We present a thorough evaluation of the impact of these parameters on three new datasets of MWE compositionality, encompassing English, French and Portuguese MWEs. Finally, we present an extrinsic evaluation of the predicted levels of MWE compositionality on the task of MWE identification. Our results suggest that the proper choice of distributional model and corpus parameters can produce compositionality predictions that are comparable to the state of the art.
18

Compositional Matrix-Space Models: Learning Methods and Evaluation

Asaadi, Shima 13 October 2020
There has been a lot of research on machine-readable representations of words for natural language processing (NLP). One mainstream paradigm for word meaning representation comprises vector-space models obtained from the distributional information of words in text. Machine learning techniques have been proposed to produce such word representations for computational linguistic tasks. Moreover, the representation of multi-word structures, such as phrases, in vector space can arguably be achieved by composing the distributional representations of the constituent words. To this end, mathematical operations have been introduced as composition methods in vector space. An alternative approach to word representation and semantic compositionality in natural language has been compositional matrix-space models. In this thesis, two research directions are considered. In the first, considering compositional matrix-space models, we explore word meaning representations and semantic composition of multi-word structures in matrix space. The main motivation for working on these models is that they have shown superiority over vector-space models regarding several properties. The most important property is that the composition operation in matrix-space models can be defined as standard matrix multiplication; in contrast to common vector-space composition operations, this is sensitive to word order in language. We design and develop machine learning techniques that induce continuous and numeric representations of natural language in matrix space. The main goal in introducing representation models is enabling NLP systems to understand natural language to solve multiple related tasks. Therefore, first, different supervised machine learning approaches are proposed to train word meaning representations and capture the compositionality of multi-word structures using the matrix multiplication of words. The performance of matrix representation models learned by machine learning techniques is investigated in solving two NLP tasks, namely, sentiment analysis and compositionality detection. Then, techniques for learning matrix-space models are proposed that introduce generic task-agnostic representation models, also called word matrix embeddings. In these techniques, word matrices are trained using the distributional information of words in a given text corpus. We show the effectiveness of these models in the compositional representation of multi-word structures in natural language. The second research direction in this thesis explores effective approaches for evaluating the capability of semantic composition methods to capture the meaning representation of compositional multi-word structures, such as phrases. A common evaluation approach is examining the ability of the methods to capture the semantic relatedness between linguistic units. The underlying assumption is that the more accurately a method of semantic composition can determine the representation of a phrase, the more accurately it can determine the relatedness of that phrase with other phrases. To apply the semantic relatedness approach, gold standard datasets have been introduced. In this thesis, we identify the limitations of the existing datasets and develop a new gold standard semantic relatedness dataset, which addresses the issues of the existing datasets. The proposed dataset allows us to evaluate meaning composition in vector- and matrix-space models.
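The word-order property mentioned in this abstract can be illustrated with a small sketch. It assumes vector addition as the representative vector-space composition and uses hypothetical 2x2 word matrices and 2-dimensional word vectors; none of the values or names come from the thesis.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def vec_add(u, v):
    """Element-wise vector addition, a common order-insensitive composition."""
    return [x + y for x, y in zip(u, v)]


# Hypothetical toy representations, for illustration only.
word_matrix = {
    "dog": [[1, 2], [0, 1]],
    "bites": [[0, 1], [1, 0]],
}
word_vector = {
    "dog": [1, 2],
    "bites": [0, 1],
}

# Matrix-space composition: the phrase is the product of its word matrices.
dog_bites = matmul(word_matrix["dog"], word_matrix["bites"])
bites_dog = matmul(word_matrix["bites"], word_matrix["dog"])

# Vector-space composition by addition gives the same result in either order.
dog_bites_vec = vec_add(word_vector["dog"], word_vector["bites"])
bites_dog_vec = vec_add(word_vector["bites"], word_vector["dog"])

print(dog_bites, bites_dog)          # the two products differ
print(dog_bites_vec, bites_dog_vec)  # the two sums are identical
```

Because matrix multiplication is not commutative in general, "dog bites" and "bites dog" receive different representations, whereas additive composition cannot tell them apart.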
19

Lexikální idiomy v angličtině / Lexical idioms in English

Vašků, Kateřina January 2018
According to the standard definition, phraseology deals with multi-word lexical units, i.e. word combinations. Voices claiming that even complex words composed of two or more meaningful units may qualify for the status of (lexical) phrasemes/idioms, especially when their meaning is non-compositional, are still very isolated, in spite of the fact that linguistic literature is teeming with references to idiomatic compounds and derivatives (Chap. 3). In fact, the only systematic treatment of lexical idioms seems to be that offered by Čermák (2007), who focuses primarily on lexical idioms in Czech. The aim of the thesis is therefore to explore the situation in English and attempt to develop a useful definition of, and especially criteria for, distinguishing lexical idioms from other complex lexemes and provide an outline of the main types of lexical idioms obtaining in English. After an introduction (Chap. 1) and the presentation of state-of-the-art approaches to phraseology and the relevant information about phraseological units and their features (Chap. 2), the thesis reviews Čermák's theory of lexical idioms which inspired their quantitative study in Czech (Chap. 4). The core part is the analysis of two samples. The first one, gathered from the BNC, includes a random selection of 1000 single-word...
20

Semântica-I: questões fundacionais

Silva, Adriano Marques da 15 July 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
The problem addressed in this thesis can be formulated as follows: what is the relationship between the notion of an internalized linguistic competence, as conceived by the generative program, and a semantic theory? In other words, what is the extent and scope of a semantic theory consistent with the theoretical assumptions and the syntactic model assumed by the generative program? Two approaches are compared: the denotational approach, according to which syntactic derivations are inputs to truth-conditional interpretation, and the intensional approach, according to which syntactic derivations constrain, but do not determine, truth conditions. I argue that the first approach leads us to a dilemma: if the semantic structure is isomorphic to the syntagmatic structure, we multiply the terms of explanation without explanatory gain. If there is no isomorphism, we have even more serious problems, because we could not explain the explanatory success of certain syntactic principles (such as the asymmetry between external argument and internal argument, for example). Thus, this proposal does not provide the proper kind of idealization; it is not able to extend the positive heuristic of the generative program. I argue that the second proposal, by contrast, increases the positive heuristic of the program because it is able to explain (and not simply redescribe) important empirical generalizations discovered by the generative program over the years. I argue that the formulation of an I-semantics necessarily requires a revision of traditional and tacitly accepted assumptions regarding the nature of the formal study of natural-language semantics. I-semantics deals with the etiology of the computational principles underlying interface phenomena, not with the implementation of these operations, that is, with the way sentences can be used to make true or false assertions.
