Global ETD Search

1	A metáfora e a sua representação em sistemas de processamento automático de línguas naturais [Metaphor and its representation in natural language processing systems]
Oliveira, Ana Eliza Barbosa de [UNESP], 14 March 2006
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
This MS thesis (i) studies metaphor per se (as opposed to, for example, an applied study of metaphor) from a linguistic point of view, that is, metaphor as an expression of natural language, and (ii) investigates a formal representation of metaphor for implementation in Natural Language Processing systems. The methodology, which is set in an interdisciplinary context, focuses on two domains: a Cognitive-Linguistic Domain, in which we investigate the linguistic expression of metaphor and its cognitive support, i.e., metaphor as a product of both linguistic and nonlinguistic resources; and a Computational-Linguistic Domain, in which we investigate a formal representation of metaphor production and interpretation for computational purposes. The theoretical approaches that delimit the scope of this work are the rhetorical-philosophical, interactionist, semantic, pragmatic, cognitive, and computational approaches to metaphor.
Keywords: Linguistics; Metaphor; Natural language processing; Cognitive-linguistic domain; Computational-linguistic domain; WordNet; Formal representation
2	Enfrentamento do problema das divergências de tradução por um sistema de tradução automática: um exercício exploratório [Tackling the problem of translation divergences in a machine translation system: an exploratory exercise]
Oliveira, Mirna Fernanda de [UNESP], 25 April 2006
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
This dissertation develops an exploratory linguistic and computational study of a specific problem that machine translation systems must face: translation divergences, whether syntactic or lexical-semantic, that arise between pairs of sentences in different natural languages. To this end, the work draws on the interdisciplinary research methodology for NLP (Natural Language Processing) developed by Dias-da-Silva (1996, 1998 and 2003) and on the computational linguistic theory behind UNITRAN, a machine translation system developed by Dorr (1993), which is in turn based on Chomsky's syntactic theory of Government and Binding (1981) and Jackendoff's semantic theory of Conceptual Structures (1990). As a contribution to the field of NLP, the dissertation describes the composition and operation of UNITRAN, which was designed to deal with part of the problem posed by translation divergences, and illustrates the possibility of including Brazilian Portuguese in the system by examining certain kinds of divergences found between English and Brazilian Portuguese sentences.
Keywords: Linguistics; Translation (electronic processes); Machine translation; Translation; NLP (Natural Language Processing); Computational linguistic theory; Machine translation system
3	Investigação de estratégias de seleção de conteúdo baseadas na UNL (Universal Networking Language) [Investigation of content selection strategies based on UNL (Universal Networking Language)]
Chaud, Matheus Rigobelo, 03 March 2015
Funding: Financiadora de Estudos e Projetos
The field of Natural Language Processing (NLP) has witnessed increased attention to Multilingual Multidocument Summarization (MMS), whose goal is to process a cluster of source documents in more than one language and generate a summary of this collection in one of the target languages. In MMS, the selection of sentences from source texts for summary generation may be based on either shallow or deep linguistic features. The purpose of this research was to investigate whether the use of deep knowledge, obtained from a conceptual representation of the source texts, could be useful for content selection in texts within the newspaper genre. In this study, we used a formal representation system, the UNL (Universal Networking Language). In order to investigate content selection strategies based on this interlingua, 3 clusters of texts were represented in UNL, each consisting of 1 text in Portuguese, 1 text in English and 1 human-written reference summary. Additionally, in each cluster, the sentences of the source texts were aligned to the sentences of their respective human summaries, in order to identify total or partial content overlap between these sentences. The data collected allowed a comparison between content selection strategies based on conceptual information and a traditional selection method based on a superficial feature: the position of the sentence in the source text. According to the results, content selection based on sentence position was more closely correlated with the selection made by the human summarizer than the conceptual methods investigated. Furthermore, the sentences at the beginning of the source texts, which in newspaper articles usually convey the most relevant information, did not necessarily contain the most frequent concepts in the text collection; on several occasions, the sentences with the most frequent concepts were in the middle or at the end of the text. These results indicate that, at least in the clusters analyzed, criteria other than concept frequency help determine the relevance of a sentence. In other words, content selection in human multidocument summarization may not be limited to selecting the sentences with the most frequent concepts; it appears to be a much more complex process.
Keywords: Applied linguistics; Automatic summarization; Content selection strategies; Multilingual multidocument summarization; Natural language processing; Knowledge representation systems; Universal Networking Language; Content selection; Linguistics, Letters and Arts::Linguistics
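The comparison this abstract describes, between selecting sentences by position and selecting them by concept frequency, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the representation of each sentence as a plain list of concept labels (standing in for concepts extracted from UNL graphs), the overlap measure, and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter

def select_by_position(sentences, k):
    """Shallow baseline: pick the first k sentences of the source text."""
    return list(range(min(k, len(sentences))))

def select_by_concept_frequency(sentences, k):
    """Rank sentences by the summed frequency of their concepts.

    A concept's frequency is the number of sentences it occurs in.
    Ties are broken in favour of the earlier sentence.
    """
    freq = Counter(c for concepts in sentences for c in set(concepts))
    scores = [sum(freq[c] for c in set(concepts)) for concepts in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])

def overlap(selected, reference):
    """Fraction of human-aligned sentences recovered by a selection."""
    reference = set(reference)
    if not reference:
        return 0.0
    return len(reference & set(selected)) / len(reference)

# Toy data (invented): each sentence is a list of concept labels.
sentences = [
    ["storm", "city"],
    ["damage", "road"],
    ["storm", "damage", "city", "rescue"],
    ["rescue", "team"],
]
# Hypothetical indices of source sentences aligned to the human summary.
human_aligned = [0, 1]

by_position = select_by_position(sentences, 2)
by_concepts = select_by_concept_frequency(sentences, 2)
print("position:", by_position, overlap(by_position, human_aligned))
print("concepts:", by_concepts, overlap(by_concepts, human_aligned))
```

In this toy case the sentence richest in frequent concepts sits in the middle of the text, so the two strategies pick different sentences. The simple overlap ratio is only a stand-in for whatever correlation measure the study actually used.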
