1 |
Descrição Lexicográfica do Frame de Destruição Sob a Ótica da Semântica de Frames [A Lexicographic Description of the Destroying Frame from the Perspective of Frame Semantics]
RAMOS, T. 06 May 2011
This work provides a lexicographic description of the DESTRUIÇÃO (Destroying) frame from the perspective of Frame Semantics.
The research corpus comprises ten lexical units (aniquilar, arrasar, demolir, desfazer, desmantelar, desmontar, destruir, devastar, explodir, vaporizar) that evoke the Destroying frame. The examples containing the lexical units (LUs) were drawn from journalistic texts and film subtitles in Brazilian Portuguese, selected from five publicly available corpora (ANCIB, ECI-EBR, NILC/SÃO CARLOS, NURC-R and LEGENDA DE FILMES) and through two search tools, LINGUATECA and SKETCH ENGINE BETA. The analysis of the Destroying frame starts from the texts collected from these corpora and search tools, following the criteria described in "The Book" by Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., Johnson, C. R. and Scheffczyk (2006), available for download from the FrameNet website, and established as the methodology of the FrameNet pilot project at Berkeley. The analysis was based on labeling sentences, mirroring the annotation of the corresponding English frame (Destroying), which can be found on the FrameNet.com.br page. This syntactic and semantic labeling produces different patterns of occurrence, which arise from the different pragmatic and semantic information present in each analyzed sentence. The labels served as the basis for classifying the lexical units, which were divided into layers according to Frame Elements (FE), Grammatical Function (GF) and Phrase Type (PT).
These labels, based on the sentences, on the description of the layers (FE, GF, PT) and on valences, yield the patterns that exhaustively reveal the occurrences of the LUs in question. The labels also yield the markings of place and time, which are explicitly determined in the frame.
In this work, the Destroying frame was found to vary in the focus of the destruction scene: sometimes the destroyer or cause is foregrounded, and sometimes the undergoer of the destruction.
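The three annotation layers named in this abstract (Frame Element, Grammatical Function, Phrase Type) can be pictured as three labels attached to each annotated span, with a sentence's valence pattern being the sequence of those triples. The sketch below is only an illustration under stated assumptions: the sentences are invented rather than taken from the thesis corpus, and the label names (`Cause`, `Destroyer`, `Undergoer`, `Ext`, `Obj`, `Dep`, `NP`, `PP`) are illustrative choices, not the thesis's actual tag set.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    text: str  # annotated span
    fe: str    # Frame Element layer
    gf: str    # Grammatical Function layer
    pt: str    # Phrase Type layer


def valence_pattern(labels):
    """Return the FE.GF.PT triples of one annotated sentence as a tuple."""
    return tuple(f"{label.fe}.{label.gf}.{label.pt}" for label in labels)


# Invented sentences for the LU 'destruir' (hypothetical, not corpus data).
sentences = {
    "O terremoto destruiu a cidade": [
        Label("O terremoto", "Cause", "Ext", "NP"),
        Label("a cidade", "Undergoer", "Obj", "NP"),
    ],
    "Os soldados destruíram a ponte": [
        Label("Os soldados", "Destroyer", "Ext", "NP"),
        Label("a ponte", "Undergoer", "Obj", "NP"),
    ],
    "A cidade foi destruída pelo terremoto": [
        Label("A cidade", "Undergoer", "Ext", "NP"),
        Label("pelo terremoto", "Cause", "Dep", "PP"),
    ],
}

# Count how often each valence pattern occurs across the annotated sentences.
pattern_counts = Counter(valence_pattern(labels) for labels in sentences.values())
for pattern, count in pattern_counts.items():
    print(count, " + ".join(pattern))
```

In this toy data the passive sentence puts the undergoer in the external-argument position, which is one way the shift of focus between destroyer/cause and undergoer could surface in the patterns.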
Keywords: Semantics, FrameNet, Lexicography.
|
2 |
Stratégie domaine par domaine pour la création d'un FrameNet du français : annotations en corpus de cadres et rôles sémantiques / Domain by domain strategy for creating a French FrameNet : corpus annotations of semantics frames and roles
Djemaa, Marianne 14 June 2017
This thesis describes the creation of the French FrameNet (FFN), a French-language FrameNet-type resource built from the Berkeley FrameNet (Baker et al., 1998) and two treebanks: the French Treebank (Abeillé et al., 2003) and the Sequoia Treebank (Candito and Seddah, 2012). The Berkeley FrameNet provides a model for the semantic annotation of prototypical situations and their participants. It consists of: a) a structured set of prototypical situations, called frames, which incorporate semantic characterizations of the situations' participants (Frame Elements, or FEs); b) a lexicon of lexical units (LUs) that can evoke those frames; c) a set of English-language frame annotations. In order to create the FFN, we designed a "domain by domain" methodology: we defined four "domains", each centered on a specific notion (cause, verbal communication, cognitive stance, or commercial transaction). We then sought to obtain full frame and lexical coverage for these domains, and annotated the first 100 corpus occurrences of each LU in our domains. This strategy guarantees greater consistency in frame structuring than other approaches and is conducive to work on both intra-domain and inter-domain frame polysemy. Annotating frames on continuous text, without selecting particular LU occurrences, preserves the natural distribution of the lexical and syntactic characteristics of frame-evoking elements in our corpus. At present, the FFN includes 105 distinct frames and 873 distinct LUs, which combine into 1,109 LU-frame pairs (i.e. 1,109 senses). In total, 16,167 frame occurrences, together with their FEs, have been annotated in our corpus.
In this thesis, I first situate the FrameNet model in a larger theoretical background. I then justify the choice of the Berkeley FrameNet as our base resource and explain why we used a domain-by-domain methodology. Next, I clarify some specific BFN notions that we found too vague to be applied coherently in building the FFN. Specifically, I introduce more directly syntactic criteria both for defining a frame's lexical perimeter and for differentiating core FEs from non-core ones. I then describe the creation of the FFN itself, first by delimiting the structure of frames used in the resource and then by creating a lexicon for these frames. I present in detail the Cognitive Stances notional domain, which includes frames concerning a cognizer's degree of certainty about some particular content. Next, I describe our methodology for annotating a corpus with frames and FEs, and analyze our treatment of several linguistic phenomena that required additional consideration (such as object complement constructions). Finally, I give quantitative information about the current status of the FFN and its evaluation, and conclude with some perspectives on improving and exploiting the resource.
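The figures in this abstract treat each distinct trigger-frame (LU-frame) pair as one sense, so a trigger that evokes two frames counts as two senses. The short sketch below shows that counting on a toy lexicon and then derives two ratios from the reported totals. The toy triggers and frame names are invented for illustration and are not taken from the FFN.

```python
def lexicon_stats(pairs):
    """Count distinct triggers, frames and trigger-frame pairs (senses)."""
    unique_pairs = set(pairs)
    triggers = {trigger for trigger, _ in unique_pairs}
    frames = {frame for _, frame in unique_pairs}
    return len(triggers), len(frames), len(unique_pairs)


# Toy lexicon; trigger-to-frame assignments are hypothetical.
toy_pairs = [
    ("penser", "Opinion"),
    ("penser", "Cogitation"),  # same trigger, second frame: a second sense
    ("croire", "Opinion"),
    ("vendre", "Commerce_sell"),
]
n_triggers, n_frames, n_senses = lexicon_stats(toy_pairs)
print(n_triggers, n_frames, n_senses)

# Ratios derived from the totals reported in the abstract.
senses_per_trigger = 1109 / 873
annotations_per_sense = 16167 / 1109
print(round(senses_per_trigger, 2), round(annotations_per_sense, 1))
```

On the reported totals this gives roughly 1.27 senses per trigger and about 14.6 annotated occurrences per sense. These are averages only and say nothing about how occurrences are distributed across senses.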
|
3 |
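ANÁLISE DESCRITIVO-LEXICAL DO FRAME EXPERIÊNCIA DE DANO CORPORAL SOB A ÓTICA DA SEMÂNTICA DE FRAMES [A Descriptive-Lexical Analysis of the Experience of Bodily Harm Frame from the Perspective of Frame Semantics]
SILVA, M. T. 09 December 2011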
This investigation is part of the FrameNet Brasil Project and takes as its theoretical basis Frame Semantics, a line of study established by Fillmore (2006), whose central postulate is that linguistic meanings are related to conceptual scenes. This theoretical approach rests on the notion of frames: conceptual schemas organized in such a way that understanding any one of their constituent elements requires understanding all the others. From this perspective, language is seen as a phenomenon interconnected with other human abilities. The research describes the linguistic occurrences of the Experiência de dano corporal (Experience of Bodily Harm) frame and its elements, considering some of the verbal lexical units that evoke it, namely: fraturar, machucar, contundir, torcer, cortar, quebrar, queimar, deslocar and distender. The aim of the work is to contribute to the database of the FrameNet Brasil project, whose focus is to make available the frame network of Brazilian Portuguese. As methodology, the guidelines available for the project were used (cf. Ruppenhofer et al. (2006)), which provide for the search for occurrences in corpora of Portuguese as spoken in Brazil and for lexicographic annotation. This annotation documents the syntactic and semantic combinatorial properties (the valences) of the lexical unit by means of semantic labeling, that is, the labeling of core and non-core frame elements and of their grammatical constituents (phrase types and grammatical functions).
Keywords: Frame Semantics; FrameNet; Experience of Bodily Harm frame.
|
4 |
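O desenvolvimento da plataforma FrameNet Brasil: descrição lexicográfica das unidades lexicais que evocam a cena de corte dentro do projeto da FrameNet Brasil [The development of the FrameNet Brasil platform: lexicographic description of the lexical units that evoke the cutting scene within the FrameNet Brasil project]
Marques, Renata Cristina de Barros Vieira 10 November 2009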
This work is associated with the research project for implementing the FrameNet Brasil Project (SALOMÃO, 2008) and aims at the lexicographic description of eight Lexical Units (LUs) that evoke the CUT scene: aparar v., aparado adj., cortar v., fatiar v., ralar v., ralado v., recortar v., serrar v. The research was based on corpus evidence gathered from sixteen corpora belonging to the LINGUATECA project, and seeks to document the syntactic and semantic combinatorial properties (the so-called valences) of these LUs. Drawing on the theoretical framework of Frame Semantics and set within the agenda of sociocognitive studies, the research follows the model of the American FrameNet project. After the frame is described and its core and non-core Frame Elements (FEs) are defined, corpus research is carried out, and the sentences are then annotated on three main layers: Frame Element, Grammatical Function and Phrase Type. The analyses are intended to contribute to building the Brazilian Portuguese counterpart of the FrameNet semantic network.
|
5 |
A general purpose semantic parser using FrameNet and WordNet®.
Shi, Lei 05 1900
Syntactic parsing is one of the best understood language processing applications. Since language and grammar have been formally defined, it is easy for computers to parse the syntactic structure of natural language text. Does meaning have structure as well? If it has, how can we analyze the structure? Previous systems rely on a one-to-one correspondence between syntactic rules and semantic rules. But such systems can only be applied to limited fragments of English. In this thesis, we propose a general-purpose shallow semantic parser which utilizes a semantic network (WordNet), and a frame dataset (FrameNet). Semantic relations recognized by the parser are based on how human beings represent knowledge of the world. Parsing semantic structure allows semantic units and constituents to be accessed and processed in a more meaningful way than syntactic parsing, moving the automation of understanding natural language text to a higher level.
|
6 |
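Modelagem linguístico-computacional das relações entre construções e frames no Constructicon da FrameNet Brasil [Linguistic-computational modeling of the relations between constructions and frames in the FrameNet Brasil Constructicon]
Lage, Ludmila Meireles 31 January 2018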
Funding: FAPEMIG (Fundação de Amparo à Pesquisa do Estado de Minas Gerais).
This work presents the theoretical-methodological discussions that support the linguistic-computational modeling of the relations between constructions, and between constructions and frames, in the FrameNet Brasil Constructicon. After the resource was implemented, having been developed to account for phenomena not captured by the lexicographic analyses provided by FrameNet, it was necessary to turn it into a network. Since constructions are conceived as cognitive constructs that participate in relational networks, the Inheritance relation (KAY & FILLMORE, 1999) was modeled to structure the network of constructions. Another important advance was the Evokes relation, which captures the cases in which a construction evokes a frame and connects the two so as to make the relation between them explicit. Constructions and frames, two of the most important theoretical constructs of Cognitive Linguistics, have indeed been the focus of studies on the relations they establish with each other. In addition, since the Constructicon was developed in parallel with FrameNet, it would have been a waste not to connect the two databases. However, some aspects of constructions go beyond the generalizations captured by inheritance and by the semantic import represented in terms of frames. Moreover, since the FrameNet Brasil Constructicon is aimed primarily at language technology tasks, the resource needed to provide information readable not only by humans but also by machines. Thus, the modeling of the Inceptive Aspectual (SIGILIANO, 2012, 2013) and Dative with Infinitive (TORRENT, 2009; LAVIOLA 2015) constructions showed the need to add mechanisms to the Constraint Editor to account for these aspects. As a result, it is now possible to register the lexical units (LUs) that can fill a construction element. This property can be implemented at three different levels: only specific LUs can fill a given slot; all LUs evoking a frame can fill the slot; or a frame family is accepted in the slot. Through the implementation of the relations described, this work contributes to the progress of the constructional resource and presents the mechanisms needed for various computational applications, while operationalizing, in the computational domain, the interdependence between frames and constructions that descriptive studies in Cognitive Linguistics have long pointed out.
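The three levels at which a construction slot can be constrained (specific LUs, all LUs evoking a frame, or a family of frames) can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the Constraint Editor's actual data model: the class names `FrameNetToy` and `SlotConstraint`, the family name `Start_family`, and the frame and LU assignments are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FrameNetToy:
    """Tiny stand-in for a frame database (all contents are invented)."""
    lus_by_frame: dict                            # frame name -> set of LUs
    families: dict = field(default_factory=dict)  # family name -> set of frame names

    def frames_of(self, lu):
        """Return the set of frames that the given LU evokes."""
        return {frame for frame, lus in self.lus_by_frame.items() if lu in lus}


@dataclass
class SlotConstraint:
    level: str     # "lu", "frame" or "family"
    value: object  # a set of LUs, a frame name, or a family name

    def accepts(self, lu, db):
        """Check whether an LU may fill the slot under this constraint."""
        if self.level == "lu":
            return lu in self.value
        if self.level == "frame":
            return lu in db.lus_by_frame.get(self.value, set())
        if self.level == "family":
            family_frames = db.families.get(self.value, set())
            return bool(db.frames_of(lu) & family_frames)
        raise ValueError(f"unknown constraint level: {self.level}")


# Hypothetical data: placeholder frames and LUs, not FrameNet Brasil content.
db = FrameNetToy(
    lus_by_frame={
        "Activity_start": {"começar", "iniciar"},
        "Process_start": {"começar"},
        "Motion": {"ir"},
    },
    families={"Start_family": {"Activity_start", "Process_start"}},
)

only_comecar = SlotConstraint("lu", {"começar"})
any_activity_start = SlotConstraint("frame", "Activity_start")
any_start = SlotConstraint("family", "Start_family")

print(only_comecar.accepts("iniciar", db))
print(any_activity_start.accepts("iniciar", db))
print(any_start.accepts("ir", db))
```

Each level is strictly more permissive than the one before it when the frame belongs to the family, which is why a single `accepts` check with a level tag is enough for this sketch.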
|
7 |
As construções interrogativas QU- no Constructicon da FrameNet Brasil [Wh- interrogative constructions in the FrameNet Brasil Constructicon]
Marção, Natália Duarte 10 September 2018
Funding: CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior).
This work is part of the Multilingual Knowledge Base project, or simply m.knob (www.mknob.com), which aims to create a web application and is being developed at the FrameNet Brasil Laboratory of Computational Linguistics, FN-Br (SALOMÃO, 2009). More generally, FN-Br has been exploring the implementation of Frame Semantics and Construction Grammar through the creation of computational linguistic resources, such as the Lexicon and the Constructicon of Brazilian Portuguese (PB), a repertoire of constructions available online. This Master's thesis is grounded in theoretical assumptions of Cognitive Linguistics, such as Frame Semantics (FILLMORE, 1982) and Berkeley Construction Grammar (KAY & FILLMORE, 1999), and follows the analytical methodology of Berkeley FrameNet (FILLMORE et al., 2003). In this context, the thesis presents the description and the linguistic-computational modeling (cf. DIAS-DA-SILVA, 1996) of interrogative Wh-constructions (QU- constructions) in Brazilian Portuguese in the constructional base of FrameNet Brasil, which supports the m.knob web application. The research is motivated by the fact that the chatbot used in the application, a tourist-attraction recommendation interface based on natural language understanding, does not support interaction through interrogative sentences. Describing and modeling Wh-constructions is therefore necessary so that users can ask questions to obtain additional information about the recommended attractions while interacting with the application. To support this new functionality, the thesis proposes a model of eleven Wh-constructions in Brazilian Portuguese. Finally, a proof-of-concept test is applied to evaluate the proposed model.
|
8 |
Criteria for the validation of specialized verb equivalents : application in bilingual terminography
Pimentel, Janine 05 1900
Multilingual terminological resources do not always include valid equivalents of legal terms for two main reasons. Firstly, legal systems can differ from one language community to another and even from one country to another because each has its own history and traditions. As a result, the non-isomorphism between legal and linguistic systems may render the identification of equivalents a particularly challenging task. Secondly, by focusing primarily on the definition of equivalence, a notion widely discussed in translation but not in terminology, the literature does not offer solid and systematic methodologies for assigning terminological equivalents. As a result, there is a lack of criteria to guide both terminologists and translators in the search and validation of equivalent terms.
This problem is even more evident in the case of predicative units, such as verbs. Although some terminologists (L'Homme 1998; Lerat 2002; Lorente 2007) have worked on specialized verbs, terminological equivalence between units that belong to this part of speech would benefit from a thorough study. By proposing a novel methodology for assigning the equivalents of specialized verbs, this research aims to define validation criteria for this kind of predicative unit, so as to contribute to a better understanding of the phenomenon of terminological equivalence as well as to the development of multilingual terminography in general, and of legal terminography in particular.
The study uses a Portuguese-English comparable corpus that consists of a single genre of texts, i.e. Supreme Court judgments, from which 100 Portuguese and 100 English specialized verbs were selected. The description of the verbs is based on the theory of Frame Semantics (Fillmore 1976, 1977, 1982, 1985; Fillmore and Atkins 1992), on the FrameNet methodology (Ruppenhofer et al. 2010), as well as on the methodology for compiling specialized lexical resources, such as DiCoInfo (L'Homme 2008), developed in the Observatoire de linguistique Sens-Texte at the Université de Montréal. The research reviews contributions that have applied the same theoretical and methodological framework to the compilation of lexical resources and proposes adaptations to the specific objectives of the project.
In contrast to the top-down approach adopted by FrameNet lexicographers, the approach described here is bottom-up, i.e. verbs are first analyzed and then grouped into frames for each language separately. Specialized verbs are said to evoke a semantic frame, a sort of conceptual scenario in which a number of mandatory elements (core Frame Elements) play specific roles (e.g. ARGUER, JUDGE, LAW), but specialized verbs are often accompanied by other optional information (non-core Frame Elements), such as the criteria and reasons used by the judge to reach a decision (statutes, codes, previous decisions). The information concerning the semantic frame that each verb evokes was encoded in an XML editor, and about twenty contexts illustrating the specific way each specialized verb evokes a given frame were semantically and syntactically annotated. The labels attributed to each semantic frame (e.g. [Compliance], [Verdict]) were used to group together certain synonyms, antonyms as well as equivalent terms.
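The bottom-up step described here, where verbs are grouped under frame labels in each language separately and the shared labels then bring candidate equivalents together, can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's procedure: the verb-to-frame assignments are invented, and only the labels [Compliance] and [Verdict] come from the abstract.

```python
from collections import defaultdict
from itertools import product


def group_by_frame(verb_frames):
    """Group analyzed verbs under the frame label each one was assigned."""
    groups = defaultdict(list)
    for verb, frame in verb_frames:
        groups[frame].append(verb)
    return groups


def candidate_equivalents(pt_verbs, en_verbs):
    """Pair verbs across languages whenever they share a frame label."""
    pt_groups = group_by_frame(pt_verbs)
    en_groups = group_by_frame(en_verbs)
    pairs = []
    for frame in sorted(pt_groups.keys() & en_groups.keys()):
        for pt_verb, en_verb in product(pt_groups[frame], en_groups[frame]):
            pairs.append((frame, pt_verb, en_verb))
    return pairs


# Hypothetical assignments, made per language before any cross-language pairing.
pt = [("absolver", "Verdict"), ("condenar", "Verdict"), ("cumprir", "Compliance")]
en = [("acquit", "Verdict"), ("comply", "Compliance")]

for frame, pt_verb, en_verb in candidate_equivalents(pt, en):
    print(f"[{frame}] {pt_verb} ~ {en_verb}")
```

Sharing a frame label only makes a pair a candidate: in this toy data "condenar" and "acquit" land in the same group although they are antonyms, which is why further validation criteria are needed before a pair counts as an equivalent.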
The research identified 165 pairs of candidate equivalents among the 200 Portuguese and English terms, which were grouped together into 76 frames. 71% of the pairs were considered full equivalents because the verbs evoke the same conceptual scenario and their actantial structures, the linguistic realizations of the actants and their syntactic patterns are similar. 29% of the pairs did not entirely meet these criteria and were considered partial equivalents. Reasons for partial equivalence are provided along with illustrative examples. Finally, the study describes the semasiological and onomasiological entry points that JuriDiCo, the bilingual lexical resource compiled during the project, offers to future users.
|
9 |
Criteria for the validation of specialized verb equivalents: application in bilingual terminography. Pimentel, Janine 05 1900 (has links)
Multilingual terminological resources do not always include valid equivalents of legal terms for two main reasons. Firstly, legal systems can differ from one language community to another and even from one country to another because each has its own history and traditions. As a result, the non-isomorphism between legal and linguistic systems may render the identification of equivalents a particularly challenging task. Secondly, by focusing primarily on the definition of equivalence, a notion widely discussed in translation but not in terminology, the literature does not offer solid and systematic methodologies for assigning terminological equivalents. As a result, there is a lack of criteria to guide both terminologists and translators in the search and validation of equivalent terms.
This problem is even more evident in the case of predicative units, such as verbs. Although some terminologists (L'Homme 1998; Lerat 2002; Lorente 2007) have worked on specialized verbs, terminological equivalence between units that belong to this part of speech would benefit from a thorough study. By proposing a novel methodology to assign the equivalents of specialized verbs, this research aims to define validation criteria for this kind of predicative unit, so as to contribute to a better understanding of the phenomenon of terminological equivalence as well as to the development of multilingual terminography in general, and of legal terminography in particular.
The study uses a Portuguese-English comparable corpus that consists of a single genre of texts, i.e. Supreme Court judgments, from which 100 Portuguese and 100 English specialized verbs were selected. The description of the verbs is based on the theory of Frame Semantics (Fillmore 1976, 1977, 1982, 1985; Fillmore and Atkins 1992), on the FrameNet methodology (Ruppenhofer et al. 2010), as well as on the methodology for compiling specialized lexical resources, such as DiCoInfo (L'Homme 2008), developed at the Observatoire de linguistique Sens-Texte at the Université de Montréal. The research reviews contributions that have applied the same theoretical and methodological framework to the compilation of lexical resources and proposes adaptations to the specific objectives of the project.
In contrast to the top-down approach adopted by FrameNet lexicographers, the approach described here is bottom-up, i.e. verbs are first analyzed and then grouped into frames for each language separately. Specialized verbs are said to evoke a semantic frame, a sort of conceptual scenario in which a number of mandatory elements (core Frame Elements) play specific roles (e.g. ARGUER, JUDGE, LAW). In text, specialized verbs are often accompanied by other optional information (non-core Frame Elements), such as the criteria and reasons used by the judge to reach a decision (statutes, codes, previous decisions). The information concerning the semantic frame that each verb evokes was encoded in an XML editor, and about twenty contexts illustrating the specific way each specialized verb evokes a given frame were semantically and syntactically annotated. The labels attributed to each semantic frame (e.g. [Compliance], [Verdict]) were used to group together certain synonyms, antonyms, and equivalent terms.
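The bottom-up grouping described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the JuriDiCo implementation: the `VerbEntry` class, the two helper functions, the sample verbs, and their frame and Frame Element assignments are all hypothetical. Only the frame labels [Compliance] and [Verdict] and the role names JUDGE and LAW come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class VerbEntry:
    """One specialized verb and the frame it evokes (hypothetical record)."""
    lemma: str
    language: str             # "pt" or "en"
    frame: str                # label of the evoked semantic frame
    core_fes: tuple = ()      # mandatory Frame Elements
    non_core_fes: tuple = ()  # optional Frame Elements


def group_by_frame(entries):
    """Group the verbs of each language under the frame label they evoke."""
    frames = defaultdict(lambda: defaultdict(list))
    for entry in entries:
        frames[entry.frame][entry.language].append(entry.lemma)
    return frames


def candidate_equivalents(frames, source="pt", target="en"):
    """Pair each source-language verb with each target-language verb of the same frame."""
    pairs = []
    for frame, by_lang in sorted(frames.items()):
        for src, tgt in product(by_lang.get(source, []), by_lang.get(target, [])):
            pairs.append((frame, src, tgt))
    return pairs


# Illustrative data only; these assignments are assumptions, not taken from JuriDiCo.
entries = [
    VerbEntry("cumprir", "pt", "Compliance", ("PROTAGONIST", "LAW")),
    VerbEntry("comply", "en", "Compliance", ("PROTAGONIST", "LAW")),
    VerbEntry("absolver", "pt", "Verdict", ("JUDGE", "DEFENDANT")),
    VerbEntry("acquit", "en", "Verdict", ("JUDGE", "DEFENDANT")),
]

pairs = candidate_equivalents(group_by_frame(entries))
for frame, pt_verb, en_verb in pairs:
    print(f"[{frame}] {pt_verb} ~ {en_verb}")
```

Sharing a frame label only makes two verbs candidate equivalents; deciding whether a pair is a full or a partial equivalent would still require comparing actantial structures, the realizations of the actants, and syntactic patterns, as the abstract goes on to explain.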
The research identified 165 pairs of candidate equivalents among the 200 Portuguese and English terms that were grouped together into 76 frames. 71% of the pairs were considered full equivalents because the verbs evoke the same conceptual scenario and their actantial structures, the linguistic realizations of their actants, and their syntactic patterns are similar. 29% of the pairs did not entirely meet these criteria and were considered partial equivalents. Reasons for partial equivalence are provided along with illustrative examples. Finally, the study describes the semasiological and onomasiological entry points that JuriDiCo, the bilingual lexical resource compiled during the project, offers to future users.
|
10 |
O tratamento da relação semântica partitiva em recurso lexical dedicado à legibilidade por máquina: estendendo a anotação da FrameNet / The treatment of the partitive semantic relation in lexical resource dedicated to machine readability: extending FrameNet annotation. Campos, Julia Aparecida Gonçalves 07 July 2017 (has links)
Submitted by Renata Lopes (renatasil82@gmail.com) on 2017-11-06T14:10:40Z
Approved for entry into archive by Adriana Oliveira (adriana.oliveira@ufjf.edu.br) on 2017-11-09T14:24:20Z (GMT)
Made available in DSpace on 2017-11-09T14:24:20Z (GMT)
No. of bitstreams: 1
juliaaparecidagoncalvescampos.pdf: 6444524 bytes, checksum: f10ae45b44ceb730affe5711343a7b2d (MD5)
Previous issue date: 2017-07-07 / Within the scope of Cognitive Linguistics, the present work sought the key elements for understanding the phenomenon of signification and proposed its formalization in an ontological conceptual model that could support inference processes in the machine interpretation of the partitive relation. The main objective was to study the partitive semantic relation as manifested in the semantics of human languages. To that end, the study moves from the classical lexical perspective, limited to the expressions “is part of” or “has part” in constructed sentences such as “the handle is part of the cup” or “cup has the handle as part”, to real occurrences extracted from a corpus, as in the sentence “a asa da xícara quebrou” [the cup handle broke]. In the Frame Semantics approach, understanding a linguistic item requires understanding the frame that it evokes and the relations of this frame with other frames. In addition to describing the partitive relation within Frame Semantics, the objective was to formalize the partitive relation between frames in an ontological model and to propose a tool for recognizing the partitive relation in sentences. For this purpose, we combine the descriptive notation represented in FrameNet with information from WordNet and the SIMPLE ontology, lexical resources that assume the Generative Lexicon Theory (Pustejovsky, 1996). The creation of an ontological database (of entities and their parts), as a result of the effort of annotating the frames of a language, helped the recognition of the partitive relation in various sentences and promoted its readability by machines by making explicit the semantic types of the elements linked by the partitive relation.
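As a rough illustration of how a database of entities and their parts could support recognition of the partitive relation in a corpus sentence, consider the sketch below. It is not the tool proposed in the thesis: the `PART_OF` table, the `find_partitive` function, and the simple "X de/da/do Y" pattern are assumptions made for this example, and a real system would draw on frame annotation and semantic types rather than a regular expression.

```python
import re

# Hypothetical part-whole table: part -> set of wholes it can belong to.
PART_OF = {
    "asa": {"xícara", "avião"},
    "tampa": {"panela"},
}

# Matches Portuguese "<noun> de/da/do(s) <noun>" sequences, e.g. "asa da xícara".
GENITIVE = re.compile(r"(\w+)\s+d[aeo]s?\s+(\w+)")


def find_partitive(sentence):
    """Return the (part, whole) pairs in the sentence that the PART_OF table licenses."""
    found = []
    for part, whole in GENITIVE.findall(sentence.lower()):
        if whole in PART_OF.get(part, set()):
            found.append((part, whole))
    return found


print(find_partitive("A asa da xícara quebrou"))
```

The lookup table is what separates a partitive reading from other uses of the same construction: a sequence such as "tampa da xícara" matches the pattern but is rejected because the table does not list that pair.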
|