  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Sémantická informace ze sítě FrameNet a možnosti jejího využití pro česká data / Semantic information from FrameNet and the possibility of its transfer to Czech data

Limburská, Adéla January 2016
The thesis focuses on transferring FrameNet annotation from English to Czech and the possibilities of using the resulting data for automatic frame prediction in Czech. The first part, annotation transfer, has been performed in two ways. First, a parallel corpus of English sentences and their human created Czech translations (PCEDT) was used. Second, a much larger parallel corpus was created using machine translation of FrameNet example sentences. This corpus was then used to transfer the annotation as well. The resulting data were partially evaluated and some of the automatically detectable errors were filtered out. Subsequently, the data were used as an input for two machine learning methods, decision trees and support vector machines. Since neither of the machine learning experiments brought impressive results, further manual correction of the data annotation was performed, which helped increase the accuracy of the prediction. However, as the accuracy reported in related papers is notably higher, the thesis also discusses different approaches to feature selection and the possibility of further improvement of the prediction results using these methods.
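As an illustration of the annotation transfer described in this abstract, the following minimal sketch projects frame labels from English tokens onto aligned Czech tokens. It is not the thesis's implementation: the function name `project_labels`, the example sentence pair, the label strings, and the hand-written alignment are all assumptions made for the example. `Commerce_buy`, `Buyer`, and `Goods` are used only as familiar FrameNet names.

```python
from typing import Dict, List, Tuple


def project_labels(
    source_labels: Dict[int, str],
    alignment: List[Tuple[int, int]],
) -> Dict[int, str]:
    """Copy each labelled source token's label to the target token aligned with it."""
    projected: Dict[int, str] = {}
    for src_idx, tgt_idx in alignment:
        label = source_labels.get(src_idx)
        if label is not None:
            # If two labelled source tokens map to one target token, keep the first label.
            projected.setdefault(tgt_idx, label)
    return projected


# Hypothetical sentence pair; unlabelled tokens (here "a") are simply skipped.
english = ["John", "bought", "a", "car"]
czech = ["Jan", "koupil", "auto"]
source_labels = {0: "Buyer", 1: "TARGET:Commerce_buy", 3: "Goods"}
alignment = [(0, 0), (1, 1), (3, 2)]  # (English index, Czech index), assumed given

projected = project_labels(source_labels, alignment)
for idx, label in sorted(projected.items()):
    print(czech[idx], label)
```

In practice the alignment would come from a word aligner and would be noisy, which is one plausible source of the automatically detectable errors that the abstract says were filtered out.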
22

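Frame Shifts und Frame-Vergleichbarkeit bei Englisch-Deutscher Übersetzung am Beispiel einer Volltextannotation mit FrameNet / Frame shifts and frame comparability in English-German translation: the example of a full-text annotation with FrameNet

Triesch, Susanne 07 January 2019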
This master's thesis examines a full-text annotation, created by the author, of a German translation from English and compares the two text versions and their annotations. It centres on two questions: to what extent the frames formulated for English in FrameNet can be used for the full-text annotation of a German text, and which divergences exist between the English original and the German translation at the level of frames and their lexicalization. The thesis opens with an overview of the relevant theoretical background and the state of research in Frame Semantics, including its application in the FrameNet Project, in annotation, and in cross-linguistic research. The third chapter presents the study, including the material and methods used. The results are then presented and, in chapter five, evaluated with respect to the research questions. The thesis closes with conclusions on the cross-linguistic use of FrameNet and on full-text annotation, together with an outlook on further fields of research.
Contents: 1 Introduction; 2 Background and state of research; 2.1 Approaches in cognitive linguistics; 2.2 Frame Semantics; 2.3 FrameNet Project; 2.4 Frame-semantic annotation; 2.5 Cross-linguistic annotation and Frame Semantics; 3 Presentation of the study; 3.1 Text material; 3.2 Method; 3.2.1 Full-text annotation; 3.2.2 Comparison of the annotated text versions; 4 Results and discussion; 4.1 Full-text annotation of the German translation; 4.1.1 Transferability of FrameNet frames to the German text; 4.2 Comparison of the annotated text versions; 4.2.1 Frame shifts; 4.3 Evaluation of the methods; 5 Conclusion and outlook; 5.1 Full-text annotation; 5.2 Cross-linguistic use of FrameNet; 5.3 Connecting Frame Semantics and Construction Grammar; 5.4 Final conclusion; Bibliography
23

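Du terme prédicatif au cadre sémantique : méthodologie de compilation d'une ressource terminologique pour les termes arabes de l'informatique / From predicative term to semantic frame: a methodology for compiling a terminological resource for Arabic computing terms

Ghazzawi, Nizar 08 1900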
The description of terms in traditional terminological resources is limited to a few details, such as the term (usually a noun), its definition, and its equivalent in a foreign language. Such descriptions seldom include other information that can be very useful to users, especially those who consult a resource to deepen their knowledge of a specialized domain, to improve their professional writing, or to find contexts in which the term is realized. Useful information of this kind includes a description of the actantial structure of terms, contexts from authentic sources, and the inclusion of other parts of speech such as verbs. Verbs and deverbal nouns, or predicative terminological units (PTUs), are often ignored by traditional terminology, yet they are of great importance for expressing actions, processes, or events. Describing these units requires a model of terminological description that accounts for their particular features. Several terminologists (Condamines 1993, Mathieu-Colas 2002, Gross and Mathieu-Colas 2001, and L’Homme 2012, 2015) have proposed description models based on different theoretical frameworks. Our research proposes a methodology for the terminological description of PTUs in Arabic, specifically Modern Standard Arabic (MSA), based on Fillmore's theory of Frame Semantics (1976, 1977, 1982, 1985) and its application, the FrameNet project (Ruppenhofer et al. 2010). The specialized domain of interest is computing.
We rely on a corpus collected from the web and draw on an existing online terminological resource, the DiCoInfo (L’Homme 2008), to compile our own. Our objectives are as follows. First, we lay the foundations of an MSA version of this resource. This version has its own features: 1) we target specific units, namely verbal and deverbal PTUs; 2) the methodology developed for compiling the original DiCoInfo has to be adapted to a Semitic language. Second, we create a framed version of the resource, in which the PTUs are grouped into semantic frames following the FrameNet model. Since this part of the work has a multilingual scope, we add English and French PTUs to the resource. The methodology consists of automatically extracting verbal and nominal terminological units (VTUs and NTUs), such as Ham~ala (حمل) (to download) and taHmiyl (تحميل) (download, noun). To do this, we adapted an existing automatic extractor, TermoStat (Drouin 2004), to MSA. Then, using terminological validation criteria (L’Homme 2004), we validate the terminological status of some of the candidates. After validation, we create a terminological record with an XML editor for each VTU and NTU retained. These records include elements such as the actantial structure of the PTUs and up to 20 annotated contexts. The last step consists of creating semantic frames from the MSA PTUs. We also associate English and French PTUs with the frames created. This association resulted in a terminological resource called “DiCoInfo: A Framed Version”. In this resource, PTUs that share the same semantic properties and actantial structures are grouped into semantic frames. For example, the semantic frame Product_development groups PTUs such as Taw~ara (طور) (develop), to develop, and développer.
As a result, we obtained a total of 106 MSA PTUs compiled in the MSA version of the DiCoInfo and 57 semantic frames associated with these units in the framed version. Our research shows that MSA can be described using the methodology that we developed.
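The grouping of units into frames described in this abstract can be pictured with a small sketch. It only illustrates the idea, assuming each unit has already been assigned a frame by hand. The tuple layout, the language codes, and the function name `group_by_frame` are hypothetical, and the only data used is the Product_development example given in the abstract.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (unit, language code, frame) triples; the frame assignment is assumed to be manual.
units: List[Tuple[str, str, str]] = [
    ("Taw~ara", "ar", "Product_development"),
    ("to develop", "en", "Product_development"),
    ("développer", "fr", "Product_development"),
]


def group_by_frame(
    records: List[Tuple[str, str, str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Return {frame: {language: [units]}} so equivalents across languages sit together."""
    frames: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    for unit, lang, frame in records:
        frames[frame].setdefault(lang, []).append(unit)
    return dict(frames)


framed = group_by_frame(units)
for frame, by_lang in framed.items():
    print(frame, by_lang)
```

Keying on the frame rather than on the term is what gives the framed version its multilingual dimension: units from different languages are linked through the frame they evoke, not through pairwise equivalence.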
24

A semântica da emoção: um estudo contrastivo a partir da FrameNet e da roda das emoções / The semantics of emotion: a contrastive study based on FrameNet and the Wheel of Emotions

Foschiera, Silvia Matturro Panzardi 31 July 2012
The main objective of this research is to ascertain in which respects Frame Semantics (Fillmore, 1982; 1985) and the model called the Wheel of Emotions (Scherer, 2005) contribute to the relationship between language and the phenomenon of emotion, with regard to Portuguese and Spanish. Frame Semantics, a theoretical perspective linked to Cognitive Linguistics, underlies the semantic and syntactic analysis by means of an exploratory study of the FrameNet machinery (Fillmore et al., 2003). Based on this theoretical framework, we surveyed the frames and frame elements of verbs and adjectives that describe emotion, associating semantic and syntactic categories. We also examined the possibility of mapping the opinion holder and the opinion topic in a corpus of sentences from Twitter. The second theoretical perspective is related to Cognitive Psychology, through the Wheel of Emotions model. Considering the semantic features suggested by this tool, we observe the extent to which, with computational applications in mind, it enriches a Sentiment Analysis study. The Wheel of Emotions serves to identify the polarity of the opinions expressed through the adjectives in the sample sentences. The results show that both perspectives prove productive for computational applications in Sentiment Analysis.
25

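Copa 2014 FrameNet Brasil: diretrizes para a constituição de um dicionário eletrônico trilíngue a partir da análise de frames da experiência turística / Copa 2014 FrameNet Brasil: guidelines for building a trilingual electronic dictionary based on the analysis of tourist experience frames

Gamonal, Maucha Andrade 11 March 2013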
This dissertation is part of the subproject Copa 2014 FrameNet Brasil (SALOMÃO ET AL., 2011), an initiative of FrameNet Brasil in partnership with the FrameCorp project (CHISHMAN ET AL., 2008) and Berkeley FrameNet (FILLMORE ET AL., 2003), which proposes developing a trilingual electronic dictionary (Portuguese, English, Spanish) for the domains of the World Cup, Soccer, and Tourism. This resource differs from other electronic dictionaries in that it is structured according to the theory of Frame Semantics (FILLMORE, 1982, 1985; PETRUCK, 1996) and the methodology of FrameNet (FILLMORE ET AL., 2003, 2003a; RUPPENHOFER ET AL., 2010). The contribution of this research is to establish guidelines for structuring the dictionary through the modeling of tourist experience frames. To that end, three questions guide the work: i) To what extent do tourism-domain frames modeled on corpora compiled from Brazilian Portuguese serve to represent Tourism frames for the other languages of the dictionary? ii) How does FrameNet respond to the challenges posed by structuring multilingual lexical resources, and is it possible to use frames as an interlingua? iii) What evaluation can be made of the Kicktionary, a multilingual dictionary of football, as a product that uses both FrameNet and WordNet (MILLER, 1993, 1995; FELLBAUM, 1998)?
The answers indicated that: i) frames of the tourism domain are modeled in the same way by different cultures; ii) FrameNet needs to adapt to the specificities imposed by multilingual lexicography, but, owing to the transcultural nature of Tourism and also of the World Cup and Soccer, frames can be used as an interlingua; iii) the Kicktionary, as a specialized multilingual dictionary that uses frames and synsets, should review the role of each theory in structuring its databases. Intralinguistic relations could be handled via WordNet, and interlinguistic relations via FrameNet. Funded by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior).
26

Composition sémantique pour la langue orale / Semantic composition for spoken language understanding

Duvert, Frédéric 10 November 2010
This thesis proposes systems for detecting and composing semantic constituents and for interpretation in spoken natural language understanding. Understanding relies on an automatic speech recognition system that converts spoken signals into utterances usable by the machine. The transcribed speech signal contains errors introduced by recognition (noise, interference, poor pronunciation...). Interpreting such an utterance is all the more difficult because it comes from spoken discourse, which is subject to disfluencies, self-corrections... The utterance is moreover ungrammatical, because spoken discourse itself is ungrammatical. Applying grammatical analysis methods to speech transcripts does not produce good interpretation results, so deep syntactic analysis is to be avoided and a shallow analysis is considered instead. One of the first objectives is to propose a representation of meaning. This involves considering ontologies in order to conceptualize the world being described. Semantic components can be expressed in first-order logic with predicates. In the work described here, we represent semantic elements by frames (FrameNet). Frame structures are hierarchical; they are fragments of knowledge into which other fragments of knowledge can be inserted, merged, or inferred. Frame structures can be converted into logical formulas. We propose a speech understanding system based on logical rules, supported by an ontology, in order to create links between semantic components. We then carried out a study on discovering the syntactic supports of semantic relations. We propose a semantic composition experiment to enrich the basic semantic components. Finally, we present a lambda-expression detection system for hypothesizing the relations to be found across the discourse.
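The statement that frame structures can be turned into logical formulas can be made concrete with a small sketch. The encoding below (one existentially quantified event variable, one predicate per frame, one binary predicate per role) is a common textbook choice and is an assumption here, not the formalism used in the thesis. The function name `frame_to_formula` and the example frame instance are likewise hypothetical.

```python
from typing import Dict


def frame_to_formula(frame: str, roles: Dict[str, str], event_var: str = "e") -> str:
    """Render a frame instance as a first-order conjunction over one event variable."""
    atoms = [f"{frame}({event_var})"]
    # One binary predicate per frame element, linking the event to its filler.
    atoms += [f"{role}({event_var}, {filler})" for role, filler in roles.items()]
    return f"exists {event_var}. " + " & ".join(atoms)


formula = frame_to_formula("Commerce_buy", {"Buyer": "john", "Goods": "car"})
print(formula)
# prints: exists e. Commerce_buy(e) & Buyer(e, john) & Goods(e, car)
```

Because each role becomes its own conjunct, a role that was missed in a noisy transcript can simply be omitted, and a role filler can itself be the event variable of another frame, which is one simple way to picture composition.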
27

Frame semantics for the field of climate change: discovering frames based on Chinese and English terms

Zheng, Ying 12 1900
Most existing Mandarin Chinese specialised dictionaries of environmental terms are paper dictionaries, compiled and revised more than ten years ago, and contain mainly noun terms. Terminological information is restricted to the knowledge conveyed by the term and its English equivalent(s). For readers who want to learn about the semantic or syntactic properties of terms, or to see how terms are used in real contexts from specialised texts, the information provided in existing dictionaries is insufficient. In this research, we compiled an online Mandarin Chinese terminological resource describing Chinese verb terms in the field of climate change. This resource makes up for some of the deficiencies of existing Chinese environmental dictionaries by revealing the meaning(s) of a term through its actantial structure(s) and by showing, through annotated contexts, the semantic and syntactic properties of the term as well as its practical usage in specialised texts. The resource is intended to better meet the needs of its audience.
The theoretical basis underpinning this research is Frame Semantics (Fillmore, 1976, 1977, 1982, 1985; Fillmore & Atkins, 1992) and the FrameNet built from it. The main objective is to discover and define Chinese semantic frames in the field of climate change and to establish relations between the frames defined. The Chinese semantic frames are discovered with the help of the methodology of the multilingual environmental dictionary DiCoEnviro (and its accompanying resource, Framed DiCoEnviro) (L’Homme, 2018; L’Homme et al., 2020). To make this methodology applicable to Chinese, a Sino-Tibetan language, we modified and adapted it to suit the description of Chinese terms and the definition of Chinese semantic frames. Some of the changes and adaptations are based on the Chinese FrameNet (CFN) (Liu & You, 2015).
To discover Chinese semantic frames, a monolingual Mandarin (Chinese) Climate Change Corpus (MCCC) was first compiled. This corpus contains 224 authentic Chinese specialised texts in the field of climate change, totaling 1,228,333 Chinese characters, or 547,592 Chinese words. Candidate terms were then automatically extracted from the MCCC using the corpus management and analysis software Sketch Engine. Manual analysis and validation determined which candidate terms are true terms. Subsequently, the actantial structure of each term was written by analysing the contexts in which the term occurs. Next, each sense of a polysemous term was placed in a separate entry, and 16-20 contexts were selected for each entry. Each context was then annotated on three layers: semantic structure, syntactic function, and syntactic group. After this, the terms were classified according to the scenarios they evoke. Terms that depict the same scene or situation in the field of climate change, have a similar actantial structure, and share the majority of circumstants are categorised into one semantic frame (criteria based on the DiCoEnviro project (L’Homme, 2018; L’Homme et al., 2020)). Once the Chinese semantic frames had been identified, each frame was defined. Finally, the discovered Chinese frames were linked according to the eight types of frame relations proposed by Ruppenhofer et al. (2016). To be displayed online, term entries and semantic frames were encoded in XML files.
Guided by this methodology, we eventually discovered and defined 23 Chinese semantic frames. The end result of this research is a frame-based Mandarin Chinese terminological resource specialised in the field of climate change. The resource consists of two parts. The first part is the description of a total of 39 Chinese verb terms. With each meaning of a polysemous verb term placed in a separate entry, there are 59 entries in total (each entry contains the actantial structure and annotated contexts). A total of 1,027 contexts were annotated. The second part presents the 23 Chinese semantic frames identified as well as the relations between frames.
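The figures reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract; the variable names are illustrative, and the check is a reader's sanity test rather than anything taken from the thesis.

```python
# Figures as reported in the abstract.
entries = 59
contexts_per_entry_min, contexts_per_entry_max = 16, 20
annotated_contexts = 1027
characters, words = 1_228_333, 547_592

# With 16-20 contexts per entry, the total must lie in this range.
lowest_total = entries * contexts_per_entry_min    # 944
highest_total = entries * contexts_per_entry_max   # 1180
is_consistent = lowest_total <= annotated_contexts <= highest_total

mean_contexts = annotated_contexts / entries
chars_per_word = characters / words

print(f"expected range: {lowest_total}-{highest_total}, consistent: {is_consistent}")
print(f"mean contexts per entry: {mean_contexts:.1f}")
print(f"mean characters per word: {chars_per_word:.2f}")
```

The reported total of 1,027 annotated contexts falls inside the range implied by 59 entries with 16-20 contexts each, at a little over 17 contexts per entry on average.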
28

Jogada de letra: um estudo sobre colocações à luz da semântica de frames

Souza, Diego Spader de 30 March 2015 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This thesis discusses the relation between the linguistic phenomenon of collocations and the concepts of Frame Semantics (FILLMORE, 1982; 1985). The study arose in the context of two research projects developed by the SemanTec group: Field – Dicionário de Expressões do Futebol (Football Expressions Dictionary) (CHISHMAN, 2014), already available on the web, and the Dicionário Eletrônico Modalidades Olímpicas (Olympic Modalities Electronic Dictionary) (CHISHMAN, 2014), still at an early stage. Both dictionaries are organized around the notion of semantic frame proposed by Fillmore (1982; 1985), and the thesis seeks to show how this concept and the concepts surrounding it bear on the lexicographic treatment of collocations. The literature review, presented in chapters 2 and 3, discusses the theoretical basis for the study of collocations and of Frame Semantics. The research method consists of the analysis of 74 collocations from the language of football. These structures were chosen after a study of 500 lexical combinations extracted with the Sketch Engine software from a Brazilian Portuguese corpus of football discourse. The analysis of the 74 collocations proceeds in two phases: the first investigates the quantitative aspects of the data set and the structural characteristics of football collocations; the second focuses on the relation between these combinations and the theoretical assumptions of Frame Semantics and its computational counterpart, FrameNet, in order to see how this theoretical framework supports the treatment of collocations in lexicographic contexts. Among the main results of the first phase is the fact that most football collocations are verbal structures, such as fazer gol (score a goal) and mandar bola (send the ball), which shows that sports language is marked by the dynamics of the actions and events that take place during a match. It was also possible to observe that nominal collocations are strongly connected to the materials, participants and places of the football context. The second phase showed that collocations, within frame-based dictionaries, act as lexical units, a concept that comes from FrameNet. As lexical units, collocations are frame evokers, which characterizes them as terms that must appear in the main list of entries. It was also noted, however, that frame evocation by collocations often does not follow the traditional FrameNet model, especially in the case of nominal collocations, which evoke not events but static entities, such as cartão vermelho (red card) and tabela de classificação (league table). The thesis demonstrates the relevance of Frame Semantics and FrameNet for the study of complex units such as collocations in lexicographic contexts. Another aspect worth mentioning is the importance of the methodological resources of Corpus Linguistics for the area in which the study is situated.
29

Composition sémantique pour la langue orale

Duvert, Frédéric 10 November 2010 (has links) (PDF)
The aim of this thesis is to propose systems for detecting and composing semantic constituents, and for interpretation, in spoken natural language understanding. This understanding relies on an automatic speech recognition system that converts spoken signals into utterances the machine can use. The transcribed speech signal contains errors introduced by recognition (noise, interference, poor pronunciation...). Interpreting the utterance is all the harder because it comes from spoken discourse, which is subject to disfluencies, self-corrections and so on. The utterance is moreover ungrammatical, because spoken discourse is itself ungrammatical. Grammatical analysis methods do not yield good interpretation results on texts obtained from speech transcription, and deep syntactic parsing is to be avoided; a shallow analysis is therefore envisaged. One of the first objectives is to propose a representation of meaning. This involves using ontologies to conceptualize the world being described. Semantic components can be expressed in first-order logic with predicates. In the work described here, semantic elements are represented by frames (FrameNet). Frame structures are hierarchical; they are fragments of knowledge into which other fragments of knowledge can be inserted, merged or inferred. Frame structures can be converted into logical formulas. We propose a speech understanding system based on logical rules and supported by an ontology, so that links can be created from semantic components. We then carried out a study on discovering the syntactic supports of semantic relations. We propose a semantic composition experiment to enrich the basic semantic components. 
Finally, we present a system for detecting lambda-expressions in order to hypothesize the relations to be found across the discourse.
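The abstract states that frame structures can be converted into logical formulas. A minimal sketch of one possible conversion follows, assuming an event-variable style in which the frame and each of its roles become predicates over a shared variable. The function name, the output notation, and the example frame and role fillers are illustrative assumptions, not the thesis's actual representation.

```python
def frame_to_formula(frame: str, roles: dict, var: str = "e") -> str:
    """Render a frame instance as an existentially quantified conjunction.

    The frame name becomes a unary predicate over the event variable, and
    each role becomes a binary predicate linking that variable to its filler.
    """
    atoms = [f"{frame}({var})"]
    atoms += [f"{role}({var}, {filler})" for role, filler in roles.items()]
    return f"exists {var}. " + " & ".join(atoms)


# Hypothetical frame instance; the roles and fillers are made up for illustration.
formula = frame_to_formula("Request", {"Speaker": "user", "Message": "booking"})
print(formula)
```

With the inputs shown, the function returns `exists e. Request(e) & Speaker(e, user) & Message(e, booking)`.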
30

'Consider' and its Swedish equivalents in relation to machine translation

Andersson, Karin January 2007 (has links)
This study describes the English verb 'consider' and the characteristics of some of its senses. Such an investigation may be useful, since the machine translation program SYSTRAN has invariably translated 'consider' with the Swedish verbs 'betrakta' (Eng: 'view', 'regard') and 'anse' (Eng: 'regard'). This handling of 'consider' is not satisfactory in all contexts. Since 'consider' is a cogitative verb, it is worth noting that both the theory of semantic primes and universals and conceptual semantics are concerned with cogitation in various ways. Anna Wierzbicka, one of the advocates of semantic primes and universals, argues that THINK should be regarded as a semantic prime. Moreover, one of the central issues of conceptual semantics is to describe how thoughts are constructed by means of, for example, linguistic components, perception and experience. To define and clarify the distinctions between the different senses, we have drawn on the theory of mental spaces. The thesis is structured according to the meanings of 'consider' listed in WordNet. Accordingly, the senses of 'consider' have been organized into the following groups: 'Observation'; 'Opinion', with its sub-group 'Likelihood'; and 'Cogitation', with its sub-group 'Attention/Consideration'. A concordance tool, http://www.nla.se/culler, provided 90 literary quotations, which were collected in a corpus. These quotations were then distributed among the groups mentioned above and translated into Swedish by SYSTRAN. The meanings of 'consider' have also been related to the senses recorded by the FrameNet scholars, where 'consider' is regarded as a verb of 'Cogitation' and 'Categorization'. The study shows that certain senses are connected to specific syntactic constructions. 
In other cases, however, the distinctions between various meanings can only be explained in semantic terms. In conclusion, implementation is likely to be easier when a specific syntactic construction can be tied to a particular sense, which may be the case for some meanings of 'consider'. Machine translation is presumably a much more laborious task when sense selection is governed solely by semantic conditions.
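The idea of tying a syntactic construction to a sense can be sketched as a small rule table. The sense labels come from the groups named in the abstract; the three constructions, their regular expressions, and their pairing with those labels are assumptions made for illustration, not findings of the thesis. Sentences that match no rule return `None`, standing in for the cases that need semantic disambiguation.

```python
import re

VERB = r"\bconsider(?:s|ed|ing)?"

# (pattern, sense group) pairs, checked in order; the pairings are hypothetical.
RULES = [
    (re.compile(VERB + r"\s+\w+ing\b", re.IGNORECASE), "Cogitation"),           # consider + V-ing
    (re.compile(VERB + r"\s+that\b", re.IGNORECASE), "Opinion"),                # consider + that-clause
    (re.compile(VERB + r"\s+(?:\w+\s+){1,3}to\s+be\b", re.IGNORECASE), "Opinion"),  # consider + NP + to be
]


def guess_sense(sentence: str):
    """Return the sense group of the first matching construction, else None."""
    for pattern, sense in RULES:
        if pattern.search(sentence):
            return sense
    return None  # no syntactic cue; semantic analysis would be needed


examples = [
    "She considered moving abroad.",
    "They consider him to be a genius.",
    "He considered the painting.",
]
guesses = [guess_sense(s) for s in examples]
```

The patterns are deliberately crude (the first would also fire on any following word that merely ends in "ing"), so a real system would rely on a parser rather than surface regular expressions.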
