  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Elaboração textual via definição de entidades mencionadas e de perguntas relacionadas aos verbos em textos simplificados do português / Text elaboration through named entity definitions and questions related to verbs in simplified Portuguese texts

Amancio, Marcelo Adriano 15 June 2011
This research addresses Text Elaboration for low-literacy readers, i.e. people at the rudimentary and basic literacy levels according to the National Indicator of Functional Literacy (INAF, 2009). Text Elaboration is a set of techniques that add redundant material to a text, traditionally definitions, synonyms, antonyms, or any other external information, to assist comprehension. The goal of this master's project was to propose two original methods of text elaboration: (1) short definitions of the named entities that appear in a text and (2) wh-questions directed at the verbs of its clauses. The first method used Rembrandt, a named entity recognition system from the literature, together with short definitions from Wikipedia, and was incorporated into the FACILITA EDUCATIVO web system, one of the tools developed in the PorSimples project. It was evaluated in a preliminary study with a small group of low-literacy readers; the results were positive, indicating that the aid made reading easier for the participants. Generating wh-questions for the verbs of a clause is a new task that was defined, studied, implemented, and evaluated in this research. That evaluation was conducted not with the target audience but with natural language processing specialists, who assessed the method positively and identified which errors degrade the quality of the automatically generated questions. There is good evidence that the elaboration methods developed here can help improve reading comprehension for low-literacy readers.
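The first method described above (attaching a short definition to each named entity) can be pictured with a minimal sketch. It rests on loud assumptions: the `DEFINITIONS` dictionary and the `elaborate` function are hypothetical stand-ins invented for illustration, and neither Rembrandt nor Wikipedia is called. It is not the implementation used in the thesis.

```python
import re

# Hypothetical lookup table standing in for entity recognition plus
# encyclopedia definitions; the thesis used Rembrandt and Wikipedia instead.
DEFINITIONS = {
    "Brasília": "the capital city of Brazil",
    "INAF": "a Brazilian indicator of functional literacy",
}


def elaborate(text, definitions):
    """Append a parenthetical definition after the first mention of each known entity."""
    for entity, definition in definitions.items():
        pattern = r"\b" + re.escape(entity) + r"\b"
        # A function is used as the replacement so that backslashes in a
        # definition are never interpreted as regex escapes.
        text = re.sub(
            pattern,
            lambda match, d=definition: f"{match.group(0)} ({d})",
            text,
            count=1,
        )
    return text


sample = "The survey was published by INAF in Brasília."
elaborated = elaborate(sample, DEFINITIONS)
print(elaborated)
```

Only the first mention of each entity is elaborated, on the assumption that repeating a definition would add clutter rather than help the reader.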
23

Utilisation de représentations de mots pour l’étiquetage de rôles sémantiques suivant FrameNet / Using word representations for FrameNet semantic role labelling

Léchelle, William 01 1900
In Fillmore's Frame Semantics (Fillmore 1976), words take their meaning from the event or situation in which they occur, that is, from the semantic frame they play a role in. FrameNet, a lexical resource for English, defines about 1000 semantic frames, covering most possible contexts. Within a frame, a predicate calls for arguments to fill the semantic roles associated with the frame (for example: Victim, Manner, Recipient, Speaker). The task is to label these argument roles automatically, given their position, the frame, and the predicate (a task sometimes referred to as semantic role labelling). To do so, a machine learning algorithm is trained on arguments whose role is known, in order to generalise to arguments whose role is unknown. A maximum entropy classifier using common features of the arguments serves as a strong baseline to be improved upon. The work draws in particular on the semantic proximity of the words most representative of each argument, using distributed (vector) representations of words to improve generalisation over the few training examples available for each frame.
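To make the idea of generalising through word vectors concrete, here is a minimal sketch under stated assumptions. The three-dimensional vectors and the training pairs are invented for illustration, and the nearest-centroid rule is a deliberately simple substitute: it is neither the maximum entropy baseline nor the model developed in the thesis.

```python
import math

# Toy word vectors, invented for illustration only.
VECTORS = {
    "doctor": [0.9, 0.1, 0.0],
    "teacher": [0.8, 0.2, 0.1],
    "nurse": [0.85, 0.15, 0.05],
    "quickly": [0.0, 0.9, 0.2],
    "slowly": [0.1, 0.8, 0.3],
    "gently": [0.05, 0.85, 0.25],
}

# Hypothetical training data: argument head words whose role is known.
TRAINING = [
    ("doctor", "Speaker"),
    ("teacher", "Speaker"),
    ("quickly", "Manner"),
    ("slowly", "Manner"),
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def role_centroids(training):
    """Average the vectors of the training words for each role."""
    sums, counts = {}, {}
    for word, role in training:
        vec = VECTORS[word]
        acc = sums.setdefault(role, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[role] = counts.get(role, 0) + 1
    return {role: [x / counts[role] for x in acc] for role, acc in sums.items()}


def predict(word, centroids):
    """Assign the role whose centroid is most similar to the word's vector."""
    return max(centroids, key=lambda role: cosine(VECTORS[word], centroids[role]))


CENTROIDS = role_centroids(TRAINING)
print(predict("nurse", CENTROIDS))
print(predict("gently", CENTROIDS))
```

Neither "nurse" nor "gently" appears in the training pairs; they receive a role only because their vectors lie close to those of seen words, which is the kind of generalisation the abstract attributes to word representations.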
24

利用馬可夫邏輯網路模型與自動化生成的模板加強生醫文獻之語意角色標註 / Biomedical semantic role labeling with a Markov Logic network and automatically generated patterns

賴柏廷 Unknown Date
Background: Biomedical semantic role labeling (SRL) is a natural language processing technique that expresses sentences describing biological processes as predicate-argument structures (PASs). SRL often suffers from an imbalance among argument types, and learning the dependencies between arguments consumes considerable time and memory. Method: We constructed a Markov Logic Network (MLN)-based SRL system that uses automatically generated SRL patterns to simultaneously recognize constituents and candidate semantic roles. Results and conclusions: Our method was evaluated on the BioProp corpus. The experimental results show that it outperforms the state-of-the-art system. Furthermore, applying SRL patterns greatly reduces the time and memory costs, and the automatically generated patterns help in developing further patterns. We believe the method can be improved further by adding new SRL patterns, such as patterns written manually by biologists, and that it offers a possible solution for processing large SRL corpora.
25

Verbittömät tapahtumanilmaukset: suunnannäyttäjinä LÄHDE- ja KOHDE-konstruktio / Verbless expressions of events: the SOURCE and GOAL constructions as signposts

Västi, K. (Katja) 28 November 2012
My thesis discusses the semantic and syntactic properties of Finnish verbless expressions of events in the light of two examples, the SOURCE construction (e.g. Oikeusasiamieheltä huomautus, lit. ‘From the ombudsman a complaint’) and the GOAL construction (e.g. Pakistanille rangaistus, lit. ‘For Pakistan a punishment’). I show that they express dynamic events in a way comparable to finite constructions. However, they are not elliptical structures but genuinely verbless constructions. The study is founded on Cognitive Construction Grammar and frame semantics, and the analysis is based on two types of data and two methods. First, I collected three sets of 500 newspaper headlines, mostly from the Finnish Language Bank but partly from other sources (the SOURCE construction, the GOAL construction, and several different verbless constructions), and analyzed them through intuitive semantic classification. Second, I collected paraphrases of 15 instances of the SOURCE construction and 20 instances of the GOAL construction in order to learn how other native speakers of Finnish construe the data I had analyzed intuitively. This experimental semantic test yielded 169–215 paraphrases per instance of the SOURCE construction and 133–165 paraphrases per instance of the GOAL construction. I then analyzed these paraphrases both semantically and syntactically.
The thesis comprises four articles. In the article A case in search of an independent life: The semantics of the initial allative in a Finnish verbless construction, I show that the GOAL construction is polysemous and define eight senses for it. With the help of these senses, I also justify analyzing the construction as an independent argument structure construction and not as an elliptical structure. The article Elävä LÄHDE: Alkuasemaisen ablatiivin merkitystyypit verbittömässä konstruktiossa is a similar treatment of the SOURCE construction, for which I define four senses. In the article Mihin verbittömien konstruktioiden merkitystyypit perustuvat? Skemaattiset ja polyseemiset tapahtumanilmaukset, I connect the previous results explicitly with my theoretical framework. I explain why these constructions express events and are polysemous even though they contain no verb, the element with which both properties of argument structure constructions are usually associated. In the joint article Semantic roles and verbless constructions: A Finnish challenge for verb-centered approaches, written with Seppo Kittilä, the discussion is extended to other Finnish verbless constructions. The article provides a more theoretical perspective on the topic: we link the dynamic meanings of the constructions to the concept of semantic roles and argue that the concept should be divided into argument roles and participant roles.
26

Предикати перцепције у руском и српском језику / Predikati percepcije u ruskom i srpskom jeziku / Predicates of Perception in Russian and Serbian

Popović Dragana 23 June 2016
This dissertation focuses on systemic relationships among the basic predicates (verbs) of perception in Russian and Serbian. It investigates issues related to the lexicon, the classification of linguistic units, the definition of lexemes, and the interdependence between the meanings of lexemes and their morphological and syntactic features. The basic predicates of perception in Russian and Serbian are positioned within semantic paradigms based on the interaction of the differential and shared components of meaning of their members. The members of the paradigms are selected using criteria established in accordance with the principle that lexical systems are organized into a core and a periphery. Positioning the selected representatives of visual, auditory, olfactory, gustatory, and tactile perception, as well as their aspectual correlates, determines the structure of the paradigms and the directions of semantic derivation within them.
27

BERTie Bott’s Every Flavor Labels: A Tasty Guide to Developing a Semantic Role Labeling Model for Galician

Bruton, Micaella January 2023
For the vast majority of languages, Natural Language Processing (NLP) tools are either entirely absent or leave much to be desired in their final performance. Galician is one such low-resource language, despite having nearly 4 million speakers. In an effort to expand the available NLP resources, this project set out to construct a dataset for Semantic Role Labeling (SRL) and to produce a baseline for future research to use in comparisons. SRL is a task that has been shown to improve the final output of various NLP systems, including Machine Translation and other interactive language models. The project achieved this aim, producing 24 SRL models and two SRL datasets, one Galician and one Spanish. mBERT and XLM-R were chosen as the baseline architectures; additional models were first pre-trained on the SRL task in a language other than the target to measure the effects of transfer learning. Scores are reported on a scale of 0.0 to 1.0. The best-performing Galician SRL model achieved an F1 score of 0.74, establishing a baseline for future Galician SRL systems. The best-performing Spanish SRL model achieved an F1 score of 0.83, outperforming the baseline set by the 2009 CoNLL Shared Task by 0.025. A pre-processing method, verbal indexing, was also introduced, which improved SRL parsing of highly complex sentences; the effect was strongest when the model was both pre-trained and fine-tuned on datasets using the method, but remained visible when the method was used only during fine-tuning.
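For readers unfamiliar with the metric, the F1 scores quoted in this abstract are the harmonic mean of precision and recall. The sketch below shows that standard formula; the counts passed to it are hypothetical and chosen only so the result equals 0.74, and they are not taken from the thesis. The last lines work out the baseline implied by the reported figures (0.83 minus the 0.025 margin).

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, on a 0.0 to 1.0 scale."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision and recall are both 74/100.
example_f1 = f1_score(true_positives=74, false_positives=26, false_negatives=26)
print(round(example_f1, 2))

# Baseline implied by the abstract: the Spanish model's 0.83 minus the 0.025 margin.
implied_baseline = round(0.83 - 0.025, 3)
print(implied_baseline)
```

Because F1 is a harmonic mean, a model cannot reach a high score by trading recall for precision or the reverse; both must be high.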
