31

Resolução de correferência em múltiplos documentos utilizando aprendizado não supervisionado / Co-reference resolution in multiple documents through unsupervised learning

Jefferson Fontinele da Silva 05 May 2011 (has links)
Um dos problemas encontrados em sistemas de Processamento de Línguas Naturais (PLN) é a dificuldade de se identificar que elementos textuais referem-se à mesma entidade. Esse fenômeno, no qual o conjunto de elementos textuais remete a uma mesma entidade, é denominado de correferência. Sistemas de resolução de correferência podem melhorar o desempenho de diversas aplicações do PLN, como: sumarização, extração de informação, sistemas de perguntas e respostas. Recentemente, pesquisas em PLN têm explorado a possibilidade de identificar os elementos correferentes em múltiplos documentos. Neste contexto, este trabalho tem como foco o desenvolvimento de um método de aprendizado não supervisionado para resolução de correferência em múltiplos documentos, utilizando como língua-alvo o português. Não se conhece, até o momento, nenhum sistema com essa finalidade para o português. Os resultados dos experimentos feitos com o sistema sugerem que o método desenvolvido é superior a métodos baseados em concordância de cadeias de caracteres. / One of the problems found in Natural Language Processing (NLP) systems is the difficulty of identifying textual elements that refer to the same entity. This phenomenon, in which a set of textual elements refers to a single entity, is called coreference. Coreference resolution systems can improve the performance of various NLP applications, such as automatic summarization, information extraction, and question answering. Recently, research in NLP has explored the possibility of identifying coreferent elements across multiple documents. In this context, this work focuses on the development of an unsupervised method for coreference resolution in multiple documents, using Portuguese as the target language. To date, no system with this purpose is known to exist for Portuguese. The results of the experiments with the system suggest that the developed method is superior to methods based on string matching.
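For readers unfamiliar with the baseline this abstract compares against, a minimal sketch of string-match coreference follows; the mention format and the normalization step are illustrative assumptions, not details taken from the thesis.

```python
from collections import defaultdict

def string_match_coreference(mentions):
    """Group mentions into chains by normalized string equality.

    `mentions` is a list of (mention_id, surface_form) pairs; the
    normalization (lowercasing, dropping a few determiners) is an
    illustrative assumption."""
    def normalize(text):
        determiners = {"o", "a", "os", "as", "um", "uma", "the"}
        return " ".join(t for t in text.lower().split() if t not in determiners)

    chains = defaultdict(list)
    for mention_id, surface in mentions:
        chains[normalize(surface)].append(mention_id)
    # Only groups with two or more mentions count as coreference chains.
    return [ids for ids in chains.values() if len(ids) > 1]

# Three mentions of the same person spread over two documents.
mentions = [("d1-m1", "o presidente Lula"),
            ("d1-m2", "Lula"),
            ("d2-m1", "O presidente Lula")]
print(string_match_coreference(mentions))
# -> [['d1-m1', 'd2-m1']]; the bare "Lula" is missed, which is the kind of
#    weakness an unsupervised method aims to overcome.
```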
32

Populando ontologias através de informações em HTML - o caso do currículo lattes / Populating ontologies using HTML information - the currículo lattes case

André Casado Castaño 06 May 2008 (has links)
A Plataforma Lattes é, hoje, a principal base de currículos dos pesquisadores brasileiros. Os currículos da Plataforma Lattes armazenam de forma padronizada dados profissionais, acadêmicos, de produções bibliográficas e outras informações dos pesquisadores. Através de uma base de Currículos Lattes, podem ser gerados vários tipos de relatórios consolidados. As ferramentas existentes da Plataforma Lattes não são capazes de detectar alguns problemas que aparecem na geração dos relatórios consolidados como duplicidades de citações ou produções bibliográficas classificadas de maneiras distintas por cada autor, gerando um número total de publicações errado. Esse problema faz com que os relatórios gerados necessitem ser revistos pelos pesquisadores e essas falhas deste processo são a principal inspiração deste projeto. Neste trabalho, utilizamos como fonte de informações currículos da Plataforma Lattes para popular uma ontologia e utilizá-la principalmente como uma base de dados a ser consultada para geração de relatórios. Analisamos todo o processo de extração de informações a partir de arquivos HTML e seu posterior processamento para inseri-las corretamente dentro da ontologia, de acordo com sua semântica. Com a ontologia corretamente populada, mostramos também algumas consultas que podem ser realizadas e fazemos uma análise dos métodos e abordagens utilizadas em todo processo, comentando seus pontos fracos e fortes, visando detalhar todas as dificuldades existentes no processo de população (instanciação) automática de uma ontologia. / The Lattes Platform is today the main database of Brazilian researchers' résumés. It stores, in a standardized form, professional and academic data, bibliographic production records, and other information about these researchers. From a database of Lattes résumés, several types of consolidated reports can be generated. The tools available for the Lattes Platform are unable to detect some of the problems that emerge when generating consolidated reports, such as duplicate citations or bibliographic items classified differently by each author, which leads to an incorrect total number of publications. Because of this problem the generated reports must be revised by the researchers, and the flaws of this process are the main inspiration for this project. In this work we use résumés from the Lattes Platform as the source of information for populating an ontology, which is then used mainly as a database to be queried for report generation. We analyze the whole process of extracting information from HTML files and the subsequent processing needed to insert it correctly into the ontology, according to its semantics. With the ontology correctly populated, we also show some queries that can be performed and analyze the methods and approaches used throughout the process, highlighting their strengths and weaknesses and detailing the difficulties involved in automatically populating (instantiating) an ontology.
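The extraction-and-population pipeline described above can be pictured roughly as follows; this is a hedged sketch using BeautifulSoup and rdflib, and the CSS class, URIs, and property names are hypothetical, not those of the actual Currículo Lattes pages or ontology.

```python
from bs4 import BeautifulSoup
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

LATTES = Namespace("http://example.org/lattes-ontology#")  # hypothetical namespace

def populate_from_html(html, graph):
    """Turn publication entries found in a resume page into ontology instances.
    The CSS class 'artigo-completo' and the properties are assumptions."""
    soup = BeautifulSoup(html, "html.parser")
    for i, item in enumerate(soup.select("div.artigo-completo")):
        paper = URIRef(f"http://example.org/publication/{i}")
        graph.add((paper, RDF.type, LATTES.JournalArticle))
        graph.add((paper, LATTES.citationText, Literal(item.get_text(strip=True))))
    return graph

html = "<div class='artigo-completo'>SILVA, J. Um artigo de exemplo. Revista X, 2007.</div>"
g = populate_from_html(html, Graph())
for subject, predicate, obj in g:
    print(subject, predicate, obj)
```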
33

Event Centric Approaches in Natural Language Processing / 自然言語処理におけるイベント中心のアプローチ

Huang, Yin Jou 26 July 2021 (has links)
Kyoto University / New-system doctoral program / Doctor of Informatics / 甲第23438号 / 情博第768号 / 新制||情||131 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Sadao Kurohashi, Prof. Tatsuya Kawahara, Prof. Takayuki Ito / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
34

Extracting Clinical Event Timelines : Temporal Information Extraction and Coreference Resolution in Electronic Health Records / Création de Chronologies d'Événements Médicaux : Extraction d'Informations Temporelles et Résolution de la Coréférence dans les Dossiers Patients Électroniques

Tourille, Julien 18 December 2018 (has links)
Les dossiers patients électroniques contiennent des informations importantes pour la santé publique. La majeure partie de ces informations est contenue dans des documents rédigés en langue naturelle. Bien que le texte soit pertinent pour décrire des concepts médicaux complexes, il est difficile d'utiliser cette source de données pour l'aide à la décision, la recherche clinique ou l'analyse statistique. Parmi toutes les informations cliniques intéressantes présentes dans ces dossiers, la chronologie médicale du patient est l'une des plus importantes. Être capable d'extraire automatiquement cette chronologie permettrait d'acquérir une meilleure connaissance de certains phénomènes cliniques tels que la progression des maladies et les effets à long terme des médicaments. De plus, cela permettrait d'améliorer la qualité des systèmes de question-réponse et de prédiction de résultats cliniques. Par ailleurs, accéder aux chronologies médicales est nécessaire pour évaluer la qualité du parcours de soins en le comparant aux recommandations officielles et pour mettre en lumière les étapes de ce parcours auxquelles une attention particulière doit être portée. Dans notre thèse, nous nous concentrons sur la création de ces chronologies médicales en abordant deux questions connexes en traitement automatique des langues : l'extraction d'informations temporelles et la résolution de la coréférence dans des documents cliniques. Concernant l'extraction d'informations temporelles, nous présentons une approche générique pour l'extraction de relations temporelles basée sur des traits catégoriels. Cette approche peut être appliquée sur des documents écrits en anglais ou en français. Puis, nous décrivons une approche neuronale pour l'extraction d'informations temporelles qui inclut des traits catégoriels. La deuxième partie de notre thèse porte sur la résolution de la coréférence. Nous décrivons une approche neuronale pour la résolution de la coréférence dans les documents cliniques. Nous menons une étude empirique visant à mesurer l'effet de différents composants neuronaux, tels que les mécanismes d'attention ou les représentations au niveau des caractères, sur la performance de notre approche. / Important information for public health is contained within Electronic Health Records (EHRs). The vast majority of clinical data available in these records takes the form of narratives written in natural language. Although free text is convenient for describing complex medical concepts, it is difficult to use for medical decision support, clinical research or statistical analysis. Among all the clinical aspects that are of interest in these records, the patient timeline is one of the most important. Being able to retrieve clinical timelines would allow for a better understanding of some clinical phenomena such as disease progression and longitudinal effects of medications. It would also help to improve medical question answering and clinical outcome prediction systems. Accessing the clinical timeline is needed to evaluate the quality of the healthcare pathway by comparing it to clinical guidelines, and to highlight the steps of the pathway where specific care should be provided. In this thesis, we focus on building such timelines by addressing two related natural language processing topics: temporal information extraction and clinical event coreference resolution. Our main contributions include a generic feature-based approach for temporal relation extraction that can be applied to documents written in English and in French. We devise a neural-based approach for temporal information extraction which includes categorical features. We present a neural entity-based approach for coreference resolution in clinical narratives. We perform an empirical study to evaluate how categorical features and neural network components such as attention mechanisms and token character-level representations influence the performance of our coreference resolution approach.
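A minimal illustration of a feature-based temporal relation classifier of the kind this abstract describes is sketched below; the categorical features, relation labels, and toy training pairs are simplified assumptions, not the thesis's actual feature templates or data.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def pair_features(event, timex):
    # Categorical features describing an (event, temporal expression) pair.
    return {
        "event_tense": event["tense"],              # e.g. PAST, PRESENT
        "event_pos": event["pos"],                  # e.g. VERB, NOUN
        "timex_type": timex["type"],                # e.g. DATE, DURATION
        "same_sentence": str(event["sent"] == timex["sent"]),
    }

# Toy training pairs labelled with temporal relations.
train = [
    (pair_features({"tense": "PAST", "pos": "VERB", "sent": 0},
                   {"type": "DATE", "sent": 0}), "OVERLAP"),
    (pair_features({"tense": "PAST", "pos": "VERB", "sent": 1},
                   {"type": "DATE", "sent": 0}), "BEFORE"),
]
X, y = zip(*train)
model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=200))
model.fit(list(X), list(y))
print(model.predict([pair_features({"tense": "PAST", "pos": "VERB", "sent": 2},
                                   {"type": "DATE", "sent": 2})]))
```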
35

Coreference Resolution for Swedish / Koreferenslösning för svenska

Vällfors, Lisa January 2022 (has links)
This report explores possible avenues for developing coreference resolution methods for Swedish. Coreference resolution is an important topic within natural language processing, as it is used as a preprocessing step in various information extraction tasks. The topic has been studied extensively for English, but much less so for smaller languages such as Swedish. In this report we adapt two coreference resolution algorithms, originally developed for English, for use on Swedish texts. One algorithm is entirely rule-based, while the other uses machine learning. We have also annotated a Swedish dataset to be used for training and evaluation. Both algorithms showed promising results and, as neither clearly outperformed the other, we can conclude that both would be good candidates for further development. For the rule-based algorithm, more advanced rules, especially ones that could incorporate some semantic knowledge, were identified as the most important avenue of improvement. For the machine learning algorithm, more training data would likely be the most beneficial. For both algorithms, improved detection of mention spans would also help, as this was identified as one of the most error-prone components. / I denna rapport undersöks möjliga metoder för koreferenslösning för svenska. Koreferenslösning är en viktig uppgift inom språkteknologi, eftersom det utgör ett första steg i många typer av informationsextraktion. Uppgiften har studerats utförligt för flera större språk, framförallt engelska, men är ännu relativt outforskad för svenska och andra mindre språk. I denna rapport har vi anpassat två algoritmer som ursprungligen utvecklades för engelska för användning på svensk text. Den ena algoritmen bygger på maskininlärning och den andra är helt regelbaserad. Vi har också annoterat delar av Talbankens korpus med koreferensrelationer, för att användas för träning och utvärdering av koreferenslösningsalgoritmer. Båda algoritmerna visade lovande resultat, och ingen var tydligt bättre än den andra. Bägge vore därför lämpliga alternativ för vidareutveckling. För ML-algoritmen vore mer träningsdata den viktigaste punkten för förbättring, medan den regelbaserade algoritmen skulle kunna förbättras med mer komplexa regler, för att inkorporera exempelvis semantisk information i besluten. Ett annat viktigt utvecklingsområde är identifieringen av de fraser som utvärderas för möjlig koreferens, eftersom detta steg introducerade många fel i bägge algoritmerna.
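As a rough illustration of the kind of rule a rule-based resolver might apply (and of why the report calls for rules with semantic knowledge), the sketch below links each pronoun to the nearest preceding mention with matching gender and number; the mention representation and feature values are assumptions, not the report's implementation.

```python
def resolve_pronouns(mentions):
    """Link each pronoun to the nearest preceding non-pronoun mention that
    agrees in gender and number. Mentions are listed in text order; the
    feature values below are illustrative assumptions."""
    links = {}
    for i, mention in enumerate(mentions):
        if mention["type"] != "pronoun":
            continue
        for candidate in reversed(mentions[:i]):
            if (candidate["type"] != "pronoun"
                    and candidate["gender"] == mention["gender"]
                    and candidate["number"] == mention["number"]):
                links[mention["text"]] = candidate["text"]
                break
    return links

mentions = [
    {"text": "Lisa", "type": "name", "gender": "utr", "number": "sing"},
    {"text": "rapporten", "type": "noun", "gender": "utr", "number": "sing"},
    {"text": "hon", "type": "pronoun", "gender": "utr", "number": "sing"},
]
# The purely morphological rule links "hon" to "rapporten" instead of "Lisa",
# illustrating why rules that use semantic knowledge are the suggested next step.
print(resolve_pronouns(mentions))  # {'hon': 'rapporten'}
```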
36

L'acquisition de la coréférence chez les enfants ayant un trouble développemental du langage : revue méta-analytique des facteurs influençant ce phénomène / The acquisition of coreference in children with developmental language disorder: a meta-analytic review of the factors influencing this phenomenon

Murphy-Pilon, Joanie 07 1900 (has links)
Le présent projet vise à mieux comprendre les difficultés reliées à l'acquisition de la coréférence chez les enfants francophones présentant un trouble développemental du langage (TDL) et à déterminer les différents facteurs influençant son acquisition et sa maitrise. La définition actuelle du TDL indique qu'il s'agit d'une difficulté du langage oral qui affecte à la fois la compréhension et l'expression. Il s'agit d'un trouble neurodéveloppemental caractérisé par des retards développementaux très variables dans une ou plusieurs sphères langagières. Deux théories sont vues en détail : la théorie de la complexité des structures syntaxiques (van der Lely et Stollwerck, 1997) et la théorie du déficit de la mémoire de travail (Montgomery et Evans, 2009). La première propose que les difficultés d'utilisation de la coréférence soient dues à la représentation innée de la syntaxe qui serait immature pour les enfants TDL et, en particulier, le principe B qui ne serait pas acquis. En revanche, Montgomery et Evans soutiennent que cette difficulté de compréhension et d'utilisation provient d'une limitation quant à la mémoire de travail plus précisément avec l'allocation et la capacité des ressources attentionnelles. Nous concluons que les différents facteurs influençant l'acquisition de la coréférence chez les enfants ayant un TDL sont les suivants : premièrement, l'enfant doit posséder les connaissances lexicales et sémantiques reliées aux pronoms et aux anaphores ; deuxièmement, l'enfant doit acquérir des connaissances syntaxiques afin de connaitre les antécédents possibles pour les pronoms et les anaphores ainsi que les règles les reliant. Finalement, la mémoire de travail et l'allocation et la capacité des ressources mentales jouent un rôle important dans la résolution des anaphores. Il est donc clair, selon nous, que les théories ne sont pas totalement suffisantes pour expliquer les troubles de la coréférence, mais qu'elles permettent d'expliquer en partie d'autres types de problèmes qui sont nécessaires pour la résolution de l'anaphore. La résolution des anaphores est un phénomène important surtout chez les enfants francophones puisqu'il s'agirait d'un marqueur clinique du trouble en français. / This project aims to better understand the difficulties related to the acquisition of coreference in French-speaking children with developmental language disorder (DLD) and to determine the various factors influencing its acquisition and mastery. The current definition of DLD indicates that it is a spoken-language difficulty that affects both comprehension and expression. It is a neurodevelopmental disorder characterized by highly variable developmental delays in one or more language spheres. Two theories are examined in detail: the computational grammatical complexity (CGC) hypothesis (van der Lely and Stollwerck, 1997) and the working memory-based account (Montgomery and Evans, 2009). The CGC theory proposes that difficulties in using coreference are due to an innate representation of syntax that is immature in children with DLD, and in particular to Principle B not being acquired. In contrast, Montgomery and Evans argue that this difficulty in understanding and using anaphors stems from a limitation in working memory, specifically in the allocation and capacity of attentional resources. The different factors influencing the acquisition of coreference in children with DLD are discussed. First, the child must have the lexical and semantic knowledge related to pronouns and anaphors. Second, the child must acquire syntactic knowledge in order to know the possible antecedents for pronouns and anaphors as well as the rules connecting them. Finally, working memory and the allocation and capacity of mental resources play an important role in anaphora resolution. It is thus clear, in our view, that these theories are not entirely sufficient to explain the coreference deficit, but they do partly explain other types of problems that bear on anaphora resolution. Anaphora resolution is an important phenomenon for a good understanding of developmental language disorder, especially in French-speaking children, since it appears to be a clinical marker of the disorder in French.
37

CorrefSum: revisão da coesão referencial em sumários extrativos / CorrefSum: revising referential cohesion in extractive summaries

Gonçalves, Patrícia Nunes 28 February 2008 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Com o avanço da Internet, cada vez mais convivemos com a sobrecarga de informação. É nesse contexto que a área de sumarização automática de textos tem se tornado uma área proeminente de pesquisa. A sumarização é o processo de discernir as informações mais importantes dos textos para produzir uma versão resumida. Sumarizadores extrativos escolhem as sentenças mais relevantes do texto e as reagrupam para formar o sumário. Muitas vezes, as frases selecionadas do texto não preservam a coesão referencial necessária para o entendimento do texto. O foco deste trabalho é, portanto, na análise e recuperação da coesão referencial desses sumários. O objetivo é desenvolver um sistema que realiza a manutenção da coesão referencial dos sumários extrativos usando como fonte de informação as cadeias de correferência presentes no texto-fonte. Para experimentos e avaliação dos resultados foram utilizados dois sumarizadores: Gist-Summ e SuPor-2. Foram utilizadas duas formas de avaliação: automática e subjetiva. Os resultados / With the advance of Internet technology we face the problem of information overload. In this context, automatic text summarization has become a prominent research area. Summarization is the process of identifying the most relevant information in a text and, on that basis, writing a short version of it. Extractive summarizers choose the most relevant sentences in a text and regroup them to form the summary. Usually, the juxtaposition of the selected sentences violates the referential cohesion that is needed for the interpretation of the text. This work focuses on the analysis and recovery of the referential cohesion of extractive summaries, on the basis of knowledge about the coreference chains present in the source text. Experiments were undertaken with the summarizers GistSumm and SuPor-2, and evaluation was done in two ways, automatically and subjectively. The results indicate that this is a promising area of work, and ways of advancing this research are discussed.
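The repair idea described above (reusing coreference chains from the source text to restore cohesion in an extractive summary) can be sketched as follows; the chain format and the substitution heuristic are illustrative assumptions, not the CorrefSum implementation.

```python
def restore_referential_cohesion(summary_sentences, coref_chains):
    """Substitute the chain's most descriptive mention for a pronoun whose
    antecedent did not make it into the summary. A minimal sketch; a real
    system would work with mention offsets taken from the source text."""
    repaired, seen = [], set()
    for sentence in summary_sentences:
        for chain_id, chain in coref_chains.items():
            for pronoun in chain["pronouns"]:
                if chain_id not in seen and pronoun in sentence.split():
                    sentence = sentence.replace(pronoun, chain["representative"], 1)
                    seen.add(chain_id)
        repaired.append(sentence)
    return repaired

chains = {"e1": {"representative": "A ministra", "pronouns": ["Ela", "ela"]}}
summary = ["Ela anunciou o novo programa ontem."]
print(restore_referential_cohesion(summary, chains))
# -> ['A ministra anunciou o novo programa ontem.']
```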
38

Entity-based coherence in statistical machine translation : a modelling and evaluation perspective

Wetzel, Dominikus Emanuel January 2018 (has links)
Natural language documents exhibit coherence and cohesion by means of interrelated structures both within and across sentences. Sentences do not stand in isolation from each other and only a coherent structure makes them understandable and sound natural to humans. In Statistical Machine Translation (SMT) only little research exists on translating a document from a source language into a coherent document in the target language. The dominant paradigm is still one that considers sentences independently from each other. There is both a need for a deeper understanding of how to handle specific discourse phenomena, and for automatic evaluation of how well these phenomena are handled in SMT. In this thesis we explore an approach how to treat sentences as dependent on each other by focussing on the problem of pronoun translation as an instance of a discourse-related non-local phenomenon. We direct our attention to pronoun translation in the form of cross-lingual pronoun prediction (CLPP) and develop a model to tackle this problem. We obtain state-of-the-art results exhibiting the benefit of having access to the antecedent of a pronoun for predicting the right translation of that pronoun. Experiments also showed that features from the target side are more informative than features from the source side, confirming linguistic knowledge that referential pronouns need to agree in gender and number with their target-side antecedent. We show our approach to be applicable across the two language pairs English-French and English-German. The experimental setting for CLPP is artificially restricted, both to enable automatic evaluation and to provide a controlled environment. This is a limitation which does not yet allow us to test the full potential of CLPP systems within a more realistic setting that is closer to a full SMT scenario. We provide an annotation scheme, a tool and a corpus that enable evaluation of pronoun prediction in a more realistic setting. The annotated corpus consists of parallel documents translated by a state-of-the-art neural machine translation (NMT) system, where the appropriate target-side pronouns have been chosen by annotators. With this corpus, we exhibit a weakness of our current CLPP systems in that they are outperformed by a state-of-the-art NMT system in this more realistic context. This corpus provides a basis for future CLPP shared tasks and allows the research community to further understand and test their methods. The lack of appropriate evaluation metrics that explicitly capture non-local phenomena is one of the main reasons why handling non-local phenomena has not yet been widely adopted in SMT. To overcome this obstacle and evaluate the coherence of translated documents, we define a bilingual model of entity-based coherence, inspired by work on monolingual coherence modelling, and frame it as a learning-to-rank problem. We first evaluate this model on a corpus where we artificially introduce coherence errors based on typical errors CLPP systems make. This allows us to assess the quality of the model in a controlled environment with automatically provided gold coherence rankings. 
Results show that this model can distinguish with high accuracy between a human-authored translation and one with coherence errors, that it can also distinguish between document pairs from two corpora with different degrees of coherence errors, and that the learnt model can be successfully applied when the test set distribution of errors differs from that of the training data, showing its generalization potential. To test our bilingual model of coherence as a discourse-aware SMT evaluation metric, we apply it to more realistic data. We use it to evaluate a state-of-the-art NMT system against post-editing systems with pronouns corrected by our CLPP systems. For verifying our metric, we reuse our annotated parallel corpus and consider the pronoun annotations as a proxy for human document-level coherence judgements. Experiments show far lower accuracy in ranking translations according to their entity-based coherence than on the artificial corpus, suggesting that the metric has difficulties generalizing to a more realistic setting. Analysis reveals that the system translations in our test corpus do not differ in their pronoun translations in almost half of the document pairs. To circumvent this data sparsity issue, and to remove the need for parameter learning, we define a score-based SMT evaluation metric which directly uses features from our bilingual coherence model.
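To make the notion of entity-based coherence more concrete, here is a minimal sketch of entity-transition counting of the kind used in entity-grid coherence models; it is a simplified stand-in for the bilingual learning-to-rank model described above, and the example entity sets are invented.

```python
def entity_transitions(sentence_entities):
    """Count how entities behave between adjacent sentences: carried over,
    newly introduced, or dropped. A crude stand-in for entity-grid
    transition features."""
    counts = {"continue": 0, "introduce": 0, "drop": 0}
    for prev, curr in zip(sentence_entities, sentence_entities[1:]):
        counts["continue"] += len(prev & curr)
        counts["introduce"] += len(curr - prev)
        counts["drop"] += len(prev - curr)
    return counts

def coherence_score(sentence_entities):
    counts = entity_transitions(sentence_entities)
    total = sum(counts.values()) or 1
    # A higher share of continued entities is taken as a sign of coherence.
    return counts["continue"] / total

# Two candidate translations of one document, reduced to per-sentence entity sets.
coherent = [{"minister"}, {"minister", "law"}, {"law"}]
incoherent = [{"minister"}, {"weather"}, {"law"}]
print(coherence_score(coherent) > coherence_score(incoherent))  # True
```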
39

Relações entre memória procedimental e linguagem em pessoas que gaguejam: um estudo com base no processamento da correferência anafórica em português brasileiro / Relations between procedural memory and language in people who stutter: a study based on the processing of anaphoric coreference in Brazilian Portuguese

Correia, Débora Vasconcelos 26 March 2014 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This dissertation aims to explain how coreference processing occurs in people who stutter (PWS), reflecting on the possibility of an association between stuttering and difficulties in procedural memory, based on the relationship between Alm's Dual Premotor Model (2005) and Ullman's Declarative/Procedural Model (2001). A hypothesis is then proposed about the connection between dysfunctions in procedural memory and the linguistic processing of PWS, investigated through the ASRT (Alternating Serial Reaction Time) test of procedural memory and two self-paced reading experiments on inter- and intrasentential coreference. In the ASRT test (experiment 1), carried out to measure the participants' degree of implicit learning, the findings suggested that the two groups (PWS and fluent speakers, FF) tend to behave differently. PWS showed an ascending curve, with a positive Spearman coefficient for the cycle variable, i.e., reaction time increased as the number of cycles (stimuli) increased, which we interpret as a possible difficulty for PWS in the implicit learning of motor sequences. The FF showed a descending curve, confirmed by a negative Spearman coefficient for the cycle variable, demonstrating that procedural learning occurred quickly for this group: the reaction time of the FF decreased as the number of cycles increased. Given these indications that PWS have difficulties in procedural memory, which according to our hypothesis could interfere with the processing of grammatical aspects, we turned to the investigation of linguistic processing. In experiment 2, on intersentential coreference, which investigated the processing of lexical pronouns (PR) and repeated names (NR) in object position in FF and PWS, the results showed no difference in this type of processing between the groups, since both showed similar patterns in the average reading time of the critical segment. However, there was a significant effect for the type-of-retrieval variable, showing that PR are processed faster than NR, as previously found by Leitão (2005). Thus, in order to investigate how grammar functions in PWS and to test the hypothesis defended in this dissertation more categorically, we analyzed coreference at the intrasentential level, aiming to isolate the grammatical aspect and eliminate possible interference from pragmatic and contextual factors. The results pointed to the absence of a main effect for the group variable; however, we found a marginally significant interaction effect between the group and sentence-type variables. This interaction can be explained by the fact that the groups react differently to the conditions: whereas FF are faster in the grammatical condition and slower in the ungrammatical condition, PWS show the opposite pattern, which corroborates our hypothesis that PWS have difficulty perceiving the violation of the grammatical principle. This possibility is supported by the statistical evidence anticipated for our results as the sample size increases, which directs our research toward rejecting the null hypothesis. / Esta dissertação tem por objetivo explanar como se dá o processamento da correferência em pessoas que gaguejam (PQG), refletindo sobre a possibilidade de associação entre a gagueira e a presença de dificuldades na memória procedimental, a partir da relação entre o Modelo Pré-Motor Duplo de Alm (2005) e o Modelo Declarativo/Procedimental de Ullman (2001). Lança-se, então, uma hipótese acerca da conexão entre a presença de disfunções na memória procedimental e o processamento linguístico das PQG, investigada por meio do teste ASRT (Alternating Serial Reaction Time) de memória procedimental e dois experimentos de leitura automonitorada para a investigação do fenômeno da correferência inter e intrassentencial. No teste ASRT (experimento 1) realizado para medir o grau de aprendizagem implícita dos participantes, os resultados encontrados apontaram para uma tendência dos grupos (PQG e FF) a comportarem-se de maneira distinta. As PQG evidenciaram um padrão de curva ascendente, com coeficiente de Spearman positivo para a variável ciclo, expressando um aumento do tempo de reação à medida que se aumentava o número de ciclos (estímulos). O que interpretamos como uma possível dificuldade das PQG na aprendizagem implícita das sequências motoras. E os FF evidenciaram uma curva descendente, confirmada pelo coeficiente de Spearman negativo para a variável ciclo. Demonstrando que a aprendizagem procedimental para este grupo ocorreu de maneira mais rápida, ou seja, o tempo de reação dos FF reduzia à medida que se aumentava o número de ciclos. De posse desses indícios de que as PQG apresentam dificuldades na memória procedimental, o que poderia interferir no processamento dos aspectos gramaticais de acordo com a nossa hipótese, partimos para a investigação do processamento linguístico. No experimento 2, de correferência intersentencial, realizado com o intuito de investigar o processamento do pronome lexical (PR) e do nome repetido (NR) em posição de objeto entre FF e PQG, os resultados obtidos evidenciaram que não há diferença nesse tipo de processamento entre FF e PQG, uma vez que ambos os grupos apresentaram padrões semelhantes no tempo médio de leitura do segmento crítico. No entanto, houve efeito significativo para a variável tipo de retomada, constatando que os PR são mais rapidamente processados do que o NR, conforme já encontrado em Leitão (2005). Dessa forma, a fim de investigar como se dava o funcionamento da gramática nas PQG e atestar de modo mais categórico a hipótese defendida nesta dissertação, partimos para a análise do fenômeno da correferência em nível intrassentencial, objetivando isolar o aspecto gramatical e eliminar as possíveis interferências dos fatores pragmáticos e contextuais. Os resultados obtidos apontaram a ausência de efeito principal para a variável grupo, no entanto, constatou-se um efeito de interação marginalmente significativo entre as variáveis grupo e tipo de sentença. Essa interação pode ser explicada pelo fato de os grupos reagirem diferentemente às condições, partindo da observação que há um comportamento invertido entre eles, ou seja, na medida em que os FFs são mais rápidos na condição gramatical e mais lentos na condição agramatical, as PQG apresentam o padrão oposto. O que corrobora com a nossa hipótese de que as PQG teriam dificuldades na percepção da violação do princípio gramatical. 
Possibilidade essa, confirmada por meio das evidências estatísticas previstas para os nossos resultados com o aumento da amostra, que direciona a nossa pesquisa para a rejeição da hipótese nula.
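For readers unfamiliar with the analysis mentioned above, the sketch below shows how a Spearman correlation between ASRT cycle and reaction time would be computed for each group; the reaction-time values are invented purely for illustration and are not the study's data.

```python
from scipy.stats import spearmanr

# Invented mean reaction times (ms) per ASRT cycle, purely to illustrate the analysis.
cycles = list(range(1, 11))
rt_pws = [520, 525, 531, 529, 540, 544, 551, 549, 558, 563]  # people who stutter
rt_ff = [515, 508, 501, 499, 492, 488, 484, 480, 477, 475]   # fluent speakers

rho_pws, p_pws = spearmanr(cycles, rt_pws)
rho_ff, p_ff = spearmanr(cycles, rt_ff)
print(f"PWS: rho = {rho_pws:.2f} (positive: reaction time rises across cycles)")
print(f"FF:  rho = {rho_ff:.2f} (negative: reaction time falls as the sequence is learned)")
```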
40

Metody extrakce informací / Methods of Information Extraction

Adamček, Adam January 2015 (has links)
The goal of information extraction is to retrieve relational data from texts written in natural human language. Applications of the extracted information are wide-ranging, from text summarization through ontology creation to question answering (QA) systems. This work describes the design and implementation of a system, running on a computer cluster, that transforms a dump of Wikipedia articles into a set of extracted facts stored in a distributed RDF database, which can be queried through the user interface that was created.
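As an illustration of querying such an extracted fact base, a minimal rdflib/SPARQL sketch follows; the triples and URIs are invented examples, not output from the system described in the thesis.

```python
from rdflib import Graph

# A couple of triples of the kind such a pipeline might extract; URIs are invented.
g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:Brno ex:locatedIn ex:Czech_Republic .
ex:Brno ex:population "379526" .
""", format="turtle")

# Ask for everything known about one entity, as a query interface over the store might.
results = g.query(
    "SELECT ?p ?o WHERE { <http://example.org/Brno> ?p ?o . }"
)
for predicate, obj in results:
    print(predicate, obj)
```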
