[pt] ARQUITETURA PROFUNDA PARA EXTRAÇÃO DE CITAÇÕES / [en] DEEP ARCHITECTURE FOR QUOTATION EXTRACTION

[en] Quotation Extraction and Attribution is the task of identifying quotations in a given text and associating them with their authors. In this work, we present a Quotation Extraction and Attribution system for the Portuguese language. The task has previously been approached using various techniques and for a variety of languages and datasets. Traditional models for this task extract a rich set of hand-designed features and feed them to a shallow classifier. In this work, unlike the traditional approach, we avoid hand-designed features, instead using unsupervised learning techniques and deep neural networks to automatically learn features relevant to the task. Because it avoids hand-designed features, our machine learning model is easily adaptable to other languages and domains. Our model is trained and evaluated on the GloboQuotes corpus, where it achieves an F1 score of 89.43 percent.

[pt] REDE NEURAL

[pt] APRENDIZADO PROFUNDO

[pt] EXTRACAO DE CITACOES

[pt] PROCESSAMENTO DE LINGUAGEM NATURAL

[pt] APRENDIZADO DE MAQUINA

[en] NEURAL NETWORKS

[en] DEEP LEARNING

[en] QUOTATION EXTRACTION

[en] NATURAL LANGUAGE PROCESSING

[en] MACHINE LEARNING

Identifier	oai:union.ndltd.org:puc-rio.br/oai:MAXWELL.puc-rio.br:30734
Date	28 July 2017
Creators	LUIS FELIPE MULLER DE OLIVEIRA HENRIQUES
Contributors	RUY LUIZ MILIDIU
Publisher	MAXWELL
Source Sets	PUC Rio
Language	English
Detected Language	English
Type	TEXTO
