  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Exploração de métodos de sumarização automática multidocumento com base em conhecimento semântico-discursivo / Exploration of automatic methods for multi-document summarization using discourse models

Cardoso, Paula Christina Figueira 05 September 2014
A sumarização automática multidocumento visa à produção de um sumário a partir de um conjunto de textos relacionados, para ser utilizado por um usuário particular e/ou para determinada tarefa. Com o crescimento exponencial das informações disponíveis e a necessidade das pessoas obterem a informação em um curto espaço de tempo, a tarefa de sumarização automática tem recebido muita atenção nos últimos tempos. Sabe-se que em um conjunto de textos relacionados existem informações redundantes, contraditórias e complementares, que representam os fenômenos multidocumento. Em cada texto-fonte, o assunto principal é descrito em uma sequência de subtópicos. Além disso, as sentenças de um texto-fonte possuem graus de relevância diferentes. Nesse contexto, espera-se que um sumário multidocumento consista das informações relevantes que representem o total de textos do conjunto. No entanto, as estratégias de sumarização automática multidocumento adotadas até o presente utilizam somente os relacionamentos entre textos e descartam a análise da estrutura textual de cada texto-fonte, resultando em sumários que são pouco representativos dos subtópicos textuais e menos informativos do que poderiam ser. A fim de tratar adequadamente a relevância das informações, os fenômenos multidocumento e a distribuição de subtópicos, neste trabalho de doutorado, investigou-se como modelar o processo de sumarização automática usando o conhecimento semântico-discursivo em métodos de seleção de conteúdo e o impacto disso para a produção de sumários mais informativos e representativos dos textos-fonte. Na formalização do conhecimento semântico-discursivo, foram utilizadas as teorias semântico-discursivas RST (Rhetorical Structure Theory) e CST (Cross-document Structure Theory). Para apoiar o trabalho, um córpus multidocumento foi anotado com RST e subtópicos, consistindo em um recurso disponível para outras pesquisas. 
A partir da análise de córpus, foram propostos 10 métodos de segmentação em subtópicos e 13 métodos inovadores de sumarização automática. A avaliação dos métodos de segmentação em subtópicos mostrou que existe uma forte relação entre a estrutura de subtópicos e a análise retórica de um texto. Quanto à avaliação dos métodos de sumarização automática, os resultados indicam que o uso do conhecimento semântico-discursivo em boas estratégias de seleção de conteúdo afeta positivamente a produção de sumários informativos. / Multi-document summarization aims to produce a summary from a set of related texts, to be used by a particular user and/or for a particular task. With the exponential growth of available information and people's need to obtain information in a short time, the task of automatic summarization has received wide attention. It is known that a set of related texts contains redundant, contradictory and complementary information, which constitutes the multi-document phenomena. In each source text, the main subject is described in a sequence of subtopics. Furthermore, some sentences in the same text are more relevant than others. In this context, a multi-document summary is expected to consist of the relevant information that represents the whole set of texts. However, the strategies for automatic multi-document summarization adopted until now have used only the relationships between texts and have disregarded the textual structure of each source text, resulting in summaries that are less representative of the subtopics and less informative than they could be. In order to properly treat the relevance of information, multi-document phenomena and the distribution of subtopics, in this thesis we investigated how to model the summarization process using semantic-discursive knowledge in content selection methods, and its impact on producing summaries that are more informative and more representative of the source texts. 
In order to formalize the semantic-discursive knowledge, we adopted the RST (Rhetorical Structure Theory) and CST (Cross-document Structure Theory) theories. To support the work, a multi-document corpus was annotated with RST and subtopics, yielding a new resource available to other researchers. From the corpus analysis, 10 methods for subtopic segmentation and 13 original methods for automatic summarization were proposed. The assessment of the methods for subtopic segmentation showed that there is a strong relationship between the subtopic structure and the rhetorical analysis of a text. With regard to the assessment of the methods for automatic summarization, the results indicate that the use of semantic-discursive knowledge in good content selection strategies positively affects the production of informative summaries.
72

Résumé de Flots de Données : motifs, Cubes et Hiérarchies / Datastream Summarization : patterns, Data Cubes and Hierarchies

Pitarch, Yoann 10 May 2011
L'explosion du volume de données disponibles due au développement des technologies de l'information et de la communication a démocratisé les flots qui peuvent être définis comme des séquences non bornées de données très précises et circulant à grande vitesse. Les stocker intégralement est par définition impossible. Il est alors essentiel de proposer des techniques de résumé permettant une analyse a posteriori de cet historique. En outre, un grand nombre de flots de données présentent un caractère multidimensionnel et multiniveaux que très peu d'approches existantes exploitent. Ainsi, l'objectif de ces travaux est de proposer des méthodes de résumé exploitant ces spécificités multidimensionnelles et applicables dans un contexte dynamique. Nous nous intéressons à l'adaptation des techniques OLAP (On Line Analytical Processing ) et plus particulièrement, à l'exploitation des hiérarchies de données pour réaliser cette tâche. Pour aborder cette problématique, nous avons mis en place trois angles d'attaque. Tout d'abord, après avoir discuté et mis en évidence le manque de solutions satisfaisantes, nous proposons deux approches permettant de construire un cube de données alimenté par un flot. Le deuxième angle d'attaque concerne le couplage des approches d'extractions de motifs fréquents (itemsets et séquences) et l'utilisation des hiérarchies pour produire un résumé conservant les tendances d'un flot. Enfin, les catégories de hiérarchies existantes ne permettent pas d'exploiter les connaissances expertes dans le processus de généralisation. Nous pallions ce manque en définissant une nouvelle catégorie de hiérarchies, dites contextuelles, et en proposant une modélisation conceptuelle, graphique et logique d'un entrepôt de données intégrant ces hiérarchies contextuelles. Cette thèse s'inscrivant dans un projet ANR (MIDAS), une plateforme de démonstration intégrant les principales approches de résumé a été mise au point. 
En outre, la présence de partenaires industriels tels que Orange Labs ou EDF R&D dans le projet a permis de confronter nos approches à des jeux de données réelles. / Due to the rapid development of information and communication technologies, the amount of generated and available data has exploded and a new kind of data, the data stream, has appeared. One possible and common definition of a data stream is an unbounded sequence of very precise data arriving at a high rate. Thus, it is impossible to store such a stream in its entirety, and summarization techniques are needed to allow a posteriori analysis of its history. Moreover, more and more data streams concern multidimensional and multilevel data, and very few approaches tackle these specificities. Thus, in this work, we proposed practical and efficient solutions to deal with such data in a dynamic context. More specifically, we were interested in adapting OLAP (On Line Analytical Processing) and hierarchy techniques to build relevant summaries of the data. First, after describing and discussing existing similar approaches, we proposed two solutions to build a data cube on stream data more efficiently. Second, we combined frequent patterns and the use of hierarchies to build a summary based on the main trends of the stream. Third, although many types of hierarchies exist in the literature, none of them integrates expert knowledge during the generalization phase. However, such an integration could be very relevant to build semantically richer summaries. We tackled this issue and proposed a new type of hierarchy, namely contextual hierarchies. Along with this new type of hierarchy, we provide a new conceptual, graphical and logical data warehouse model, namely the contextual data warehouse. Finally, since this work was funded by the ANR through the MIDAS project, we evaluated our approaches on real datasets provided by the industrial partners of this project (e.g., Orange Labs or EDF R&D).
73

Sumarização automática de opiniões baseada em aspectos / Automatic aspect-based opinion summarization

Condori, Roque Enrique López 24 August 2015
A sumarização de opiniões, também conhecida como sumarização de sentimentos, é a tarefa que consiste em gerar automaticamente sumários para um conjunto de opiniões sobre uma entidade específica. Uma das principais abordagens para gerar sumários de opiniões é a sumarização baseada em aspectos. A sumarização baseada em aspectos produz sumários das opiniões para os principais aspectos de uma entidade. As entidades normalmente referem-se a produtos, serviços, organizações, entre outros, e os aspectos são atributos ou componentes das entidades. Nos últimos anos, essa tarefa tem ganhado muita relevância diante da grande quantidade de informação online disponível na web e do interesse cada vez maior em conhecer a avaliação dos usuários sobre produtos, empresas, pessoas e outros. Infelizmente, para o Português do Brasil, pouco se tem pesquisado nessa área. Nesse cenário, neste projeto de mestrado, investigou-se o desenvolvimento de alguns métodos de sumarização de opiniões com base em aspectos. Em particular, foram implementados quatro métodos clássicos da literatura, extrativos e abstrativos. Esses métodos foram analisados em cada uma de suas fases e, como consequência dessa análise, produziram-se duas propostas para gerar sumários de opiniões. Essas duas propostas tentam utilizar as principais vantagens dos métodos clássicos para gerar melhores sumários. A fim de analisar o desempenho dos métodos implementados, foram realizados experimentos em função de três medidas de avaliação tradicionais da área: informatividade, qualidade linguística e utilidade do sumário. Os resultados obtidos mostram que os métodos propostos neste trabalho são competitivos com os métodos da literatura e, em vários casos, os superam. / Opinion summarization, also known as sentiment summarization, is the task of automatically generating summaries for a set of opinions about a specific entity. One of the main approaches to generate opinion summaries is aspect-based opinion summarization. 
Aspect-based opinion summarization generates summaries of opinions for the main aspects of an entity. Entities could be products, services, organizations or others, and aspects are attributes or components of them. In recent years, this task has gained much importance because of the large amount of online information available on the web and the increasing interest in learning how users evaluate products, companies, people and others. Unfortunately, for Brazilian Portuguese, there has been little research in this area. In this scenario, this master's project investigated the development of some aspect-based opinion summarization methods. In particular, four classical methods from the literature, both extractive and abstractive, were implemented. These methods were analyzed in each of their phases and, as a result of this analysis, two proposals to generate summaries of opinions were produced. Both proposals attempt to use the main advantages of the classical methods to generate better summaries. In order to analyze the performance of the implemented methods, experiments were carried out according to three traditional evaluation measures: informativeness, linguistic quality and usefulness of the summary. The results show that the methods proposed in this work are competitive with the classical methods and, in several cases, outperform them.
74

Modelo para sumarização computacional de textos científicos. / Scientific text computational summarization model.

Tarafa Guzmán, Alejandro 07 March 2017
Neste trabalho, propõe-se um modelo para a sumarização computacional extrativa de textos de artigos técnico-científicos em inglês. A metodologia utilizada baseia-se em um módulo de avaliação de similaridade semântica textual entre sentenças, desenvolvido especialmente para integrar o modelo de sumarização. A aplicação deste módulo de similaridade à extração de sentenças é feita por intermédio do conceito de uma janela deslizante de comprimento variável, que facilita a detecção de equivalência semântica entre frases do artigo e aquelas de um léxico de frases típicas, atribuíveis a uma estrutura básica dos artigos. Os sumários obtidos em aplicações do modelo apresentam qualidade razoável e utilizável, para os efeitos de antecipar a informação contida nos artigos. / In this work, a model is proposed for the computational extractive summarization of scientific papers in English. Its methodology is based on a semantic textual similarity module that evaluates the equivalence between sentences and was developed specifically to be integrated into the summarization model. This module is applied to sentence extraction through a sliding window of variable length, which facilitates the detection of semantic equivalence between phrases in the article and those in a lexicon of typical phrases attributable to a basic structure of the articles. The summaries obtained with the model show reasonable, usable quality for anticipating the information found in the papers.
75

Modelagem de discurso para o tratamento da concisão e preservação da idéia central na geração de textos / Discourse modeling for conciseness and gist preservation in text generation

Rino, Lucia Helena Machado 26 April 1996
O foco deste trabalho está no processo automático de condensação de uma estrutura complexa de informação e de sua estruturação, para fazê-la apropriada para a expressão textual. A tese principal é que, sem um modelo de discurso, não podemos assegurar a preservação de uma idéia central, pois o processamento do discurso envolve não só a informação, como também metas comunicativas e critérios para ressaltar unidades de informação. Como resultado, os métodos para produzir uma estrutura coerente de discurso de um sumário agregam tanto metas comunicativas quanto informações sobre os inter-relacionamentos entre as unidades de informação, permitindo a organização do discurso com base em restrições progressivas de planejamento. Esse argumento tem duas implicações: a preservação da idéia central deve ser garantida em nível profundo de processamento e sua proeminência deve ser subordinada aos aspectos comunicativos e retóricos. Portanto, esta investigação se baseia em perspectivas intencionais e retóricas. Propomos um modelo de sumarização dirigido por objetivos, cuja função principal é mapear intenções em relações de coerência, observando ainda a dependência semântica indicada pela estrutura complexa de informação. As estruturas de discurso resultantes devem enfatizar a proposição central a veicular no discurso. Em termos teóricos, o aspecto inovador do modelo está na associação de relações de discurso em três níveis distintos de representação: intencionalidade, coerência e semântica. Em termos práticos, a solução proposta sugere o projeto de um planejador de textos que pode tornar a proposição central de um discurso a informação mais proeminente em uma estrutura de discurso e, assim, assegurar a preservação da idéia central durante a condensação de uma estrutura complexa de informação. 
Os resultados experimentais da aplicação desse modelo demonstram que é possível selecionar a informação relevante, distinguindo as unidades de conteúdo da estrutura original que são supérfluas ou complementares para a proposição central, e organizá-la coerentemente com o intuito de alcançar um objetivo comunicativo. Propomos a incorporação do modelo a um sumarizador automático cuja arquitetura é sugerida neste trabalho. / The focus of this work is on the automatic process of condensing a complex information structure and structuring it in such a way as to make it appropriate for textual expression. The main thesis is that without a sound discourse model we cannot guarantee gist preservation, because discourse processing comprises not only information, but also communicative goals and criteria to emphasize units of information. As a result, the methods to produce a coherent discourse structure of a summary aggregate both communicative goals and the inter-relationships between information units, allowing for discourse organization by progressively constraining planning decisions. This argument has two implications, namely that gist preservation must be guaranteed at the deep level of processing and that gist prominence must be subordinated to communicative and rhetorical settings. The current investigation thus relies on intentional and rhetorical perspectives. A goal-driven summarization model is proposed, whose main function is to map intentions onto coherence relations whilst still observing the semantic dependency indicated by the complex input structure. The resulting discourse structures must highlight the central proposition to be conveyed. In theoretical terms, the innovative contribution of the model lies in the association of discourse relations at three different levels of representation: intentionality, coherence and semantics. 
In practical terms, the proposed solution allows for the design of a text planner that can make the central proposition of a discourse the most prominent information in a discourse structure, thus ensuring the preservation of gist during the condensation of a complex information structure. The results of applying this model show that it is possible both to select relevant information, by differentiating content units of the input structure that are superfluous or complementary to the central proposition, and to organize it coherently with the aim of achieving a communicative goal. We propose incorporating the model into an automatic summariser whose architecture is suggested in this thesis.
76

The Effects of Cognitive Styles on Summarization of Expository Text

Mast, Cynda Overton 08 1900
The study investigated the relationship between three cognitive styles and summarization abilities. Both summarization products and processes were examined. Summarizing products were scored and a canonical correlation analysis was performed to determine their relationship with the three cognitive styles. Summarizing processes were examined by videotaping students as they provided think-aloud protocols. Their processes were recorded on composing style sheets and analyzed qualitatively. Subjects were sixth-grade students in self-contained classes in a suburban school district. Summarizing products were collected over a two-week period in the fall. Summarizing processes were collected over an eight-week period in the spring of the same school year. The results of the summarizing products analysis suggest that cognitive styles are related to summarization abilities. Two canonical correlations between the two variable sets (.33 and .29) were statistically significant at the .05 level. The results further suggest that students who are field independent, reflective, and flexible in their attentional style may be more adept at organizing their ideas and using written mechanics while summarizing. Students who are impulsive and constricted in attentional style may exhibit strength in expressing their ideas while summarizing. Results of the summarizing processes analysis suggest that students of one cognitive style combination may exhibit different behaviors while summarizing than those of other cognitive style combinations. Students who are field independent, reflective, and flexible in their attentional style seem to display more mature, interactive behaviors while summarizing than their peers of other cognitive style combinations.
77

Sumarização automática de opiniões baseada em aspectos / Automatic aspect-based opinion summarization

Roque Enrique López Condori 24 August 2015
A sumarização de opiniões, também conhecida como sumarização de sentimentos, é a tarefa que consiste em gerar automaticamente sumários para um conjunto de opiniões sobre uma entidade específica. Uma das principais abordagens para gerar sumários de opiniões é a sumarização baseada em aspectos. A sumarização baseada em aspectos produz sumários das opiniões para os principais aspectos de uma entidade. As entidades normalmente referem-se a produtos, serviços, organizações, entre outros, e os aspectos são atributos ou componentes das entidades. Nos últimos anos, essa tarefa tem ganhado muita relevância diante da grande quantidade de informação online disponível na web e do interesse cada vez maior em conhecer a avaliação dos usuários sobre produtos, empresas, pessoas e outros. Infelizmente, para o Português do Brasil, pouco se tem pesquisado nessa área. Nesse cenário, neste projeto de mestrado, investigou-se o desenvolvimento de alguns métodos de sumarização de opiniões com base em aspectos. Em particular, foram implementados quatro métodos clássicos da literatura, extrativos e abstrativos. Esses métodos foram analisados em cada uma de suas fases e, como consequência dessa análise, produziram-se duas propostas para gerar sumários de opiniões. Essas duas propostas tentam utilizar as principais vantagens dos métodos clássicos para gerar melhores sumários. A fim de analisar o desempenho dos métodos implementados, foram realizados experimentos em função de três medidas de avaliação tradicionais da área: informatividade, qualidade linguística e utilidade do sumário. Os resultados obtidos mostram que os métodos propostos neste trabalho são competitivos com os métodos da literatura e, em vários casos, os superam. / Opinion summarization, also known as sentiment summarization, is the task of automatically generating summaries for a set of opinions about a specific entity. One of the main approaches to generate opinion summaries is aspect-based opinion summarization. 
Aspect-based opinion summarization generates summaries of opinions for the main aspects of an entity. Entities could be products, services, organizations or others, and aspects are attributes or components of them. In recent years, this task has gained much importance because of the large amount of online information available on the web and the increasing interest in learning how users evaluate products, companies, people and others. Unfortunately, for Brazilian Portuguese, there has been little research in this area. In this scenario, this master's project investigated the development of some aspect-based opinion summarization methods. In particular, four classical methods from the literature, both extractive and abstractive, were implemented. These methods were analyzed in each of their phases and, as a result of this analysis, two proposals to generate summaries of opinions were produced. Both proposals attempt to use the main advantages of the classical methods to generate better summaries. In order to analyze the performance of the implemented methods, experiments were carried out according to three traditional evaluation measures: informativeness, linguistic quality and usefulness of the summary. The results show that the methods proposed in this work are competitive with the classical methods and, in several cases, outperform them.
78

The Effects of Explicit Main Idea and Summarization Instruction on Reading Comprehension of Expository Text for Alternative High School Students

Brown, Sally A. 01 August 2018
Secondary students who struggle with reading often have deficits in the area of reading comprehension. The purpose of this study was to examine the effects of explicit main idea and summarization instruction on reading comprehension of expository text for alternative high school students. The lead researcher explicitly taught participants how to summarize expository passages. Participants were taught to generate a big idea topic of a passage, identify key words and phrases, locate or generate main ideas, and generate an oral summary. The three participants increased their performance on the researcher-developed oral summary measure and the summarization guide after receiving the reading comprehension intervention. Furthermore, participants felt they were able to learn how to summarize expository passages, perceived the intervention as effective, and reported that it helped their reading comprehension. Overall, results indicated that the intervention, explicit main idea and summarization instruction aimed at improving reading comprehension, is an effective practice for students who attend alternative high schools.
79

Towards Next Generation Bug Tracking Systems

Velly Lotufo, Rafael 06 June 2013
Although bug tracking systems are fundamental to support virtually any software development process, they are currently suboptimal to support the needs and complexities of large communities. This dissertation first presents a study showing empirical evidence that the traditional interface used by current bug tracking systems invites much noise (unreliable, unhelpful, and disorganized information) into the ecosystem. We find that noise comes not only from low-quality contributions posted by inexperienced users or from conflicts that naturally arise in such ecosystems, but also from the difficulty of fitting the complex bug resolution process and knowledge into the linear sequence of comments that current bug tracking systems use to collect and organize information. Since productivity in bug tracking systems relies on bug reports with accessible and reliable information, this leaves contributors struggling to work on and to make sense of the dumps of data submitted to bug reports, thus hurting productivity. Next generation bug tracking systems should be more than a tool for exchanging unstructured textual comments. They should be an ecosystem that is tailored for collaborative knowledge building, leveraging the power of the masses to collect reliable and useful information about bugs, providing mechanisms and incentives to verify the validity of such information and mechanisms to organize such information, thus facilitating comprehension and reasoning. To bring bug tracking systems towards this vision, we present three orthogonal approaches aiming at increasing the usefulness and reliability of contributions and organizing information to improve understanding and reasoning. To improve the usefulness and reliability of contributions, we propose the addition of game mechanisms to bug tracking systems, with the objective of motivating contributors to post higher-quality content. Through an empirical investigation of Stack Overflow, we evaluate the effects of the mechanisms in such a collaborative software development ecosystem and map a promising approach to use game mechanisms in bug tracking systems. To improve data organization, we propose two complementary approaches. The first is an automated approach to data organization, creating bug report summaries that make reading and working with bug reports easier by highlighting the portions of bug reports that expert developers would focus on if reading the bug report in a hurry. The second approach to improve data organization is a fundamental change in how data is collected and organized, eliminating comments as the main component of bug reports. Instead of comments, users contribute informational posts about bug diagnostics or solutions, and can post contextual comments for each of the different diagnostic or solution posts. Our evaluations with real bug tracking system users find that they consider the bug report summaries to be very useful in facilitating common bug tracking system tasks, such as finding duplicate bug reports. In addition, users found that organizing content through diagnostic and solution posts significantly facilitates reasoning about and searching for relevant information. Finally, we present future directions of work investigating how next generation bug tracking systems could combine the use of the three approaches, such that they benefit from and build upon the results of the other approaches.
80

Sentence Compression by Removing Recursive Structure from Parse Tree

Matsubara, Shigeki, Kato, Yoshihide, Egawa, Seiji 04 December 2008
PRICAI 2008: Trends in Artificial Intelligence 10th Pacific Rim International Conference on Artificial Intelligence, Hanoi, Vietnam, December 15-19, 2008. Proceedings
