1

Evaluation of Automatic Text Summarization Using Synthetic Facts

Ahn, Jaewook 01 June 2022
Automatic text summarization has achieved remarkable success with the development of deep neural networks and the availability of standardized benchmark datasets, and it can generate fluent, human-like summaries. However, the unreliability of existing evaluation metrics hinders its practical use and slows its progress. To address this issue, we propose an automatic, reference-less text summarization evaluation system built on dynamically generated synthetic facts. We hypothesize that if a system guarantees a summary that contains all the facts, all of which are 100% known in the synthetic document, it can provide natural interpretability and high feasibility in measuring factual consistency and comprehensiveness. To our knowledge, ours is the first system to measure the overarching quality of text summarization models in terms of factual consistency, comprehensiveness, and compression rate. We validate the system by comparing its correlation with human judgment against that of existing N-gram overlap-based metrics such as ROUGE and BLEU, as well as a BERT-based evaluation metric, BERTScore. In an experimental evaluation of PEGASUS, BART, and T5, our system outperforms the current evaluation metrics in measuring factual consistency by a noticeable margin and shows statistically significant improvements in measuring comprehensiveness and overall summary quality.
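The system itself is not reproduced in this listing, but the abstract's three measures lend themselves to a compact illustration: construct a document from facts that are 100% known by construction, then score a summary against that fact set. The following is a minimal, hypothetical Python sketch of the scoring logic only; the fact templates, exact-match checking, and function names are assumptions, not the thesis's implementation, and exact string matching stands in for the entailment-style fact verification a real system would need.

```python
import random

def make_synthetic_document(n_facts: int = 8, seed: int = 0) -> tuple[str, set[str]]:
    """Generate a document whose facts are 100% known by construction."""
    random.seed(seed)
    names = ["Alice", "Bob", "Carol", "Dave"]
    cities = ["Oslo", "Lima", "Kyoto", "Cairo"]
    facts = {
        f"{random.choice(names)} visited {random.choice(cities)} in {random.randint(2000, 2020)}."
        for _ in range(n_facts)
    }
    return " ".join(sorted(facts)), facts

def evaluate_summary(summary: str, document: str, facts: set[str]) -> dict[str, float]:
    """Score a summary against the known fact set (naive exact matching)."""
    covered = {f for f in facts if f in summary}  # known facts the summary retains
    sentences = [s.strip() + "." for s in summary.split(".") if s.strip()]
    consistent = [s for s in sentences if s in facts]  # summary sentences that are known facts
    return {
        "comprehensiveness": len(covered) / len(facts),
        "factual_consistency": len(consistent) / max(len(sentences), 1),
        "compression_rate": 1 - len(summary) / max(len(document), 1),
    }

doc, facts = make_synthetic_document()
summary = " ".join(sorted(facts)[:3])  # stand-in for a model-generated summary
print(evaluate_summary(summary, doc, facts))
```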
2

Automation of summarization evaluation methods and their application to the summarization process

Nahnsen, Thade January 2011
Summarization is the process of creating a more compact textual representation of a document or a collection of documents. In view of the vast increase in electronically available information sources over the last decade, filters such as automatically generated summaries are becoming ever more important for the efficient acquisition and use of the required information. Various natural language processing (NLP) techniques are used to this end. One of the shallowest approaches is to cluster the available documents and represent each resulting cluster by one of its documents; the Google News website is an example of this approach. The clustering of documents can also be augmented with a summarization process, which results in a more balanced representation of the information in the cluster; NewsBlaster is an example. However, while some systems are already available on the web, summarization is still considered a difficult problem in the NLP community. One of the major problems hampering the development of proficient summarization systems is the evaluation of the true quality of system-generated summaries. This is exemplified by the fact that the current state-of-the-art method for assessing the information content of summaries, the Pyramid evaluation scheme, is a manual procedure. In this light, this thesis has three main objectives:
1. The development of a fully automated evaluation method. The proposed scheme is rooted in the ideas underlying the Pyramid evaluation scheme and makes use of deep syntactic information and lexical semantics; its performance improves notably on previous automated evaluation methods (a sketch of the Pyramid scoring arithmetic follows this entry).
2. The development of an automatic summarization system that draws on the conceptual idea of the Pyramid evaluation scheme and the techniques developed for the proposed evaluation system. The approach features an algorithm for determining the pyramid and bases importance on the number of occurrences of the pyramid's variable-sized contributors, as opposed to the word-based methods exploited elsewhere.
3. The development of a text coherence component that can be used to obtain the best ordering of the sentences in a summary.
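For readers unfamiliar with the Pyramid scheme the thesis automates: each summary content unit (SCU) is weighted by the number of reference summaries expressing it, and a peer summary is scored by the weight it recovers relative to the best achievable weight for a summary of its size. The sketch below, referenced in objective 1, illustrates only this scoring arithmetic; the exact-match SCUs and toy data are assumptions standing in for the deep syntactic and lexical-semantic matching the thesis develops.

```python
from collections import Counter

def pyramid_score(peer_scus: list[str], reference_scu_sets: list[set[str]]) -> float:
    """Pyramid score: weight of SCUs the peer recovers, normalized by the
    maximum weight any summary with the same number of SCUs could attain."""
    # An SCU's weight is the number of reference summaries that contain it.
    weights = Counter(scu for refs in reference_scu_sets for scu in refs)
    observed = sum(weights.get(scu, 0) for scu in set(peer_scus))
    # Ideal score: take the |peer| highest-weighted SCUs from the pyramid.
    ideal = sum(sorted(weights.values(), reverse=True)[: len(set(peer_scus))])
    return observed / ideal if ideal else 0.0

references = [
    {"summaries compress documents", "evaluation is manual", "pyramid weights content units"},
    {"summaries compress documents", "pyramid weights content units"},
    {"summaries compress documents", "evaluation is manual"},
]
peer = ["summaries compress documents", "clustering groups documents"]
print(pyramid_score(peer, references))  # 3 / (3 + 2) = 0.6
```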
3

Summarizing User-generated Discourse

Syed, Shahbaz 04 July 2024
Automatic text summarization is a long-standing task with its origins in summarizing scholarly documents by generating their abstracts. While older approaches mainly focused on generating extractive summaries, recent approaches using neural architectures have advanced the task towards generating more abstractive, human-like summaries. Yet the majority of research in automatic text summarization has focused on summarizing professionally written news articles, owing to the easier availability of large-scale datasets with ground truth summaries in this domain. Moreover, the inverted pyramid writing style enforced in news articles places crucial information in the top sentences, essentially summarizing the article, which allows for a more reliable identification of ground truth when constructing datasets. In contrast, user-generated discourse, such as social media forums or debate portals, has received comparatively little attention despite its evident importance. Possible reasons include the challenges posed by the informal nature of user-generated discourse, which often lacks the rigid structure of news articles, and the difficulty of obtaining high-quality ground truth summaries for this text register. This thesis aims to close this gap with the following novel contributions in the form of datasets, methodologies, and evaluation strategies for automatically summarizing user-generated discourse:
(1) three new datasets for the registers of social media posts and argumentative texts, containing author-provided ground truth summaries as well as crowdsourced summaries for argumentative texts obtained by adapting theoretical definitions of high-quality summaries (a sketch of the author-provided summary signal follows this entry's contents);
(2) methodologies for creating informative as well as indicative summaries of long discussions of controversial topics;
(3) user-centric evaluation processes that emphasize the purpose and provenance of a summary for the qualitative assessment of summarization models; and
(4) tools for facilitating the development and evaluation of summarization models, leveraging visual analytics and interactive interfaces to enable a fine-grained inspection of automatically generated summaries in relation to their source documents.

Contents:
1 Introduction
1.1 Understanding User-Generated Discourse
1.2 The Role of Automatic Summarization
1.3 Research Questions and Contributions
1.4 Thesis Structure
1.5 Publication Record
2 The Task of Text Summarization
2.1 Decoding Human Summarization Practices
2.2 Exploring Automatic Summarization Methods
2.3 Evaluation of Automatic Summarization and its Challenges
2.4 Summary
3 Defining Good Summaries: Examining News Editorials
3.1 Key Characteristics of News Editorials
3.2 Operationalizing High-Quality Summaries
3.3 Evaluating and Ensuring Summary Quality
3.4 Automatic Extractive Summarization of News Editorials
3.5 Summary
4 Mining Social Media for Author-provided Summaries
4.1 Leveraging Human Signals for Summary Identification
4.2 Constructing a Corpus of Abstractive Summaries
4.3 Insights from the TL;DR Challenge
4.4 Summary
5 Generating Conclusions for Argumentative Texts
5.1 Identifying Author-provided Conclusions
5.2 Enhancing Pretrained Models with External Knowledge
5.3 Evaluating Informative Conclusion Generation
5.4 Summary
6 Frame-Oriented Extractive Summarization of Argumentative Discussions
6.1 Importance of Summaries for Argumentative Discussions
6.2 Employing Argumentation Frames as Anchor Points
6.3 Extractive Summarization of Argumentative Discussions
6.4 Evaluation of Extractive Summaries via Relevance Judgments
6.5 Summary
7 Indicative Summarization of Long Discussions
7.1 Table of Contents as an Indicative Summary
7.2 Unsupervised Summarization with Large Language Models
7.3 Comprehensive Analysis of Prompt Engineering
7.4 Purpose-driven Evaluation of Summary Usefulness
7.5 Summary
8 Summary Explorer: Visual Analytics for the Qualitative Assessment of the State of the Art in Text Summarization
8.1 Limitations of Automatic Evaluation Metrics
8.2 Designing Interfaces for Visual Exploration of Summaries
8.3 Corpora, Models, and Case Studies
8.4 Summary
9 Summary Workbench: Reproducible Models and Metrics for Text Summarization
9.1 Addressing the Requirements for Summarization Researchers
9.2 A Unified Interface for Applying and Evaluating State-of-the-Art Models and Metrics
9.3 Models and Measures
9.4 Curated Artifacts and Interaction Scenarios
9.5 Interaction Use Cases
9.6 Summary
10 Conclusion
10.1 Key Contributions of the Thesis
10.2 Open Problems and Future Work
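Chapter 4's corpus construction rests on an author-provided signal: many social media posts end with a "TL;DR" line that the author wrote as a summary of their own post. The following is a minimal, hypothetical sketch of that mining signal, referenced in contribution (1); the regex, length filter, and function name are assumptions, and the real pipeline involves substantially more filtering and normalization than this split.

```python
import re

# Common author-written summary markers (an illustrative, non-exhaustive pattern).
TLDR_PATTERN = re.compile(r"\btl\s*;?\s*dr\s*[:\-]?\s*", re.IGNORECASE)

def split_post(post: str) -> tuple[str, str] | None:
    """Split a post into (content, author-provided summary) at the last
    TL;DR marker, or return None if the post carries no such marker."""
    matches = list(TLDR_PATTERN.finditer(post))
    if not matches:
        return None
    last = matches[-1]
    content, summary = post[: last.start()].strip(), post[last.end():].strip()
    # Discard degenerate pairs: the summary must be shorter than the content.
    if not summary or len(summary) >= len(content):
        return None
    return content, summary

post = ("Spent the weekend debugging our summarizer's beam search; turns out the "
        "length penalty was applied twice, so long outputs were never selected. "
        "TL;DR: double-applied length penalty suppressed long summaries.")
print(split_post(post))
```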
4

ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization

Jaykumar, Nishita 01 June 2016
No description available.
