Automatic Text Summarization Using Importance of Sentences for Email Corpus

abstract: With the advent of the Internet, the volume of data added online is increasing at an enormous rate. Although search engines use IR techniques to serve users' search requests, the results are often not effective in answering the user's query. A search engine user has to go through several webpages before reaching the one he/she wanted. This problem of Information Overload can be addressed using Automatic Text Summarization. Summarization is the process of obtaining an abridged version of a document so that the user can quickly understand what the document is about. Email threads from W3C are used in this system. In addition to common IR features such as Term Frequency and Inverse Document Frequency, the system implements Term Rank, a variation of PageRank based on a graph model that can cluster words with respect to word ambiguity. Term Rank also considers the possibility of co-occurrence of words within the corpus and evaluates the rank of each word accordingly. Sentences of email threads are ranked according to these features, and summaries are generated. The system implements the concept of pyramid evaluation in content selection. The system can be considered a framework for Unsupervised Learning in text summarization. / Dissertation/Thesis / Masters Thesis Computer Science 2015
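The abstract does not give the feature formulas, so the following is only a minimal sketch of the general idea it describes: score words with TF, IDF, and a PageRank-style rank over a word co-occurrence graph, then rank sentences by those scores. Everything specific here is an assumption made for illustration and is not taken from the thesis: the function names (`tokenize`, `idf_scores`, `term_rank`, `summarize`), treating each sentence as a document for IDF, linking words that co-occur in the same sentence, the damping factor of 0.85, and combining the three features by multiplication.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def idf_scores(tokenized):
    """Smoothed IDF, treating each sentence as one document (assumption)."""
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    return {w: math.log(n / df[w]) + 1.0 for w in df}


def term_rank(tokenized, damping=0.85, iterations=30):
    """PageRank-style scores on an undirected word co-occurrence graph.

    Two words are linked when they appear in the same sentence; the edge
    weight counts how many sentences they share (illustrative choice).
    """
    words = sorted({w for tokens in tokenized for w in tokens})
    if not words:
        return {}
    weights = defaultdict(Counter)
    for tokens in tokenized:
        unique = sorted(set(tokens))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                weights[a][b] += 1
                weights[b][a] += 1
    rank = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        new_rank = {}
        for w in words:
            # Each neighbour v passes on rank in proportion to edge weight.
            incoming = sum(
                rank[v] * weights[v][w] / sum(weights[v].values())
                for v in weights[w]
            )
            new_rank[w] = (1 - damping) / len(words) + damping * incoming
        rank = new_rank
    return rank


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    tokenized = [tokenize(s) for s in sentences]
    idf = idf_scores(tokenized)
    rank = term_rank(tokenized)
    scored = []
    for idx, tokens in enumerate(tokenized):
        if not tokens:
            continue
        tf = Counter(tokens)
        # Hypothetical combination: length-normalised TF * IDF * rank.
        score = sum((tf[w] / len(tokens)) * idf[w] * rank[w] for w in tf)
        scored.append((score, idx))
    top = sorted(scored, reverse=True)[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]
```

Calling `summarize(thread_sentences, k=2)` on a list of sentences from one email thread returns two of those sentences in their original order. A real system would also need sentence splitting, stop-word handling, and removal of quoted replies, none of which is shown here.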

http://hdl.handle.net/2286/R.I.34929

Computer science

Data Mining

Machine Learning

Natural Language Processing

Pyramid Evaluation

Term Rank

Text Summarization

Identifier	oai:union.ndltd.org:asu.edu/item:34929
Date	January 2015
Contributors	Nadella, Sravan (Author), Davulcu, Hasan (Advisor), Li, Baoxin (Committee member), Sen, Arunabha (Committee member), Arizona State University (Publisher)
Source Sets	Arizona State University
Language	English
Detected Language	English
Type	Masters Thesis
Format	41 pages
Rights	http://rightsstatements.org/vocab/InC/1.0/, All Rights Reserved
