
A hybrid approach to automatic text summarization

Automatic text summarization can save users' time when reading text documents. Its objective is to extract the essential sentences that cover almost all the concepts of a document, so that users can grasp the ideas the document addresses simply by reading the corresponding summary. This research develops a hybrid automatic text summarization approach, KCS, to enhance the quality of summaries.
This approach consists of two major components. First, it employs the K-mixture probabilistic model to calculate statistical term weights. Second, it identifies the term relationships between nouns and nouns as well as between nouns and verbs, which yields the connective strength (CS) of nouns. With the connective strengths available, sentences can be scored and ranked for extraction.
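As a rough illustration of the first component, the sketch below computes the K-mixture probability that a term occurs exactly k times in a document, using the standard moment estimates from collection frequency, document frequency, and collection size. The function name and arguments are hypothetical, and the abstract does not say how the thesis turns these probabilities into term weights or combines them with CS, so this shows only the underlying distribution, not the KCS method itself.

```python
def k_mixture_prob(k, cf, df, n_docs):
    """P(term occurs exactly k times in a document) under the K-mixture.

    cf     -- collection frequency (total occurrences of the term)
    df     -- document frequency (documents containing the term)
    n_docs -- number of documents in the collection

    Assumption: standard moment estimates are used, which need cf > df
    (the term repeats within at least one document).
    """
    if cf <= df:
        raise ValueError("moment estimates need cf > df")
    lam = cf / n_docs          # mean occurrences per document
    beta = (cf - df) / df      # extra occurrences per document containing the term
    alpha = lam / beta         # weight of the geometric component
    geometric = (alpha / (beta + 1)) * (beta / (beta + 1)) ** k
    # An extra point mass at k = 0 accounts for documents lacking the term.
    return (1 - alpha) + geometric if k == 0 else geometric


# Hypothetical term: 30 occurrences spread over 10 of 100 documents.
probs = [k_mixture_prob(k, cf=30, df=10, n_docs=100) for k in range(4)]
print([round(p, 4) for p in probs])
```

With these estimates, the probability of zero occurrences equals 1 - df/n_docs, so the model reproduces the observed share of documents that lack the term. Note that the moment estimates can give alpha greater than 1 for terms that rarely repeat within a document; the sketch does not guard against that case.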
We conduct three experiments to validate the proposed approach. The quality of a summary is measured by how much it increases the accuracy of text classification, while the classifier employed, the Naïve Bayes classifier, is kept the same throughout all experiments. The results show that the K-mixture model contributes more to document classification than the traditional TFIDF weighting scheme. It is, however, still no better than CS, a more complex linguistic-based approach. More importantly, our proposed approach, KCS, performs best among all approaches considered. This implies that KCS can extract more representative sentences from the document, which supports its feasibility in text summarization applications.

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-1018107-100507
Date: 18 October 2007
Creators: Yuan, Li-An
Contributors: Hsiao Wen Feng, Sun Pei Chen, Chang Te Min
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-1018107-100507
Rights: unrestricted, Copyright information available at source archive
