
應用文字探勘於影評文章自動摘要之研究 / A Study on Application of Text Mining for Automatic Text Summarization of Film Review

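With the rise of the internet, people facing a difficult choice not only take in word-of-mouth information but also search online by keyword for what they need. In the flood of massive data, however, integrating information quickly is a major challenge. Summaries of film reviews can help people learn about a film before going to the cinema and so confirm that it is one they are interested in.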
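This study uses film reviews as its data source: 66 reviews (4,616 sentences) of Avengers 2 (復仇者聯盟2), 60 reviews (9,345 sentences) of Batman v Superman: Dawn of Justice (蝙蝠俠對超人:正義曙光), 60 reviews (5,545 sentences) of Zootopia (動物方城市), 50 reviews (4,616 sentences) of Interstellar (星際效應), and 62 reviews (5,622 sentences) of The Intern (高年級實習生). It generates review summaries by combining clustering with sentence extraction. The K-Means algorithm first clusters the feature terms and sentences of the reviews of the five films. TFIDF then rates the importance of the sentences in each cluster so that high-weight sentences can be selected, and the WWA method picks sentences that cover different aspects within a cluster. Finally, the similarity between the best template and the content of each cluster determines the order of the clusters, producing a multi-document review summary whose paragraphs and paragraph order resemble those of the template.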
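The results show that the original reviews of the five films have a similarity of 15.87% to the best template, whereas the summaries produced by this method have a similarity of 21.19% to the single-document summary of the best template. In addition, because the clusters are ordered by comparing their similarity with the best template, the whole summary follows a paragraph order similar to the template's. Its content covers the similar material that the reviews mention most widely, and the different paragraphs of similar content give the summary broader coverage. With this method, automatically compiled and extracted summaries can help people grasp information about a film quickly and support their decisions. / Abstract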
In the era of Big Data, there is too much information on the web for readers to take in. Processing and summarizing the essential information quickly is a challenge. People who want to see a movie face the same situation: before choosing, they search for relevant information about the movies, but film reviews are scattered all over the web. Automatic text summarization can efficiently extract the important information for readers and condense the ideas in the reviews. With this method, readers can easily grasp the main ideas of all the reviews and save time.
This research presents a multi-concept, extractive film review summary for readers. It generates the summary from reviews on the most popular blog platform, PIXNET, using an extraction-based method and a clustering concept. For a specific film, the method uses the K-Means algorithm to cluster the review sentences by their features, then uses a statistical weight (TFIDF) and the WWA method to measure sentence weights and choose representative sentences. In the last step, it compares the clusters with templates to decide their order and assembles the representative sentences from each cluster into the summary.
For the five movies, the results show that the summarization method raises the average similarity to the template summaries from 15.87% for the original film reviews to 21.19% for the generated summaries. This indicates that automatic film review summarization can extract the important sentences from the reviews. In addition, ordering the clusters by comparison with the template lets the method list the sentence clusters in sequence to form a movie review that is easy to comprehend and saves readers' time.
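The pipeline described in the abstract (cluster the sentences, pick a high-weight sentence per cluster, order the clusters against a template) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, the deterministic K-Means seeding, and the rule "rank clusters by cosine similarity to the template" are all assumptions made for illustration. The WWA step is omitted because the abstract does not define it, and real Chinese reviews would need a word segmenter instead of `split()`.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed IDF per term, computed over the tokenised review sentences."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def unit_vector(tokens, idf, vocab):
    """Dense, L2-normalised TF-IDF vector; terms outside the vocabulary are ignored."""
    tf = Counter(tokens)
    vec = [tf[term] * idf[term] for term in vocab]
    norm = math.sqrt(sum(w * w for w in vec))
    return [w / norm for w in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k vectors so runs are repeatable."""
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(v, centers[c])))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def summarize(sentences, template, k):
    """Return one representative sentence per cluster, ordered against the template."""
    docs = [s.lower().split() for s in sentences]
    idf = build_idf(docs)
    vocab = sorted(idf)
    vectors = [unit_vector(doc, idf, vocab) for doc in docs]
    labels, centers = kmeans(vectors, k)
    template_vec = unit_vector(template.lower().split(), idf, vocab)

    picked = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if not members:
            continue
        # Assumed sentence weight: the sum of the TF-IDF weights of its terms.
        best = max(members, key=lambda i: sum(idf[t] for t in docs[i]))
        picked.append((cosine(centers[c], template_vec), sentences[best]))
    # Assumed ordering rule: clusters most similar to the template come first.
    picked.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in picked]


# Toy English data standing in for review sentences and a template summary.
sentences = [
    "the plot feels slow",
    "action scenes look great",
    "slow plot and weak story",
    "great action and great stunts",
]
print(summarize(sentences, "great action and stunts", k=2))
```

On this toy input the sentences split into a plot cluster and an action cluster, and the action cluster is listed first because it is closer to the template. Summing term weights favors longer sentences, so a length-normalised score may be a better choice on real reviews.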

Identifier: oai:union.ndltd.org:CHENGCHI/G0103356032
Creators: 鄧亦安, Teng, I An
Publisher: 國立政治大學 (National Chengchi University)
Source Sets: National Chengchi University Libraries
Language: Chinese
Detected Language: English
Type: text
Rights: Copyright © nccu library on behalf of the copyright holders
