
應用情感分析於輿情之研究-以台灣2016總統選舉為例 / A Study of using sentiment analysis for emotion in Taiwan's presidential election of 2016

從2014年九合一選舉到今年總統大選,網路在選戰的影響度越來越大,候選人可透過網路上之熱門討論議題即時掌握民眾需求。
文字情感分析通常使用監督式或非監督式的方法來分析文件,監督式透過文件量化可達很高的正確率,但無法預期未知趨勢,耗費人力標注文章。
本研究針對網路上之政治新聞輿情,提出一個混合非監督式與監督式學習的中文情感分析方法,先透過非監督式方法標注新聞,再用監督式方法建立分類模型,驗證分類準確率。
在實驗結果中,主題標注方面,本研究發現因文本數量遠大於議題詞數量造成TFIDF矩陣過於稀疏,使得TFIDF-Kmeans主題模型分類效果不佳;而NPMI-Concor主題模型分類效果較佳但是所分出的議題詞數量不均衡,然而LDA主題模型基於所有主題被所有文章共享的特性,使得在字詞分群與主題分類準確度都優於TFIDF-Kmeans和NPMI-Concor主題模型,分類準確度高達97%,故後續採用LDA主題模型進行主題標注。
情緒傾向標注方面,證實本研究擴充後的情感詞集比起NTUSD有更好的字詞極性判斷效果,並且進一步使用ChineseWordnet 和 SentiWordNet,找出詞彙的情緒強度,使得在網友評論的情緒計算更加準確。亦發現所有文本的情緒指數皆能反映民調指數,故本研究用文本的情緒指數來建立民調趨勢分類模型。
在關注議題分類結果的實驗,整體正確率達到95%,而在民調趨勢分類結果的實驗,整體正確率達到85%。另外建立全面性的視覺化報告以瞭解民眾的正反意見,提供候選人在選戰上之競爭智慧。 / From the 2014 Taiwanese local elections to the 2016 presidential election, the internet has had a growing influence on election campaigns. Candidates can track public needs in real time through popular discussion topics online.
Text sentiment analysis typically uses supervised or unsupervised methods to analyze documents. Supervised learning can reach high accuracy by quantifying documents, but it cannot anticipate unknown trends, and labeling articles manually is labor-intensive.
In this study, we propose a Chinese sentiment analysis method for online political news that combines unsupervised and supervised learning. First, we use unsupervised methods to label the news articles. Second, we use supervised methods to build a classification model and verify its classification accuracy.
For topic labeling, we found that because the number of documents far exceeds the number of topic words, the TFIDF matrix is too sparse, so the TFIDF-Kmeans topic model classifies poorly. The NPMI-Concor topic model classifies better, but the number of topic words it assigns to each group is unbalanced. The LDA topic model, in which all topics are shared by all articles, outperforms both TFIDF-Kmeans and NPMI-Concor in word clustering and in topic classification accuracy, reaching 97% accuracy. We therefore use the LDA topic model for topic labeling in the later steps.
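The sparsity problem can be illustrated with a minimal sketch. This is not the thesis's code: the tokenized articles, the topic-word vocabulary, and the helper names `tfidf_matrix` and `sparsity` are all hypothetical, and the smoothed IDF formula is one common variant chosen for illustration. When the columns are limited to a short list of topic words, most articles contain few of them, so most entries are zero.

```python
import math

def tfidf_matrix(docs, vocab):
    """Return a dense list-of-lists TF-IDF matrix restricted to `vocab`."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each vocabulary term.
    df = {term: sum(1 for doc in docs if term in doc) for term in vocab}
    matrix = []
    for doc in docs:
        row = []
        for term in vocab:
            tf = doc.count(term) / len(doc) if doc else 0.0
            # Smoothed IDF (an assumed variant) avoids division by zero.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            row.append(tf * idf)
        matrix.append(row)
    return matrix

def sparsity(matrix):
    """Fraction of matrix entries that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0.0)
    return zeros / total if total else 0.0

# Hypothetical tokenized news articles (English stand-ins for Chinese tokens).
docs = [
    ["economy", "wage", "policy", "candidate"],
    ["energy", "nuclear", "debate", "candidate"],
    ["housing", "price", "youth", "vote"],
    ["pension", "reform", "protest", "vote"],
    ["trade", "agreement", "economy", "debate"],
    ["campaign", "rally", "crowd", "speech"],
]
# Hypothetical topic-word vocabulary, much smaller than the corpus.
vocab = ["economy", "energy", "housing", "pension"]

matrix = tfidf_matrix(docs, vocab)
print(f"sparsity = {sparsity(matrix):.2f}")
```

Distance-based clustering such as K-means has little signal to work with when rows are mostly zeros, which is consistent with the poor TFIDF-Kmeans result reported here.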
For sentiment labeling, the expanded sentiment lexicon built in this study judges word polarity better than NTUSD. We further used ChineseWordnet and SentiWordNet to find the sentiment strength of each word, which makes the sentiment calculation for netizens' comments more accurate. We also found that the sentiment index of the texts reflects the polling figures, so we use the sentiment index of the texts to build a poll-trend classification model.
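A lexicon-based sentiment index of this kind could be computed along the following lines. This is a sketch under stated assumptions, not the thesis's formula: the lexicon entries, the strength values, and the averaging rule in the hypothetical `sentiment_index` function are invented for illustration, and the abstract does not say how polarity and strength are actually combined.

```python
# Hypothetical lexicon: word -> (polarity, strength in [0, 1]).
# In the study, polarity comes from the expanded lexicon and strength from
# ChineseWordnet / SentiWordNet; the values here are made up.
LEXICON = {
    "support": (1, 0.6),
    "excellent": (1, 0.9),
    "disappointed": (-1, 0.7),
    "corrupt": (-1, 1.0),
}

def sentiment_index(tokens, lexicon=LEXICON):
    """Average signed strength over the lexicon words found in `tokens`.

    Returns 0.0 when no token appears in the lexicon.
    """
    scores = [lexicon[t][0] * lexicon[t][1] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical tokenized comments (English stand-ins for Chinese tokens).
comments = [
    ["i", "support", "this", "excellent", "policy"],
    ["very", "disappointed", "in", "the", "corrupt", "party"],
    ["the", "debate", "is", "tonight"],
]

for tokens in comments:
    print(" ".join(tokens), "->", round(sentiment_index(tokens), 2))
```

Averaging keeps the index within [-1, 1] regardless of comment length; a sum or a count of positive minus negative words would be equally plausible choices.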
In the topic-of-interest classification experiment, overall accuracy reached 95%; in the poll-trend classification experiment, overall accuracy reached 85%. We also built a comprehensive visual report that shows the public's positive and negative opinions, providing candidates with competitive intelligence for their campaigns.

Identifier: oai:union.ndltd.org:CHENGCHI/G0103356020
Creators: 陳昭元, Chen, Chao-Yuan
Publisher: 國立政治大學
Source Sets: National Chengchi University Libraries
Language: 中文 (Chinese)
Detected Language: English
Type: text
Rights: Copyright © nccu library on behalf of the copyright holders
