本研究嘗試建置一個包含兩個主要功能的系統,分別是網路評價搜尋以及情感分類。在網路評價搜尋的部份,我們使用Google搜尋並蒐集一攜帶型智慧裝置(智慧型手機、平板電腦與筆記型電腦)的網路評價搜尋結果;情感分類的部分則是將搜尋結果依照對該產品的意見分類為,共有正面/負面/中立、正面/負面、正面/非正面,以及負面/非負面等四種分類方式。為了建置此系統,我們首先從知名的網路論壇Mobile01和批踢踢蒐集和攜帶型智慧裝置有關的網路文章以及產品名稱,接著以人工的方式標記每篇文章,以及部分文章中的句子的情感。本研究設計了兩個層次的情感分類實驗,我們首先從語句層次出發,以監督式機器學習法訓練將句子分為正面/負面/中立等三個類別的分類模型後,再進入文章層次,將句子的意見彙整,並同樣以監督式機器學習法訓練四種不同文章層次的分類模型:正面/負面/中立、正面/負面、正面/非正面,以及負面/非負面。我們分別選出四種分類實驗中表現最佳的模型,並用於系統建置,其中表現最佳的是分類為正面/負面的分類模型,平均的F-measure為0.87;其次是分類為負面/非負面的模型,對負面類別的F-measure為0.83;接著是分類為正面/非正面的模型,對正面類別的F-measure為0.81;表現最差的是正面/負面/中立的分類,平均的F-measure為0.77。在正面/負面分類的準確率上,本研究的表現並不壞於過去以英文為主要語言的相關研究。最後,我們也以過去不經過語句層次的分類方法進行實驗並比較,其結果發現經過語句層次的情感分類比不經過語句層次的情感分類較佳。 / In this research, we implemented a system that retrieves the search results of mobile phones, tablets, and notebooks from Google, and then classifies them as: (1) positive, negative, or neutral, (2) positive or negative, (3) positive or non-positive, (4) negative or non-negative. To build this system, first we collected some documents about mobile phones, tablets, and notebooks on two popular web forums: mobile01.com and ptt.cc. Next, a sentiment label (positive, negative, or neutral) is attached to each document and each sentence of these documents. We designed a two-level supervised sentiment classification experiment. At sentence level, we trained classifiers that classify sentences as positive, negative, or neutral. The best sentence classifier was then used at document level. At document level, the sentiment labels of the sentences in documents are used. We trained classifiers in four different classification problems: (1) positive, negative, or neutral, (2) positive vs. negative, (3) positive vs. non-positive, (4) negative vs. non-negative. The best is the second classifier with an average F-measure of 0.87. The next is the fourth classifier with an F-measure of 0.83 on negative class, and then comes with the third classifier with an F-measure of 0.81 on positive class. The last is the first classifier with an average F-measure of 0.77. Our accuracy is not worse than the past English study on the classification of positive vs. negative. Finally, we conducted another classification experiment using document-level-only classification method, and the results showed that our two-level sentiment classification (first sentence level, then document level) outperforms document-level-only sentiment classification.
Identifer | oai:union.ndltd.org:CHENGCHI/G0101356016 |
Creators | 黃泓彰, Huang, Hung Chang |
Publisher | 國立政治大學 |
Source Sets | National Chengchi University Libraries |
Language | 中文 |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |
Page generated in 0.0029 seconds