基於意見探勘與主題模型之部落格食記剖析研究 / A Study of Opinion Mining and Topic Model Analysis on Food Diaries
賴柏帆 (Lai, Po Fan), Unknown Date
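With the rise of Web 2.0, social networking sites account for a large share of how information is spread and obtained. In the food domain, it is increasingly common for people to read dining reviews before visiting a restaurant, and blog posts, which combine text and photographs, are often among the sources consumers consult and compare. Although such food diaries are more complete than short reviews, the evaluative comments are scattered throughout the post and most posts carry no rating, so readers cannot grasp the overall picture at a glance and must read carefully before they can judge the restaurant as a whole.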
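This study proposes a method for analyzing food diaries based on opinion mining and topic models. The amount of sentiment in blog posts about each restaurant is used to reflect positive and negative evaluations, the comments are grouped into three rating aspects ("food", "service", and "environment"), and an overall recommendation score for the restaurant is then provided for readers' quick reference. The experimental corpus was drawn from food posts on PIXNET (痞客邦) about four restaurants: 添好運台灣-台北站前店, 京星港式飲茶PART2, 金泰日式料理-內湖店, and 喀佈貍(一店)大眾和風串燒居酒洋食堂, for a total of 200 posts.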
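The LDA topic model is used to cluster the food diary text by topic, so that sentences with similar topical content fall into the same group, and each group is then assigned to an aspect. For example, the corpus for 喀佈貍(一店) divides into 10 topic groups of sentences: 6 for the food aspect and 2 each for the service and environment aspects. In addition, to identify positive and negative sentiment in food diaries more effectively, this study uses the semantic orientation method (SO-PMI) to compute the polarity of sentiment words that appear frequently in food diaries, and from these builds an opinion word lexicon for the domain.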
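The topic-based grouping of sentences can be illustrated with a small collapsed Gibbs sampler for LDA. This is only a sketch under stated assumptions, not the procedure used in the study: each sentence is treated as one LDA "document", the toy sentences are already tokenized English words (real food diaries would first need Chinese word segmentation), and the function name `lda_gibbs` and the values of `alpha`, `beta`, and `iterations` are chosen purely for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]  # tokens per (doc, topic)
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics  # tokens per topic
    assign = []  # topic assigned to every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return assign, doc_topic, topic_word


# Hypothetical, pre-tokenized sentences standing in for food diary text.
sentences = [
    ["dumpling", "shrimp", "tasty"],
    ["bun", "pork", "tasty"],
    ["waiter", "friendly", "refill"],
    ["waiter", "slow", "queue"],
    ["seat", "bright", "clean"],
    ["seat", "noisy", "crowded"],
]
assign, doc_topic, topic_word = lda_gibbs(sentences, num_topics=3)

# Group sentence indices by their dominant topic.
groups = {}
for d, counts in enumerate(doc_topic):
    dominant = max(range(len(counts)), key=counts.__getitem__)
    groups.setdefault(dominant, []).append(d)
```

Each resulting group would then still have to be labeled as food, service, or environment; how that assignment is made is not described here, so the sketch stops at the grouping step.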
For the experimental results, the online restaurant review site iPeen (愛評網) served as the validation target. The average sentiment of the two corpora is similar, which indicates agreement in public perception and evaluation. Moreover, compared with general review sites, this study can examine restaurants at a finer-grained aspect level and use the amount of sentiment to reflect actual restaurant evaluations. Finally, directions for future investigation and improvement are offered for subsequent research. / With the rise of Web 2.0, social media platforms play a crucial role in transmitting and receiving information. More and more people are used to reading related posts before dining out. Because of their rich content and accompanying photographs, blog posts are frequently used for reference. Although blog posts are more complete than short reviews, the actual evaluations are scattered among purely descriptive text, and there is no rating scale to refer to. Together, these make it hard for readers to form an overview of a review efficiently and therefore to decide whether to go to the restaurant.
Our study offers a method of analyzing food diaries based on opinion mining and topic models. The amount of sentiment in blog posts about a restaurant is used to reflect whether its reviews are positive or negative. The comments are categorized into food, service, and environment, and the restaurant is scored on these three aspects to give the user an overall recommendation score.
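As a rough illustration of how per-aspect scores might roll up into one recommendation score, the sketch below averages sentence-level sentiment within each aspect and then averages the aspect scores. The equal weighting, the sentiment range, and the function names `aspect_scores` and `overall_score` are assumptions made for this example; the abstract does not state the actual scoring formula.

```python
from collections import defaultdict

ASPECTS = ("food", "service", "environment")


def aspect_scores(labeled_sentences):
    """Mean sentiment per aspect; aspects with no sentences are omitted."""
    by_aspect = defaultdict(list)
    for aspect, sentiment in labeled_sentences:
        if aspect in ASPECTS:
            by_aspect[aspect].append(sentiment)
    return {a: sum(v) / len(v) for a, v in by_aspect.items()}


def overall_score(scores, weights=None):
    """Weighted mean of aspect scores (equal weights assumed by default)."""
    if not scores:
        raise ValueError("no aspect scores to combine")
    weights = weights or {a: 1.0 for a in scores}
    total_weight = sum(weights[a] for a in scores)
    return sum(scores[a] * weights[a] for a in scores) / total_weight


# Hypothetical (aspect, sentiment) pairs for one restaurant.
labeled = [
    ("food", 1.0),
    ("food", 0.5),
    ("service", -0.5),
    ("environment", 0.25),
]
scores = aspect_scores(labeled)
recommendation = overall_score(scores)
```

Passing a `weights` dictionary lets one aspect count for more than another, for instance if food comments should dominate the recommendation; that choice is left open here.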
We collected a total of 200 articles on 4 restaurants from PIXNET, then grouped their contents by theme using the LDA (Latent Dirichlet Allocation) model. Sentences with similar themes are put into one group, and each group is then assigned to one of the three aspects mentioned earlier. In addition, to better distinguish whether the sentiment in a food diary is positive or negative, our study calculated the polarity of common opinion words in food diaries using semantic orientation (SO-PMI) and built an opinion lexicon specifically for food diaries.
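The SO-PMI idea can be sketched as follows: a word's polarity is its pointwise mutual information with a set of positive seed words minus its PMI with a set of negative seed words. The version below is a minimal sketch that assumes sentence-level co-occurrence, additive smoothing to avoid taking the log of zero, and English toy tokens with one seed word per polarity; the seed lists, the co-occurrence window, and the smoothing actually used in the study are not given in the abstract.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(sentences, pos_seeds, neg_seeds, smoothing=0.01):
    """Return {word: SO-PMI score} from sentence-level co-occurrence."""
    n = len(sentences)
    word_count = Counter()  # number of sentences containing each word
    pair_count = Counter()  # number of sentences containing both words
    for tokens in sentences:
        unique = set(tokens)
        word_count.update(unique)
        pair_count.update(
            frozenset(pair) for pair in combinations(sorted(unique), 2)
        )

    def pmi(a, b):
        # log2( P(a, b) / (P(a) * P(b)) ), with a smoothed joint count.
        joint = pair_count[frozenset((a, b))] + smoothing
        return math.log2(joint * n / (word_count[a] * word_count[b]))

    scores = {}
    for word in word_count:
        if word in pos_seeds or word in neg_seeds:
            continue
        pos = sum(pmi(word, s) for s in pos_seeds if s in word_count)
        neg = sum(pmi(word, s) for s in neg_seeds if s in word_count)
        scores[word] = pos - neg
    return scores


# Hypothetical tokenized sentences and seed words.
corpus = [
    ["dumpling", "juicy", "delicious"],
    ["soup", "juicy", "delicious"],
    ["service", "slow", "terrible"],
    ["waiter", "slow", "terrible"],
    ["dumpling", "delicious"],
]
polarity = so_pmi(corpus, pos_seeds={"delicious"}, neg_seeds={"terrible"})
```

In this toy corpus "juicy" only co-occurs with the positive seed and "slow" only with the negative one, so their scores come out positive and negative respectively. Words whose score exceeds a chosen threshold in either direction could then be added to a domain lexicon.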
For the results, we used iPeen, a restaurant rating website, as the reference for validation. The average sentiment scores of the restaurants obtained with our method are close to those on iPeen, which suggests that they are close to public opinion. Furthermore, compared with common rating websites, our study addresses finer-grained aspects and uses the cumulative sentiment to reflect the blog authors' actual evaluation of the restaurant. Lastly, we outline the issues we intend to explore and improve, for the reference of future research.