271 |
Fouille de documents et d'opinions multilingue / Mining Documents and Sentiments in Cross-lingual Context
Saad, Motaz (20 January 2015)
The aim of this thesis is to study sentiments in comparable documents. First, we collect English, French and Arabic comparable corpora from Wikipedia and Euronews, and align each corpus at the document level. We also gather English-Arabic news documents from local and foreign news agencies: the English documents come from the BBC website and the Arabic documents from the Al Jazeera website. Second, we present a cross-lingual document similarity measure to retrieve and align comparable documents automatically. Then, we propose a cross-lingual sentiment annotation method to label source and target documents with sentiments. Finally, we use statistical measures to compare the agreement of sentiments between the source and target documents of each comparable pair. The methods presented in this thesis do not depend on a particular language pair and can be applied to any other language pair.
|
272 |
Os efeitos das revisões críticas online sobre o mercado cinematográfico americano / The effects of online critical reviews over the American movie market
Souza, Thais Luiza Donega e (26 June 2017)
The movie market can be characterized as an entertainment industry that produces information goods that are also experience goods, whose quality is known only after consumption. Critical reviews therefore become important in inducing consumption, since they provide some information about product quality in advance. We follow the work of Reinstein and Snyder (2005) to determine whether online critical reviews by consumers and by professional critics affect how long movies stay in exhibition in the American market, measured in weeks, using duration/survival models from the literature. For this purpose, an extremely rich database with weekly information on all films available in American cinemas from 2004 to 2015 was built from American movie sites (Box Office Mojo and Rotten Tomatoes). Specifically, we investigate the effects of reviews from top professional critics and from consumers, using the average ratings assigned in each movie's release week. For the consumer reviews, affective computing was applied, which recognizes the sentiment (sentiment analysis) and emotion (emotion mining) in online reviews, in order to capture the word-of-mouth effect amplified by social media and thus provide a deeper analysis of online word of mouth. The study controls for possible endogeneity due to simultaneity by using only reviews written before and during each film's release week. The results suggest that professional critics exert a great influence on how long films stay in exhibition, as does the positivity of consumers toward the film. However, the effect of the professional critics is on average several times greater than that of consumers (a factor of 3 according to the Portuguese abstract, 5 according to the English one). Additionally, some emotions affect movie life expectancy depending on the film's genre.
|
273 |
A semântica da emoção: um estudo contrastivo a partir da FrameNet e da roda das emoções
Foschiera, Silvia Matturro Panzardi (31 July 2012)
The main objective of this research is to ascertain in which respects Frame Semantics (Fillmore, 1982; 1985) and the model known as the Wheel of Emotions (Scherer, 2005) contribute to the relationship between language and the phenomenon of emotion, with regard to Portuguese and Spanish. Frame Semantics, a theoretical perspective linked to Cognitive Linguistics, underlies the semantic and syntactic analysis by means of an exploratory study of the FrameNet machinery (Fillmore et al., 2003). Based on this theoretical framework, we surveyed the frames and frame elements of verbs and adjectives that describe emotion, associating semantic and syntactic categories. We also examined the possibility of mapping the opinion holder and the opinion topic in a corpus of sentences from Twitter. The second theoretical perspective is related to Cognitive Psychology, through the Wheel of Emotions. Considering the semantic features suggested by this tool, we observe to what extent it enriches a Sentiment Analysis study, taking computational applications into account. The Wheel of Emotions serves to identify the polarity of the opinions expressed through the adjectives in the sample sentences. The results show that both perspectives prove productive for computational applications in Sentiment Analysis.
|
275 |
Uma investigação empírica e comparativa da aplicação de RNAs ao problema de mineração de opiniões e análise de sentimentos
Moraes, Rodrigo de (26 March 2013)
The area of Opinion Mining and Sentiment Analysis emerged from the need for automated processing of textual information about opinions posted on the web. Its main motivation is the constant growth in the volume of such information, enabled by the technologies of Web 2.0, which makes it infeasible to monitor and analyze manually these opinions, useful both to users who intend to purchase new products and to companies seeking to identify market demand. Currently, most studies in Opinion Mining and Sentiment Analysis that use data mining focus on developing techniques for better knowledge representation and end up using commonly applied classification techniques, without exploring others that perform well on other problems. This work therefore presents an empirical and comparative investigation of the application of the classical Artificial Neural Network (ANN) model, the multilayer perceptron, to the Opinion Mining and Sentiment Analysis problem. To this end, opinion datasets are defined and textual knowledge representation techniques are applied to them, so that the texts are represented identically for all classifiers, using unigrams. On this representation, the Support Vector Machine (SVM), Naïve Bayes (NB) and ANN classifiers are applied in three dataset contexts: (i) balanced datasets, (ii) datasets with different levels of imbalance and (iii) datasets to which random undersampling is applied to handle the imbalance. Investigating the unbalanced context, and the others derived from it, is relevant because opinion datasets available on the web usually contain more positive opinions than negative ones. The classifiers are evaluated with metrics for both classification performance and run time. The results in the balanced context indicate that ANNs significantly outperform the other classifiers and, despite a high computational cost for training, provide classification times significantly lower than those of the classifier whose classification results came closest to the ANN results. In the unbalanced context, ANNs are sensitive to increased noise in the data representation and to increased imbalance, and the NB classifier stands out in these experiments. With undersampling, ANNs are equivalent to the other classifiers, attaining competitive results. However, they may not be the most appropriate classifier for this context once training and classification times, and the small difference in classification accuracy, are taken into account.
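As a rough illustration of the random undersampling step mentioned in the abstract, the sketch below drops randomly chosen examples from the larger classes until every class is as small as the smallest one. The function name `random_undersample`, the `seed` parameter and the balancing target are assumptions made for illustration; the thesis's actual procedure and imbalance ratios are not given in this record.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Return (sample, label) pairs with every class cut down to the smallest class size.

    Hypothetical helper for illustration only; not taken from the thesis.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the rarest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        # rng.sample draws `target` distinct items without replacement.
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    # Shuffle so that the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    return balanced
```

Calling it on four positive and two negative reviews would keep two of each; the classifier is then trained on the balanced pairs.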
|
276 |
探索美國財務報表的主觀性詞彙與盈餘的關聯性:意見分析之應用 / Exploring the relationships between annual earnings and subjective expressions in US financial statements: opinion analysis applications
陳建良, Chen, Chien Liang (unknown date)
Subjective assertions in financial statements influence the judgments of market participants when they assess the value and profitability of the reporting corporations. Management therefore has a strong incentive to choose its wording carefully, concealing the negative and accentuating the positive. Because manually mining useful information from the very large volume of text in financial statements is often infeasible, we designed an artificial intelligence based strategy to investigate the link between financial status, measured by annual earnings, and subjective multi-word expressions (MWEs) in US financial statements. MWE models capture the semantic context of a sentence better than traditional unigram models, so we applied conditional random field (CRF) models to identify opinion patterns in the form of MWEs, and our approach outperformed previous work employing unigram models. Moreover, our empirical results provide evidence supporting the common belief that there are inconsistencies between the implications of the written statements and the reality indicated by the figures in the financial statements. Negative changes in earnings are often accompanied by statements that play down current shortcomings while firmly promising a glorious future.
|
277 |
Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision
Täckström, Oscar (January 2013)
Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the linguistic structure of interest. However, such complete supervision is currently only available for the world's major languages, in a limited number of domains and for a limited range of tasks. As an alternative, this dissertation considers methods for linguistic structure prediction that can make use of incomplete and cross-lingual supervision, with the prospect of making linguistic processing tools more widely available at a lower cost. An overarching theme of this work is the use of structured discriminative latent variable models for learning with indirect and ambiguous supervision; as instantiated, these models admit rich model features while retaining efficient learning and inference properties. The first contribution to this end is a latent-variable model for fine-grained sentiment analysis with coarse-grained indirect supervision. The second is a model for cross-lingual word-cluster induction and the application thereof to cross-lingual model transfer. The third is a method for adapting multi-source discriminative cross-lingual transfer models to target languages, by means of typologically informed selective parameter sharing. The fourth is an ambiguity-aware self- and ensemble-training algorithm, which is applied to target language adaptation and relexicalization of delexicalized cross-lingual transfer parsers. The fifth is a set of sequence-labeling models that combine constraints at the level of tokens and types, and an instantiation of these models for part-of-speech tagging with incomplete cross-lingual and crowdsourced supervision. In addition to these contributions, comprehensive overviews are provided of structured prediction with no or incomplete supervision, as well as of learning in the multilingual and cross-lingual settings. 
Through careful empirical evaluation, it is established that the proposed methods can be used to create substantially more accurate tools for linguistic processing, compared both to unsupervised methods and to recently proposed cross-lingual methods. The empirical support for this claim is particularly strong in the latter case; our models for syntactic dependency parsing and part-of-speech tagging achieve the hitherto best published results for a wide range of target languages, in the setting where no annotated training data is available in the target language.
|
278 |
巨量資料環境下之新聞主題暨輿情與股價關係之研究 / A Study of the Relevance between News Topics & Public Opinion and Stock Prices in Big Data
張良杰, Chang, Liang Chieh (unknown date)
In recent years, advances in technology, networks and storage media have caused explosive growth in the amount of data generated, heralding the era of big data. With big data, we no longer need to rely on traditional sampling to collect data, and analysis is no longer limited by data too scarce to represent the population. With these limits removed, the essence of big data lies in how to extract valuable information from it.
For example, social network sites (SNS) hold large amounts of public opinion and interpersonal interaction data, and scholars have found that the emotions expressed there correlate positively with stock prices. This thesis applies the same idea to online news, which shares the characteristics of big data. We crawled a total of 30,879 economic news articles published by the Central News Agency between July 2013 and May 2014, and combined topic detection and tracking with sentiment analysis. Based on the similarity between news events, we link the articles into a network and analyze the relationship between news sentiment and the stock price index.
The results show that news events can be linked into specific news topics, that different news topics can be identified within a large network, and that linking news topics together yields a news topic context. This provides a new way to grasp the content of a huge volume of news quickly, and to trace news topics and news events back effectively.
With regard to news sentiment and the stock price index, the study finds that news sentiment influences fluctuations in the stock price index, with a correlation coefficient of 0.733562. A comparison of the sentiment measure with the psychological line and the trading willingness indicator shows that news sentiment can, to a certain degree, serve as a reference for judging stock prices.
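To make the reported correlation coefficient more concrete, here is a minimal sketch of how a correlation between a daily sentiment series and a stock index series could be computed. The abstract does not say which coefficient was used; Pearson's r is assumed here, and the function name `pearson` and its arguments are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences.

    Assumption: the abstract's "correlation coefficient" is Pearson's r.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(var_x * var_y)
```

In this setting `xs` would hold an aggregated sentiment score per trading day and `ys` the index value for the same days; how the thesis aggregates sentiment per day is not described in the abstract.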
|
279 |
對使用者評論之情感分析研究-以Google Play市集為例 / Research into App user opinions with Sentimental Analysis on the Google Play market
林育龍, Lin, Yu Long (unknown date)
Global smartphone shipments continue to grow, and app downloads from the popular app markets have each passed 50 billion. In the iOS and Android app markets, ratings and reviews strongly influence an app's ranking. For app developers, reviews are a way to understand users' needs and to react quickly before complaints escalate. However, with more than a hundred reviews arriving every day, reading them one by one is time-consuming and does not give an integrated view of users' needs and problems.
Sentiment analysis of text generally uses supervised or unsupervised methods. Supervised methods have been shown to reach high accuracy even with simple document quantification, but they cannot anticipate unknown trends and they require labor-intensive manual labeling of document classes.
This study analyzes app reviews along two dimensions, sentiment orientation and popular topics, and proposes a Chinese sentiment analysis method that combines unsupervised and supervised learning. We first use an unsupervised method to label the reviews and present the results visually, and then use a supervised method to build a classification model and verify its effectiveness.
In the experiments, the sentiment lexicon built from Chinese WordNet can indeed be used to determine whether a review is positive or negative, although its performance on negative reviews is poor and needs improvement. For topic extraction, we tried two clustering methods; using NPMI to measure the strength of the relationship between words, together with the CONCOR method from social network analysis, gave good results. In the supervised classification results, accuracy reaches 87% for sentiment orientation and 96% for popular topics.
This study uses Chinese WordNet and social network analysis to develop an unsupervised method for determining review classes in Chinese, and provides a worked example of Chinese sentiment analysis. In addition, comprehensive visual reports help developers understand users' positive and negative feedback, and the classification model can be used to keep track of new reviews, providing app developers with competitive intelligence in the market.
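The abstract names NPMI as the measure of association strength between words. A minimal sketch of the standard normalized pointwise mutual information, computed from co-occurrence counts, is given below; the function name `npmi`, the count-based interface and the handling of the edge cases are assumptions for illustration, not the thesis's implementation.

```python
import math


def npmi(count_xy, count_x, count_y, total):
    """Normalized pointwise mutual information in [-1, 1].

    count_xy: number of reviews containing both words
    count_x, count_y: number of reviews containing each word
    total: total number of reviews
    (Hypothetical interface; counting units are an assumption.)
    """
    if count_xy == 0:
        # The words never co-occur: NPMI takes its lower limit.
        return -1.0

    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total

    if p_xy == 1.0:
        # Both words appear in every review; the normalizer log p(x, y) is 0.
        return 1.0

    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)
```

A value near 1 means the two words almost always appear together, 0 means they co-occur about as often as independence would predict, and -1 means they never co-occur. The resulting word-to-word scores could then feed a clustering step such as the CONCOR method mentioned in the abstract.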
|
280 |
基於語意框架之讀者情緒偵測研究 / Semantic Frame-based Approach for Reader-Emotion Detection
陳聖傑, Chen, Cen Chieh (unknown date)
Previous studies on emotion analysis have mainly focused on the writer's emotional state and rarely on the reader's. Reader emotion is the emotion a reader feels after reading an article; the same article may evoke several emotions in its readers, some quite different from the writer's, which makes reader-emotion analysis a more complex problem. This research aims to identify the emotions readers experience after reading an article. Classifying documents into reader-emotion categories has several applications; one is to retain only the documents that evoke desired emotions, so that users can retrieve documents that are relevant in content and also instill the appropriate emotions. However, current information retrieval systems lack the ability to discern emotion in text, and reader-emotion detection in particular has yet to achieve comparable performance. Moreover, previous machine learning-based approaches are generally not interpretable by humans, so it is difficult to pinpoint the reasons for recognition failures or to understand which kinds of articles trigger which emotions in their readers.
We propose a flexible semantic frame-based approach (FBA) for reader-emotion detection that simulates how humans read an article. FBA is a highly automated process that incorporates various knowledge sources to learn, from raw text, semantic frames that characterize an emotion and are comprehensible to humans. Beyond syntactic structure, it takes the surrounding context and semantic associations into account to merge similar knowledge into discriminative semantic frames. The generated frames are used to predict readers' emotions through an alignment-based matching algorithm that allows a semantic frame to be partially matched via a statistical scoring scheme. Experimental results demonstrate that our approach detects readers' emotions effectively and outperforms well-known statistical text classification methods and the state-of-the-art reader-emotion detection method.
|