  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Combining Granularity-based Topic-Dependent and Topic-Independent Evidences for Opinion Detection

Missen, Malik Muhammad Saad, Boughanem, Mohand, Cabanac, Guillaume 07 June 2011
Opinion mining, a sub-discipline of information retrieval (IR) and computational linguistics, refers to computational techniques for extracting, classifying, understanding, and evaluating the opinions expressed in online news sources, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, and polarity detection. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions that match an information need expressed by the user in the form of a query. Opinion mining raises many problems and challenges. In this thesis, we focus on a few problems of opinion analysis. One of the major challenges of opinion mining is finding opinions that specifically concern the given topic (query). A document can contain information about many topics at once, and it may contain opinionated text about each of the topics or about only a few. It therefore becomes very important to select the segments of the document that are relevant to the topic, together with their corresponding opinions. We address this problem at two levels of granularity: sentences and passages. In our first, sentence-level approach, we use WordNet semantic relations to find this association between topic and opinion. In our second, passage-level approach, we use a more robust IR model, i.e., the language model, to address this problem.
The basic idea behind both contributions on opinion-topic association is that a document containing more textual segments (sentences or passages) that are both opinionated and relevant to the topic is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e., their performance varies from one domain to another. A domain- or topic-independent approach, on the other hand, is more general and can maintain its effectiveness across domains. However, domain-independent approaches generally suffer from poor performance. Developing an approach that is both more effective and more general is a major challenge in opinion mining. The contributions of this thesis include the development of an approach that uses simple heuristic functions to find opinionated documents. Entity-based opinion mining is becoming very popular among researchers in the IR community. It aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of text documents. However, identifying entities and determining their relevance is already a difficult task. We propose a system that takes into account both the information in the current news article and relevant earlier articles in order to detect the most important entities in the current news. In addition, we present our framework for opinion analysis and related tasks. This framework is based on content evidence and social evidence from the blogosphere for the tasks of opinion finding, opinion prediction, and multidimensional ranking. This early contribution lays the groundwork for our future work.
Our methods are evaluated on the TREC 2006 Blog collection and the TREC 2004 Novelty track collection. Most of the evaluations were carried out within the framework of the TREC Blog track.
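
The segment-counting idea in this abstract (a document with more segments that are both opinionated and on-topic ranks as more opinionated) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the keyword-overlap tests stand in for the WordNet relations and the language model the abstract mentions, and `OPINION_WORDS`, `is_relevant`, `is_opinionated`, `opinion_score`, and `rank` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical toy opinion lexicon; a real system would use richer evidence.
OPINION_WORDS = {"love", "hate", "great", "terrible"}

def tokens(segment):
    """Lowercase word set of a sentence or passage."""
    return set(re.findall(r"[a-z']+", segment.lower()))

def is_relevant(segment, query_terms):
    # Assumption: relevance approximated by overlap with the query terms.
    return bool(tokens(segment) & query_terms)

def is_opinionated(segment):
    # Assumption: subjectivity approximated by presence of an opinion word.
    return bool(tokens(segment) & OPINION_WORDS)

def opinion_score(segments, query_terms):
    """Count the segments that are both on-topic and opinionated."""
    return sum(
        1 for s in segments
        if is_relevant(s, query_terms) and is_opinionated(s)
    )

def rank(documents, query_terms):
    """Order document names by decreasing opinion score."""
    return sorted(
        documents,
        key=lambda name: opinion_score(documents[name], query_terms),
        reverse=True,
    )
```

Here `documents` is assumed to map a document name to its list of segments, so the same function works for sentence-level or passage-level segmentation.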
62

Application of common sense computing for the development of a novel knowledge-based opinion mining engine

Cambria, Erik January 2011
The ways people express their opinions and sentiments have radically changed in the past few years thanks to the advent of social networks, web communities, blogs, wikis and other online collaborative media. The distillation of knowledge from this huge amount of unstructured information can be a key factor for marketers who want to create an image or identity in the minds of their customers for their product, brand, or organisation. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions, in fact, involves a deep understanding of natural language text by machines, from which we are still very far. Hitherto, online information retrieval has been mainly based on algorithms relying on the textual representation of web pages. Such algorithms are very good at retrieving texts, splitting them into parts, checking the spelling and counting their words. But when it comes to interpreting sentences and extracting meaningful information, their capabilities are known to be very limited. Existing approaches to opinion mining and sentiment analysis, in particular, can be grouped into three main categories: keyword spotting, in which text is classified into categories based on the presence of fairly unambiguous affect words; lexical affinity, which assigns arbitrary words a probabilistic affinity for a particular emotion; and statistical methods, which calculate the valence of affective keywords and word co-occurrence frequencies on the basis of a large training corpus. Early works aimed to classify entire documents as containing overall positive or negative polarity, or rating scores of reviews. Such systems were mainly based on supervised approaches relying on manually labelled samples, such as movie or product reviews where the opinionist’s overall positive or negative attitude was explicitly indicated.
However, opinions and sentiments do not occur only at document level, nor are they limited to a single valence or target. Contrary or complementary attitudes toward the same topic or multiple topics can be present across the span of a document. In more recent works, text analysis granularity has been taken down to segment and sentence level, e.g., by using the presence of opinion-bearing lexical items (single words or n-grams) to detect subjective sentences, or by exploiting association rule mining for a feature-based analysis of product reviews. These approaches, however, are still far from being able to infer the cognitive and affective information associated with natural language, as they mainly rely on knowledge bases that are still too limited to efficiently process text at sentence level. In this thesis, common sense computing techniques are further developed and applied to bridge the semantic gap between word-level natural language data and the concept-level opinions conveyed by these. In particular, the ensemble application of graph mining and multi-dimensionality reduction techniques on two common sense knowledge bases was exploited to develop a novel intelligent engine for open-domain opinion mining and sentiment analysis. The proposed approach, termed sentic computing, performs a clause-level semantic analysis of text, which allows the inference of both the conceptual and emotional information associated with natural language opinions and, hence, a more efficient passage from (unstructured) textual information to (structured) machine-processable data. The engine was tested on three different resources, namely a Twitter hashtag repository, a LiveJournal database and a PatientOpinion dataset, and its performance was compared both with results obtained using standard sentiment analysis techniques and with results obtained using different state-of-the-art knowledge bases such as Princeton’s WordNet, MIT’s ConceptNet and Microsoft’s Probase.
Unlike most currently available opinion mining services, the developed engine does not base its analysis on a limited set of affect words and their co-occurrence frequencies, but rather on common sense concepts and the cognitive and affective valence conveyed by these. This allows the engine to be domain-independent and, hence, to be embedded in any opinion mining system for the development of intelligent applications in multiple fields such as Social Web, HCI and e-health. Looking ahead, the combined novel use of different knowledge bases and of common sense reasoning techniques for opinion mining proposed in this work will eventually pave the way for the development of more bio-inspired approaches to the design of natural language processing systems capable of handling knowledge, retrieving it when necessary, making analogies and learning from experience.
63

K lingvistické struktuře emocionálního významu v češtině / On the Linguistic Structure of Emotional Meaning in Czech

Veselovská, Kateřina January 2015
Title: On the Linguistic Structure of Emotional Meaning in Czech
Author: Mgr. Kateřina Veselovská
Department: Institute of Formal and Applied Linguistics
Supervisor: Prof. PhDr. Eva Hajičová, DrSc., Institute of Formal and Applied Linguistics
Keywords: emotional meaning, linguistic structure, sentiment analysis, opinion mining, evaluative language
Abstract: This thesis has two main goals. First, we provide an analysis of the language means that together form the emotional meaning of written utterances in Czech. Second, we employ the findings concerning emotional language in computational applications. We provide a systematic overview of lexical, morphosyntactic, semantic and pragmatic aspects of emotional meaning in Czech utterances. We also propose two formal representations of emotional structures within the frameworks of the Prague Dependency Treebank and Construction Grammar. Regarding the computational applications, we focus on sentiment analysis, i.e., the automatic extraction of emotions from text. We describe the creation of manually annotated emotional data resources in Czech and perform two main sentiment analysis tasks, polarity classification and opinion target identification, on Czech data. In both of these tasks, we reach state-of-the-art results.
64

Tell me why : uma arquitetura para fornecer explicações sobre revisões / Tell me why : an architecture to provide rich review explanations

Woloszyn, Vinicius January 2015
What other people think has always been an important part of the decision-making process. For instance, people usually consult their friends to get an opinion about a book, a movie, or a restaurant. Nowadays, users publish their opinions on collaborative reviewing sites such as IMDB for movies, Yelp for restaurants, and TripAdvisor for hotels. Over time, these sites have built a massive database that connects users, items, and opinions, expressed as a numeric rating and a free-text review that explains why users like or dislike a specific item. But this vast amount of data can make it harder for a user to form an opinion. Several related works provide interpretations of reviews to users. They offer different advantages for various types of summaries. However, they all have the same limitation: they neither provide personalized summaries nor contrast reviews written by different segments of reviewers. Understanding and contrasting reviews written by different segments of reviewers is still an open research problem. Our work proposes a new architecture, called Tell Me Why (TMW), a project developed at the Grenoble Informatics Laboratory in cooperation with the Federal University of Rio Grande do Sul to give users a better understanding of reviews. We propose a combination of text analysis of reviews with mining of the structured data that results from crossing reviewer and item dimensions. Additionally, this work investigates summarization methods used in the review domain. The output of our architecture consists of personalized text statements, produced using Natural Language Generation, that are composed of item attributes and summarized comments and explain people's opinions about a particular item.
The results of a comparative evaluation against Amazon's Most Helpful Review show that the approach is promising and useful in users' opinion.
67

Recognition and Linking of Product Mentions in User-generated Contents

Vieira, Henry Silva 25 September 2018
Online social media has grown into an essential part of our daily life. Through these media, users exchange information that they generate using many different communication mechanisms. In this context, more and more users pass on and trust information published by other users on a large variety of topics, including opinions and information about products. Automatically extracting and processing user-generated information in social media can provide relevant information and knowledge to a variety of interesting applications. In particular, one of the content analysis techniques most often applied to social media is opinion mining. One of the basic tasks associated with opinion mining is extracting and categorizing target entities, i.e., identifying entity mentions in text and linking these mentions to the unique real-world entities about which the opinions are expressed. In our work, we focus on target entities of a specific, and currently relevant, type: consumer electronic products. Such products are the main subject of opinions posted by users in discussion forums and on retail sites across the Web. In this work, we are interested in using the unstructured textual content generated by social media users to continuously enrich the knowledge about products represented in product catalogs. The task we address is therefore how to recognize mentions of products in user-generated textual content and link them to the catalog product they refer to. We claim that two basic sub-tasks arise: first, extracting target entity mentions from unstructured textual content; second, disambiguating the extracted entities, i.e., linking extracted mentions to their real-world counterparts. We developed methods to address these two sub-tasks. This thesis details these tasks, discusses the ideas behind the methods we developed, and presents our contributions and results towards this goal.
Supported by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) and FAPEAM (Fundação de Amparo à Pesquisa do Estado do Amazonas).
68

多重插補法在線上使用者評分之應用 / Managing online user-generated product reviews using multiple imputation methods

李岑志, Li, Cen Jhih Unknown Date
Online user-generated product reviews have become a rich source of product quality information for both producers and customers. As a result, many e-commerce websites allow customers to rate products with scores, and some also accept text comments. However, people usually comment only on the features they care about and may omit those already mentioned by earlier customers. Consequently, missing data arise when the text comments are quantified into feature scores. In addition, customers may comment on features that influence neither their satisfaction nor sales volume. It is therefore important to find the significant features so that manufacturers can fix the most important defects. Our research focuses on modeling customer reviews and their influence on overall ratings. We aim to understand whether, once missing values are filled in, the critical features can be identified and the feature ratings faithfully reflect customer opinion. Many previous studies impute the whole dataset at once without considering that customer reviews may be influenced by earlier reviews. We propose a method based on multiple imputation, verify it through simulation, and use it to fill in customer reviews of three generations of Canon digital cameras (SX210, SX230, and SX260) on Amazon. The results indicate that the method accurately estimates the influence of each feature on the overall product rating.
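
The abstract does not say how estimates from the imputed datasets are combined. In multiple imputation generally, the usual pooling step is Rubin's rules: average the per-dataset estimates, then add the within-imputation and between-imputation variance. The Python sketch below shows only that generic step, under the assumption that each imputed dataset has already produced one coefficient estimate and its variance; `pool_rubin` and its argument names are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient estimated on each imputed dataset
    variances: the squared standard error of each of those estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations of equal length")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divides by m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total
```

For example, a feature's regression weight on the overall rating would be pooled by passing the m fitted weights and their squared standard errors.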
69

Combining Lexicon- and Learning-based Approaches for Improved Performance and Convenience in Sentiment Classification

Sommar, Fredrik, Wielondek, Milosz January 2015
Sentiment classification is the process of sorting data into categories based on its polarity, and it has a wide array of applications across several industries. This report examines a combination of two prominent approaches to sentiment classification, one using a lexicon of weighted words and the other using machine learning. These approaches are compared with the combined hybrid approach in order to give an account of their relative strengths and weaknesses. When run on a set of IMDb movie reviews, the learning-based approach performs best, followed by the hybrid model and then the lexicon-based approach. However, the convenience gained by eliminating the need for training data makes the hybrid model an appealing alternative, at the cost of a slight trade-off in performance.
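
As a rough illustration of the lexicon-based side of this comparison, the Python sketch below sums the weights of known words and reads the sign of the total as the polarity. The four-word `LEXICON`, its weights, and the function names are invented for the example; the report's actual lexicon and the way it feeds the hybrid model are not given in the abstract.

```python
# Hypothetical weighted lexicon; real lexicons hold thousands of entries.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the weights of the words in the text that appear in the lexicon."""
    words = (tok.strip(".,!?;:") for tok in text.lower().split())
    return sum(lexicon.get(word, 0.0) for word in words)

def classify(text, lexicon=LEXICON):
    """Map the summed weight to a polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A classifier of this kind needs no labelled training data, which is the convenience the abstract weighs against the higher accuracy of the learning-based approach.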
70

Analyse des sentiments : système autonome d'exploration des opinions exprimées dans les critiques cinématographiques

Dziczkowski, Grzegorz 04 December 2008
This thesis describes the study and development of a system designed to evaluate the sentiment of film reviews. Such a system enables:
- automatic retrieval of reviews from the Internet,
- evaluation and rating of the opinions expressed in film reviews,
- publication of the results.

To improve the results of predictive algorithms, the system is intended to serve as a support system for prediction engines that analyze user profiles. First, the system searches for and retrieves probable film reviews from the Internet, in particular those written by prolific commentators.

Next, the system evaluates and rates the opinion expressed in these film reviews in order to automatically assign a numeric score to each review; this is the main goal of the system. The last step is to group the reviews (together with their scores) with the user who wrote them in order to build complete profiles, and to make these profiles available to prediction engines.

In developing this system, the research in this thesis focused mainly on sentiment rating; this work falls within the fields of opinion mining and sentiment analysis. Our system uses three different methods for classifying opinions. We present two new methods: one based on linguistic knowledge and one at the boundary between statistical and linguistic processing. The results are then compared with the statistical method based on the Bayes classifier, which is widely used in the field. The results obtained must then be combined to make the final evaluation as accurate as possible. For this task we used a fourth classifier based on neural networks.

Our sentiment rating, that is, the rating of reviews, uses a scale from 1 to 5. Such a rating requires deeper linguistic analysis than the commonly used binary rating (positive or negative, or possibly subjective or objective).

This thesis presents all the modules of the designed system in overview and the opinion-rating part in more detail. In particular, we highlight the advantages of deep linguistic analysis, which is used less often in sentiment analysis than statistical analysis.
