  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Mineração de opiniões baseada em aspectos para revisões de produtos e serviços / Aspect-based Opinion Mining for Reviews of Products and Services

Yugoshi, Ivone Penque Matsuno 27 April 2018
Opinion Mining is a process that aims to extract opinions and their sentiment polarities from natural language texts. The area has gained prominence because of the volume of opinions that users share on the Internet, such as reviews on e-commerce sites, social networks, and tweets. Aspect-based Opinion Mining is a promising alternative for analyzing sentiment polarity at a finer level of detail. Traditional methods for aspect extraction and sentiment classification require domain experts to create lexicons or define extraction rules for different languages and domains. In addition, such methods usually rely on supervised learning algorithms, which require a large set of labeled data to induce a classification model. The challenges addressed in this doctoral thesis concern how to reduce the human effort needed both (i) to label data and (ii) to handle domain dependency in the tasks of aspect extraction and aspect sentiment classification for Opinion Mining. To reduce the need for a large number of labeled examples, a semi-supervised approach called Aspect-based Sentiment Propagation on Heterogeneous Networks (ASPHN) was proposed, with text representations in which linguistic attributes, candidate aspects, and sentiment labels are modeled by heterogeneous networks. To reduce the effort of building domain-specific resources, a cross-domain transfer learning approach called Cross-Domain Aspect Label Propagation through Heterogeneous Networks (CD-ALPHN) was proposed; it uses labeled data from other domains to support learning tasks in domains without labeled data. This approach comprises a heterogeneous network representation and a label propagation method. The vertices of the network are the labeled aspects of the source domain, the linguistic attributes, and the candidate aspects of the target domain. In addition, aspect extraction methods were analyzed and some variations were proposed for unsupervised and domain-independent scenarios. The solutions proposed in this thesis were evaluated and compared with state-of-the-art solutions using collections of reviews of different products and services. The results of the experimental evaluations are competitive and show that the proposed solutions are promising.
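The abstract names a heterogeneous network and a label propagation method but gives no algorithmic detail. The following is only a minimal sketch of generic label propagation on such a network, not the CD-ALPHN method itself. The node names, the edge list, the seed scores (+1 for "is an aspect", -1 for "is not an aspect"), and the neighbour-averaging update rule are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Generic label propagation: average neighbour scores, keep seeds fixed.

    edges: iterable of (node, node) pairs in an undirected network.
    seeds: dict mapping labeled nodes to a score (+1.0 aspect, -1.0 non-aspect).
    Both inputs are hypothetical; the thesis abstract does not specify them.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Unlabeled nodes start with a neutral score of 0.0.
    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # labeled source-domain nodes are clamped
            else:
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores


# Hypothetical toy network: source-domain nodes (src:), linguistic
# attributes (attr:), and target-domain candidates (tgt:).
edges = [
    ("src:battery", "attr:noun-after-adjective"),
    ("tgt:screen", "attr:noun-after-adjective"),
    ("src:the", "attr:function-word"),
    ("tgt:of", "attr:function-word"),
]
seeds = {"src:battery": 1.0, "src:the": -1.0}
scores = propagate_labels(edges, seeds)
print({node: round(score, 3) for node, score in scores.items()})
```

In this toy example the target-domain candidate that shares a linguistic attribute with a labeled source aspect ends up with a positive score, which is the cross-domain intuition the abstract describes.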
72

Um framework para reconhecimento de opinião utilizando Sistema de Informação Geográfica (SIG): um estudo de caso na geração de mapas / A framework for opinion recognition using the Geographic Information System (GIS): a case study in the generation of maps

Nunes Neto, Gilberto 19 August 2016
With the globalization of the Internet, the number of people using social media keeps growing, and the social network Twitter is a good example. Twitter is often used to post comments on all kinds of subjects, such as artists, products, and public health. The spread of information through these media is highly relevant because it can reach people of all social classes, at any time and anywhere in the world. Besides this reach, Twitter supports georeferenced comments, which makes it possible to locate where posts were made. Several studies propose using posts obtained from Twitter to evaluate how well these media reflect the real world. In this sense, the present work proposes a generic framework that, besides evaluating concepts related to opinion mining, describes case studies that analyze sources of textual opinions and proposes mining opinions at the aspect level, using Twitter comments as the opinion source. A prototype extends and implements the proposed framework to enable opinion mining in social networks. The results show the feasibility of using this framework to support decision making by its users.
73

網路評價搜尋結果的正負意見分類系統 / A sentiment classification system on search results of web opinions

黃泓彰, Huang, Hung Chang Unknown Date
In this research, we built a system with two main functions: web opinion search and sentiment classification. For opinion search, we use Google to collect search results about portable smart devices (smartphones, tablets, and notebooks). For sentiment classification, the search results are classified by their opinion of the product in four ways: (1) positive, negative, or neutral; (2) positive or negative; (3) positive or non-positive; (4) negative or non-negative. To build the system, we first collected articles and product names related to portable smart devices from two popular web forums, Mobile01 (mobile01.com) and PTT (ptt.cc). We then manually attached a sentiment label to each article and to the sentences of some of the articles. We designed a two-level supervised sentiment classification experiment. At the sentence level, we trained classifiers that label sentences as positive, negative, or neutral. The best sentence classifier was then used at the document level, where the sentiment labels of the sentences in a document are aggregated and supervised classifiers were trained for the four classification problems listed above. The best model from each of the four experiments was used to build the system. The best-performing classifier is positive vs. negative, with an average F-measure of 0.87; next is negative vs. non-negative, with an F-measure of 0.83 on the negative class; then positive vs. non-positive, with an F-measure of 0.81 on the positive class; the weakest is positive, negative, or neutral, with an average F-measure of 0.77. On positive vs. negative classification, our accuracy is no worse than that of earlier studies conducted mainly on English text. Finally, we ran a comparison experiment using a document-level-only method, and the results show that the two-level classification (sentence level first, then document level) outperforms document-level-only classification.
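The two-level design can be illustrated with a small sketch. The abstract does not say how sentence labels are turned into document-level features, so the label proportions below are an assumption, and the threshold rule is only a stand-in for the trained document-level classifier. The function names are hypothetical.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def document_features(sentence_labels):
    """Aggregate sentence-level labels into per-label proportions.

    This feature choice is an assumption; the abstract only states that
    sentence sentiments are aggregated at the document level.
    """
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    return {label: (counts[label] / total if total else 0.0) for label in LABELS}


def classify_positive_vs_negative(features):
    """Stand-in rule for the trained positive vs. negative document classifier."""
    return "positive" if features["positive"] >= features["negative"] else "negative"


# Sentence labels as they might come out of the sentence-level classifier.
sentence_labels = ["positive", "positive", "negative", "neutral"]
features = document_features(sentence_labels)
print(features, classify_positive_vs_negative(features))
```

In a setup like the one described, the feature dictionary would be passed to a supervised learner instead of the fixed rule.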
74

De l'extraction des connaissances à la recommandation / From knowledge extraction to recommendation

Duthil, Benjamin 03 December 2012
Information technology and the success of its related services (blogs, forums, specialized sites, etc.) have paved the way for the mass expression of opinions on the most varied subjects (e-commerce websites, art reviews, etc.). This abundance of opinions is a gold mine for internet users, but it can quickly become a source of indecision because the available opinions may be widely disparate or even contradictory. Reliable and relevant management of the information in these opinions requires systems able to process opinions expressed in natural language directly, in order to control their subjectivity and to avoid the smoothing effects of statistical processing. Most so-called recommender systems do not take into account the full semantic richness of reviews and often pair them with rating systems that demand substantial involvement and specific skills from the user. Our aim is to minimize human intervention in the collaborative functioning of recommender systems by automating the processing of the raw data that natural-language reviews constitute. Our unsupervised topic segmentation approach extracts the subjects of interest from the reviews, and our sentiment analysis technique then computes the opinion expressed on these criteria. Combined with multicriteria analysis tools adapted to the fusion of expert assessments, these knowledge extraction methods pave the way for relevant, reliable, and personalized recommender systems.
75

Modeling and mining of web discussions / Modélisation et fouille de discussions de Web

Stavrianou, Anna 01 February 2010
The development of Web 2.0 has generated a vast amount of online discussions. Mining and extracting quality knowledge from them is important for the industrial and marketing sectors and for e-commerce applications. Discussions of this kind capture people's interests and beliefs, hence the great interest in developing tools for analyzing online discussions. The objective of this thesis is to define a model that represents online discussions and facilitates their analysis. We propose a graph-oriented model. The vertices of the graph represent postings. Each posting encapsulates information such as the content of the message, its author, the opinion polarity of the message, and the time at which it was posted. The edges between postings indicate a "reply-to" relation; in other words, they show which posting replies to which, as given by the structure of the online discussion.

The proposed model is accompanied by a number of measures that facilitate mining the discussion and extracting knowledge from it. Structure-oriented measures are defined by the structure of the discussion and the way the postings are linked to each other. Opinion-oriented measures deal with the evolution of opinion within a discussion. Time-oriented measures exploit the temporal dimension of the model, while topic-oriented measures can be used to measure the presence of topics within a discussion. The user's presence in online discussions can be exploited either by social network techniques or through the new model, which encapsulates knowledge about the author of each posting.

Representing an online discussion in this way allows a user to "zoom" into the discussion. A list of key messages is recommended to the user to enable more efficient participation. Additionally, a prototype system has been implemented that allows the user to mine online discussions by selecting a subset of postings and browsing through them efficiently.
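The graph-oriented model described in this abstract can be sketched as a small data structure. The class names, field names, and the two example measures below are assumptions chosen for illustration; the thesis defines its own measures, which the abstract does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Posting:
    """A vertex of the discussion graph (field names are hypothetical)."""
    posting_id: str
    author: str
    content: str
    polarity: int                   # -1 negative, 0 neutral, +1 positive
    posted_at: datetime
    reply_to: Optional[str] = None  # id of the posting this one replies to


class Discussion:
    """Postings as vertices, "reply-to" relations as directed edges."""

    def __init__(self):
        self.postings = {}
        self.replies = {}  # posting id -> ids of postings that reply to it

    def add(self, posting):
        self.postings[posting.posting_id] = posting
        self.replies.setdefault(posting.posting_id, [])
        if posting.reply_to is not None:
            self.replies.setdefault(posting.reply_to, []).append(posting.posting_id)

    def reply_count(self, posting_id):
        """A simple structure-oriented measure: number of direct replies."""
        return len(self.replies.get(posting_id, []))

    def polarity_over_time(self):
        """A simple opinion- and time-oriented view: polarities in posting order."""
        ordered = sorted(self.postings.values(), key=lambda p: p.posted_at)
        return [p.polarity for p in ordered]


thread = Discussion()
thread.add(Posting("p1", "ana", "Great phone.", 1, datetime(2010, 2, 1, 9, 0)))
thread.add(Posting("p2", "bob", "Battery is poor.", -1, datetime(2010, 2, 1, 9, 5), "p1"))
thread.add(Posting("p3", "eve", "Which model?", 0, datetime(2010, 2, 1, 9, 10), "p1"))
print(thread.reply_count("p1"), thread.polarity_over_time())
```

Keeping the author, polarity, and timestamp on each vertex is what lets structure-, opinion-, and time-oriented measures all be computed from the same graph.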
76

Modélisation conjointe des thématiques et des opinions : application à l'analyse des données textuelles issues du Web / Joint topic-sentiment modeling : an application to Web data analysis

Dermouche, Mohamed 08 June 2015
This thesis lies at the junction of two domains: topic modeling and opinion mining (sentiment analysis). The problem we address is the joint and dynamic modeling of topics (subjects) and opinions (stances) on the Web and social media. In the literature, this problem is usually divided into sub-tasks that are handled separately, which fails to capture the associations and interactions between opinions and the topics they target. In this thesis, we are interested in joint and dynamic modeling that integrates three dimensions of text: topics, opinions, and time. To this end, we adopt a statistical approach based on probabilistic topic models. Our main contributions can be summarized in two points:

1. TS (Topic-Sentiment model): a new probabilistic topic model for the joint extraction of topics and sentiments. The model characterizes the extracted topics with distributions over sentiment polarities. The goal is to estimate, from a collection of documents, the sentiment proportions specific to each extracted topic.
2. TTS (Time-aware Topic-Sentiment model): a new probabilistic model that characterizes the temporal evolution of topics and sentiments. Relying on the documents' time information (creation date), TTS characterizes the quantitative evolution of each extracted topic-sentiment pair, that is, the variation in data volume over time.

We also present two other contributions: a new measure for evaluating and comparing topic-extraction methods, and a new hybrid method for opinion classification that combines supervised machine learning with prior knowledge. All of the proposed methods are tested on real-world data using suitable evaluations.
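The two quantities this abstract describes, sentiment proportions per topic and the volume of each topic-sentiment pair over time, can be illustrated by plain counting. This sketch does not perform the probabilistic inference of the TS or TTS models; it assumes that each text unit has already been assigned a topic, a sentiment, and a time period, and the triples below are invented toy data.

```python
from collections import Counter, defaultdict


def sentiment_proportions(assignments):
    """Per-topic distribution over sentiment polarities, by simple counting."""
    counts = defaultdict(Counter)
    for topic, sentiment, _period in assignments:
        counts[topic][sentiment] += 1
    return {
        topic: {s: n / sum(c.values()) for s, n in c.items()}
        for topic, c in counts.items()
    }


def volume_over_time(assignments):
    """Number of occurrences of each (topic, sentiment) pair per time period."""
    volume = defaultdict(Counter)
    for topic, sentiment, period in assignments:
        volume[(topic, sentiment)][period] += 1
    return {pair: dict(sorted(c.items())) for pair, c in volume.items()}


# Hypothetical (topic, sentiment, period) assignments.
assignments = [
    ("battery", "negative", "2014-01"),
    ("battery", "negative", "2014-02"),
    ("battery", "positive", "2014-02"),
    ("screen", "positive", "2014-01"),
]
proportions = sentiment_proportions(assignments)
volumes = volume_over_time(assignments)
print(proportions)
print(volumes)
```

A probabilistic topic model would estimate these distributions jointly from the raw text rather than from pre-assigned labels.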
77

Um modelo para predição de bolsa de valores baseado em mineração de opinião / A model for stock market prediction based on opinion mining

Lima, Milson Louseiro 06 May 2016
Funding: Fundação de Amparo à Pesquisa e ao Desenvolvimento Científico e Tecnológico do Maranhão.

Predicting the behavior of stocks on the stock market is a challenging task, often tied to unknown factors or influenced by variables of very different natures, ranging from high-profile news to collective sentiment expressed in social network posts. Such market volatility can mean considerable financial losses for investors. To anticipate these variations, other mechanisms for predicting the behavior of assets on the stock market have been proposed, based on data from pre-existing indicators. Such mechanisms analyze only statistical data and do not consider collective human sentiment. This work aims to develop a model for stock market prediction based on opinion mining, using artificial intelligence techniques such as natural language processing (NLP) and Support Vector Machines (SVM) to predict the behavior of an asset. It should be emphasized, however, that the model is intended as an aid in the decision-making process involved in buying and selling shares on the stock market.
79

Analyse d'opinion dans les interactions orales / Opinion analysis in speech interactions

Barriere, Valentin 15 April 2019
Recognizing a speaker's opinions in a spoken interaction is a crucial step toward improving communication between a human and a virtual agent. This thesis addresses opinion phenomena in natural, spontaneous spoken interactions from the standpoint of automatic speech processing (traitement automatique de la parole, TAP). Opinion analysis is rarely addressed in speech processing, which until recently focused on emotions using vocal and non-verbal content. Moreover, most recent systems do not use the interactional context to analyze the speaker's opinions. This thesis examines these issues. We work within the framework of automatic detection using statistical learning models. After a study on modeling the dynamics of opinion within a monologue using a latent-state model, we study how to integrate the dialogic interactional context, and finally how to combine audio with text using different types of fusion. We worked on a database of vlogs at the level of overall sentiment, then on a database of multimodal dyadic interactions composed of open conversations, at the level of the speech turn and of the pair of speech turns. Finally, we had a database annotated for opinion, because the existing databases were not satisfactory for the task addressed and did not allow a clear comparison with other state-of-the-art systems.
At the dawn of the major change brought by the advent of neural methods, we study different types of representations: the older hand-crafted representations, rigid but precise, and the newer statistically learned representations, general and semantic. We study different segmentations that take into account the asynchronous nature of multimodality. Lastly, we use a latent-state learning model that can adapt to a small database, for the atypical task of opinion analysis, and we show that it both adapts descriptors from the written domain to the spoken domain and serves as an attention layer through its clustering power. Because complex multimodal fusion is not handled well by the classifier used, and because audio has less impact on opinion than text, we study different methods of parameter selection to address these problems.
80

Dolování dat v prostředí sociálních sítí / Data Mining in Social Networks

Raška, Jiří January 2013
This thesis deals with knowledge discovery from social media, focusing on feature-based opinion mining from user reviews. The theoretical part describes methods of opinion mining and natural language processing. The main parts of the thesis are the design and implementation of a library for opinion mining based on the Stanford Parser and the WordNet lexicon. Dependency grammar was used for feature identification, implicit features were mined with the CoAR method, and opinions were classified with a supervised algorithm. Finally, experiments with the implemented library and examples of its usage are presented.
