61

Experimentos comparativos combinando aprendizado supervisionado e tradução automática para mineração de emoçoes em textos multilíngues / Comparative experiments combining supervised learning and machine translation for multilingual emotion mining

Santos, Aline Graciela Lermen dos January 2016 (has links)
With the growth of the Internet around the world, people have come to interact more and more with the Web, especially after the emergence of social networks, creating content that can be exploited in several ways. This increase in the number of users has been global: people from different countries have started producing texts in several languages. These texts are rich material for Multilingual Sentiment Analysis. Most work in the area focuses on Opinion Mining, analyzing sentiment through polarity. Another type of sentiment that has attracted attention is emotion, although it has not been extensively explored in Multilingual Sentiment Analysis. This work applies techniques commonly used for Opinion Mining and polarity to Multilingual Sentiment Analysis using emotion. The objective of this study is to compare different combinations of supervised machine learning and machine translation to create corpora in different languages from existing annotated corpora. Two ways of using the translations are compared: creating separate emotion classifiers for each language, called monolingual, and creating a single classifier composed of the original language and its translations, called multilingual. An additional experiment crosses the two corpora to evaluate the use of the translation of one corpus with the original texts of the other. The results of the experiments show not only that emotion can be analyzed successfully using supervised machine learning and machine translation, but also that the multilingual classifier outperforms the monolingual classifiers. The experiment crossing the corpora shows that for some emotions the corpora are aligned, but that for others there needs to be greater similarity in the texts.
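The difference between the monolingual and multilingual setups described in this abstract can be sketched as a data-assembly step. This is a minimal illustration, not the thesis's code: the function name, the `source_lang` default, and the assumption that translations arrive as lists aligned one-to-one with the annotated source texts are all hypothetical, and no translation service is called here.

```python
def build_training_sets(source_corpus, translations, source_lang="en"):
    """Assemble monolingual and multilingual training sets.

    source_corpus: list of (text, emotion) pairs in the source language.
    translations:  dict mapping a language code to a list of translated
                   texts, aligned index-by-index with source_corpus
                   (hypothetical input; produced by some MT system).
    """
    labels = [emotion for _, emotion in source_corpus]

    # One training set per language: translated texts inherit the
    # emotion label of the source text they were translated from.
    monolingual = {source_lang: list(source_corpus)}
    for lang, texts in translations.items():
        if len(texts) != len(labels):
            raise ValueError(f"translations for {lang!r} are not aligned")
        monolingual[lang] = list(zip(texts, labels))

    # A single training set pooling the original and all translations.
    multilingual = [ex for examples in monolingual.values() for ex in examples]
    return monolingual, multilingual
```

Each entry of `monolingual` would train its own classifier, while `multilingual` would train one classifier on the pooled examples.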
63

Inferring Aspect-Specific Opinion Structure in Product Reviews

Carter, David January 2015 (has links)
Identifying differing opinions on a given topic as expressed by multiple people (as in a set of written reviews for a given product, for example) presents challenges. Opinions about a particular subject are often nuanced: a person may have both negative and positive opinions about different aspects of the subject of interest, and these aspect-specific opinions can be independent of the overall opinion on the subject. Being able to identify, collect, and count these nuanced opinions in a large set of data offers more insight into the strengths and weaknesses of competing products and services than does aggregating the overall ratings of such products and services. I make two useful and useable contributions in working with opinionated text. First, I present my implementation of a semi-supervised co-training machine classification method for identifying both product aspects (features of products) and sentiments expressed about such aspects. It offers better precision than fully-supervised methods while requiring much less text to be manually tagged (a time-consuming process). This algorithm can also be run in a fully supervised manner when more data is available. Second, I apply this co-training approach to reviews of restaurants and various electronic devices; such text contains both factual statements and opinions about features/aspects of products. The algorithm automatically identifies the product aspects and the words that indicate aspect-specific opinion polarity, while largely avoiding the problem of misclassifying the products themselves as inherently positive or negative. This method performs well compared to other approaches. When run on a set of reviews of five technology products collected from Amazon, the system performed with some demonstrated competence (with an average precision of 0.83) at the difficult task of simultaneously identifying aspects and sentiments, though comparison to contemporaries' simpler rules-based approaches was difficult. 
When run on a set of opinionated sentences about laptops and restaurants that formed the basis of a shared challenge in the SemEval-2014 Task 4 competition, it was able to classify the sentiments expressed about aspects of laptops better than any team that competed in the task (achieving 0.72 accuracy). It was above the mean in its ability to identify the aspects of restaurants about which people expressed opinions, even when co-training using only half of the labelled training data at the outset. While the SemEval-2014 aspect-based sentiment extraction task considered only separately the tasks of identifying product aspects and determining their polarities, I take an extra step and evaluate sentences as a whole, inferring aspects and the aspect-specific sentiments expressed simultaneously, a more difficult task that seems more applicable to real-world tasks. I present first results of this sentence-level task. The algorithm uses both lexical and syntactic information in a manner that is shown to be able to handle new words that it has never before seen. It offers some demonstrated ability to adapt to new subject domains for which it has no training data. The system is characterizable by very high precision and weak-to-average recall and it estimates its own confidence in its predictions; this characteristic should make the algorithm suitable for use on its own or for combination in a confidence-based voting ensemble. The software created for and described in the course of this dissertation is made available online.
64

Reliable General Purpose Sentiment Analysis of the Public Twitter Stream

Haldenwang, Nils 27 September 2017 (has links)
General purpose Twitter sentiment analysis is a novel field that is closely related to traditional Twitter sentiment analysis but differs in some key aspects. The main difference is that the novel approach considers the unfiltered public Twitter stream, whereas most previous approaches applied various filtering steps that are not feasible for many applications. Another goal is to yield more reliable results by classifying a tweet as positive or negative only if it distinctly expresses the respective sentiment and marking the remaining messages as uncertain. Traditional approaches are often not that strict. In the course of this thesis it was verified that the novel approach differs significantly from the traditional approach. Moreover, the experimental results indicated that the archetypical approaches can be transferred to the new domain, but that related-domain data is consistently subpar compared to high-quality in-domain data. Finally, the viability of the best classification algorithm was qualitatively verified in a real-world setting that was also developed in the course of this thesis.
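One common way to label a tweet only when its sentiment is distinct is to abstain below a confidence threshold. The sketch below assumes a classifier that outputs a positive-class probability; the function name and the 0.8 threshold are hypothetical and are not taken from the thesis.

```python
def classify_with_abstention(p_positive, threshold=0.8):
    """Map a positive-class probability to a three-way decision.

    threshold is an illustrative value: a tweet is labelled positive or
    negative only when the classifier is at least this confident;
    everything in between is marked as uncertain.
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability")
    if not 0.5 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0.5, 1.0]")

    if p_positive >= threshold:
        return "positive"
    if p_positive <= 1.0 - threshold:
        return "negative"
    return "uncertain"
```

Raising the threshold trades coverage for reliability: fewer tweets receive a polarity label, but those that do are labelled with higher confidence.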
65

[en] MACHINE LEARNING FOR SENTIMENT CLASSIFICATION / [pt] APRENDIZADO DE MÁQUINA PARA O PROBLEMA DE SENTIMENT CLASSIFICATION

PEDRO OGURI 18 May 2007 (has links)
Sentiment Analysis is a text categorization problem in which we want to identify favorable and unfavorable opinions towards a given topic. Examples of such topics are organizations and their products. In this problem, documents are classified according to their sentiment, connotation, attitudes and opinions instead of being limited to the facts described in them. The main challenge in Sentiment Classification is identifying how sentiments are expressed in texts and whether they indicate a positive (favorable) or negative (unfavorable) opinion towards a topic. Due to the growing volume of information available online, in an environment where we all tend to be content generators and express opinions on a variety of subjects, Machine Learning techniques have become more and more attractive. In this dissertation, we investigate Machine Learning methods applied to Sentiment Analysis. We present document representation models such as bag-of-words and N-grams. We compare the performance of the Naive Bayes and Support Vector Machine (SVM) classifiers for each proposed model.
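As a rough illustration of the representations named above, the sketch below builds bag-of-words (n = 1) or N-gram features and trains a multinomial Naive Bayes classifier with add-one smoothing. It is a textbook formulation under simple assumptions (whitespace tokenization, lowercase text), not the dissertation's implementation; the SVM side of the comparison is not shown, and all names are illustrative.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n=1):
    """Return word n-grams; n=1 gives the bag-of-words tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels, n=1):
        self.n = n
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(ngrams(doc, n))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known feature.
            score = math.log(count / total_docs)
            total = sum(self.feature_counts[label].values())
            for f in ngrams(doc, self.n):
                if f in self.vocab:  # features unseen in training are skipped
                    p = (self.feature_counts[label][f] + 1) / (total + len(self.vocab))
                    score += math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Passing `n=2` to `fit` switches the same classifier from bag-of-words to bigram features, which is the kind of representation change the abstract compares.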
66

Sentiment Analysis & Time Series Analysis on Stock Market

Singh, Aniket Kumar 28 April 2023 (has links)
No description available.
67

Graph-based approaches for semi-supervised and cross-domain sentiment analysis

Ponomareva, Natalia January 2014 (has links)
The rapid development of Internet technologies has resulted in a sharp increase in the number of Internet users who create content online. User-generated content often represents people's opinions, thoughts, speculations and sentiments and is a valuable source of information for companies, organisations and individual users. This has led to the emergence of the field of sentiment analysis, which deals with the automatic extraction and classification of sentiments expressed in texts. Sentiment analysis has been intensively researched over the last ten years, but there are still many issues to be addressed. One of the main problems is the lack of labelled data necessary to carry out precise supervised sentiment classification. In response, research has moved towards developing semi-supervised and cross-domain techniques. Semi-supervised approaches still need some labelled data and their effectiveness is largely determined by the amount of these data, whereas cross-domain approaches usually perform poorly if training data are very different from test data. The majority of research on sentiment classification deals with the binary classification problem, although for many practical applications this rather coarse sentiment scale is not sufficient. Therefore, it is crucial to design methods which are able to perform accurate multiclass sentiment classification. The aims of this thesis are to address the problem of limited availability of data in sentiment analysis and to advance research in semi-supervised and cross-domain approaches for sentiment classification, considering both binary and multiclass sentiment scales. We adopt graph-based learning as our main method and explore the most popular and widely used graph-based algorithm, label propagation. 
We investigate various ways of designing sentiment graphs and propose a new similarity measure which is unsupervised, easy to compute, does not require deep linguistic analysis and, most importantly, provides a good estimate for sentiment similarity as proved by intrinsic and extrinsic evaluations. The main contribution of this thesis is the development and evaluation of a graph-based sentiment analysis system that a) can cope with the challenges of limited data availability by using semi-supervised and cross-domain approaches b) is able to perform multiclass classification and c) achieves highly accurate results which are superior to those of most state-of-the-art semi-supervised and cross-domain systems. We systematically analyse and compare semi-supervised and cross-domain approaches in the graph-based framework and propose recommendations for selecting the most pertinent learning approach given the data available. Our recommendations are based on two domain characteristics, domain similarity and domain complexity, which were shown to have a significant impact on semi-supervised and cross-domain performance.
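Label propagation, the graph-based algorithm this abstract builds on, can be sketched in a few lines. The version below is a generic iterative formulation with clamped seed labels, not the thesis's system: the graph format, the edge weights (standing in for a sentiment similarity measure) and the iteration count are assumptions made for illustration. Every neighbor must also appear as a key of `edges`.

```python
def label_propagation(edges, seeds, labels, iterations=50):
    """Propagate seed labels over a weighted, undirected graph.

    edges:  dict node -> dict of neighbor -> similarity weight.
    seeds:  dict node -> known label (these nodes stay clamped).
    labels: list of all possible labels.
    """
    # Seeds start with all mass on their label; others start uniform.
    dist = {}
    for node in edges:
        if node in seeds:
            dist[node] = {l: 1.0 if l == seeds[node] else 0.0 for l in labels}
        else:
            dist[node] = {l: 1.0 / len(labels) for l in labels}

    for _ in range(iterations):
        new = {}
        for node, nbrs in edges.items():
            total = sum(nbrs.values())
            if node in seeds or total == 0:
                new[node] = dist[node]  # clamped or isolated node
                continue
            # Weighted average of the neighbors' label distributions.
            new[node] = {
                l: sum(w * dist[m][l] for m, w in nbrs.items()) / total
                for l in labels
            }
        dist = new

    return {node: max(labels, key=lambda l: dist[node][l]) for node in edges}
```

In a semi-supervised setting the seeds would be the few labelled documents; in a cross-domain setting they would be documents from the source domain, with target-domain documents left unlabelled.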
68

Probabilistic topic models for sentiment analysis on the Web

Chenghua, Lin January 2011 (has links)
Sentiment analysis aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text, and has received a rapid growth of interest in natural language processing in recent years. Probabilistic topic models, on the other hand, are capable of discovering hidden thematic structure in large archives of documents, and have been an active research area in the field of information retrieval. The work in this thesis focuses on developing topic models for automatic sentiment analysis of web data, by combining the ideas from both research domains. One noticeable issue of most previous work in sentiment analysis is that the trained classifier is domain dependent, and the labelled corpora required for training could be difficult to acquire in real world applications. Another issue is that the dependencies between sentiment/subjectivity and topics are not taken into consideration. The main contribution of this thesis is therefore the introduction of three probabilistic topic models, which address the above concerns by modelling sentiment/subjectivity and topic simultaneously. The first model is called the joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. Unlike supervised approaches to sentiment classification which often fail to produce satisfactory performance when applied to new domains, the weakly-supervised nature of JST makes it highly portable to other domains, where the only supervision information required is a domain-independent sentiment lexicon. Apart from document-level sentiment classification results, JST can also extract sentiment-bearing topics automatically, which is a distinct feature compared to the existing sentiment analysis approaches. The second model is a dynamic version of JST called the dynamic joint sentiment-topic (dJST) model. 
dJST respects the ordering of documents, and allows the analysis of topic and sentiment evolution of document archives that are collected over a long time span. By accounting for the historical dependencies of documents from the past epochs in the generative process, dJST gives a richer posterior topical structure than JST, and can better respond to the permutations of topic prominence. We also derive online inference procedures based on a stochastic EM algorithm for efficiently updating the model parameters. The third model is called the subjectivity detection LDA (subjLDA) model for sentence-level subjectivity detection. Two sets of latent variables were introduced in subjLDA. One is the subjectivity label for each sentence; another is the sentiment label for each word token. By viewing the subjectivity detection problem as weakly-supervised generative model learning, subjLDA significantly outperforms the baseline and is comparable to the supervised approach which relies on much larger amounts of data for training. These models have been evaluated on real world datasets, demonstrating that joint sentiment topic modelling is indeed an important and useful research area with much to offer in the way of good results.
69

Diseño e implementación de un sistema para la clasificación de tweets según su polaridad / Design and implementation of a system for classifying tweets by polarity

Tapia Caro, Pablo Andrés January 2014 (has links)
Industrial Civil Engineer / The high penetration of Twitter in Chile has led companies, politicians and organizations to use this social network as a means of obtaining additional information about users' opinions of their products, their services or themselves. Because comments on Twitter are public by default, they can be analyzed to extract actionable information. Companies in particular, besides being interested in quantitative information, want to know the polarity of these mentions, since an increase in the number of comments may be due to a larger number of either positive or negative mentions. Although a considerable number of software packages include sentiment polarity detection, they are of little use here, because the way Chilean users interact with the platform is full of idioms specific to the local language and of abbreviations that are mainly due to Twitter's character limit. Because this is an immature industry in Chile, sentiment polarity detection is being done manually by advertising agencies and other kinds of companies, but given the large number of comments produced every minute, the task is very demanding in time and money. To solve this kind of problem, machine learning techniques are used to train an algorithm that can then determine whether a comment is positive, negative or neutral, a field known as sentiment analysis. The more data is processed to train the algorithm, the better the classifier performs, and since comments are easy to obtain from Twitter through its API, unlike from the web in general, techniques have been formulated to automatically generate the corpus containing the training tweets for each class and thus take advantage of this property. This work extends the use of a semi-automatic, emoticon-based methodology for generating a corpus of tweets for sentiment polarity detection on Twitter. It does so by introducing a new approach to consolidating the training data through filters that improve the automatic labelling. This helps prevent erratic comments that cause noise in the training and classification phases. In addition, a new class of tweets not previously considered is introduced, consisting of tweets that lack sufficient information to be classified as positive, negative or neutral, so that assigning them to any of these classes lowers the precision of the system. Experimental evaluations showed that using this fourth class, called irrelevant, together with the presented filtering criterion for generating the corpus improves the performance of the system. It was also shown experimentally that a corpus generated from Chilean tweets classifies comments from local users better.
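Emoticon-based automatic labelling with filtering can be sketched as follows. The emoticon lists, the minimum-length filter and the rule that discards tweets with conflicting emoticons are hypothetical examples of such filters; the abstract does not specify which filters the thesis actually uses.

```python
POSITIVE = (":)", ":-)", ":D")   # illustrative emoticon lists
NEGATIVE = (":(", ":-(", ":'(")


def emoticon_label(tweet, min_tokens=3):
    """Return (label, cleaned_text) for a tweet, or None if it is filtered out.

    Hypothetical filters: drop tweets with no emoticon, tweets with
    conflicting emoticons, and tweets too short once emoticons are removed.
    """
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:  # neither kind, or both kinds: ambiguous
        return None

    # Remove the emoticons so a classifier cannot simply memorize them.
    text = tweet
    for e in POSITIVE + NEGATIVE:
        text = text.replace(e, " ")
    tokens = text.split()
    if len(tokens) < min_tokens:
        return None

    return ("positive" if has_pos else "negative", " ".join(tokens))
```

Tweets rejected by filters like these are the kind of low-information messages that a fourth "irrelevant" class is meant to absorb at classification time.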
70

Towards a science of human stories: using sentiment analysis and emotional arcs to understand the building blocks of complex social systems

Reagan, Andrew James 01 January 2017 (has links)
We can leverage data and complex systems science to better understand society and human nature on a population scale through language, utilizing tools that include sentiment analysis, machine learning, and data visualization. Data-driven science and the sociotechnical systems that we use every day are enabling a transformation from hypothesis-driven, reductionist methodology to complex systems sciences. Namely, the emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, with profound implications for our understanding of human behavior. Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a "big data" lens. Given the growing assortment of sentiment measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts. Most importantly they can aid understanding of texts with reliable and meaningful word shift graphs if (1) the dictionary covers a sufficiently large portion of a given text's lexicon when weighted by word usage frequency; and (2) words are scored on a continuous scale. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories, forming patterns that are meaningful to us. 
By classifying the emotional arcs for a filtered subset of 4,803 stories from Project Gutenberg's fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives. We strengthen our findings by separately applying optimization, linear decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads. Within stories lie the core values of social behavior, rich with both strategies and proper protocol, which we can begin to study more broadly and systematically as a true reflection of culture. Of profound scientific interest will be the degree to which we can eventually understand the full landscape of human stories, and data driven approaches will play a crucial role. Finally, we utilize web-scale data from Twitter to study the limits of what social data can tell us about public health, mental illness, discourse around the protest movement of #BlackLivesMatter, discourse around climate change, and hidden networks. We conclude with a review of published works in complex systems that separately analyze charitable donations, the happiness of words in 10 languages, 100 years of daily temperature data across the United States, and Australian Rules Football games.
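A dictionary-based method of the kind evaluated here can be sketched as a frequency-weighted average of word scores, reported together with the share of the text the dictionary covers; sliding the same score over a text gives a crude emotional arc. The lexicon, window and step below are hypothetical inputs, and the functions are an illustration rather than the dissertation's code.

```python
from collections import Counter


def dictionary_sentiment(tokens, lexicon):
    """Return (score, coverage) for a list of tokens.

    score:    average lexicon value of covered words, weighted by how
              often each word occurs in the text.
    coverage: fraction of all tokens that the lexicon covers.
    lexicon:  dict word -> score on a continuous scale (hypothetical).
    """
    counts = Counter(tokens)
    covered = {w: c for w, c in counts.items() if w in lexicon}
    n_covered = sum(covered.values())
    if n_covered == 0:
        return None, 0.0
    score = sum(lexicon[w] * c for w, c in covered.items()) / n_covered
    return score, n_covered / len(tokens)


def emotional_arc(tokens, lexicon, window, step):
    """Score consecutive windows of a text to trace sentiment over time."""
    arc = []
    for start in range(0, len(tokens) - window + 1, step):
        score, _ = dictionary_sentiment(tokens[start:start + window], lexicon)
        arc.append(score)
    return arc
```

A low coverage value is a warning sign that the score rests on too few words, which is consistent with the abstract's point that such methods suit longer texts better than single sentences.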
