111 |
Análise de sentimentos em tíquetes para o suporte de TI / Sentiment Analysis in Tickets for IT Support
Blaz, Cássio Castaldi Araújo January 2017
Sentiment Analysis/Opinion Mining has been adopted in software engineering for problems such as software usability and the sentiment of developers in projects. This work proposes methods to evaluate the sentiment contained in tickets opened with IT (Information Technology) support. IT tickets are broad in coverage (e.g., infrastructure, software) and involve errors, incidents, requests, etc. The main challenge is to automatically distinguish the factual need itself, which is intrinsically negative (e.g., an error description), from the sentiment embedded in the description. Our approach automatically creates a domain dictionary that contains terms expressing sentiment in the IT context, which is used to filter expressions in a ticket for sentiment analysis. We created and evaluated three classification methods for calculating the polarity of tickets. Our study used 34,895 tickets from five organizations. For polarity, we randomly selected 2,333 tickets to compose a gold standard. Our best results show an average precision and recall of 82.83% and 88.42%, respectively, which outperforms the other sentiment analysis solutions compared. In addition, emotions in tickets were studied using the Ekman and VAD models. One of the three classification methods was adapted to also identify emotions in tickets. Possible correlations between polarity and emotions were checked through association rules. The results correlate positive tickets with high valence and dominance and low arousal, along with the presence of joy and surprise and the absence of fear. Negative tickets correlate with neutral valence, arousal, and dominance, along with the absence of joy and the presence of fear. However, the results for negative polarity are not precise.
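The dictionary-filtering idea and the reported precision and recall can be illustrated with a minimal sketch. It is not the thesis's method: the dictionary entries, the function names `ticket_polarity` and `precision_recall`, and the summing rule are all assumptions made for illustration, since the abstract does not describe how the dictionary is built or how the three classifiers work.

```python
# Hypothetical domain dictionary: term -> polarity weight. The thesis builds
# its dictionary automatically; these entries are invented for illustration.
DOMAIN_DICTIONARY = {
    "thanks": 1, "great": 1, "appreciate": 1,
    "unacceptable": -1, "frustrated": -1,
}


def ticket_polarity(text):
    """Score only words found in the dictionary, so factual problem words
    such as 'error' or 'crash' do not count as negative sentiment."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    score = sum(DOMAIN_DICTIONARY.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def precision_recall(gold, predicted, label):
    """Precision and recall for one label against a gold standard."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


tickets = [
    "The server shows an error, thanks for the great help!",
    "Printer crash. This is unacceptable, I am frustrated.",
    "Password reset request for user account.",
]
predicted = [ticket_polarity(t) for t in tickets]
```

In this toy run the first ticket is scored positive even though it mentions an error, which is the distinction the abstract describes between the need itself and the sentiment expressed about it.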
|
112 |
Event Analytics on Social Media: Challenges and Solutions
January 2014
abstract: Social media platforms such as Twitter, Facebook, and blogs have emerged as valuable
- in fact, the de facto - virtual town halls for people to discover, report, share and
communicate with others about various types of events. These events range from
widely known events such as the U.S. presidential debate to smaller-scale, local events
such as a local Halloween block party. During these events, we often witness a large
amount of commentary contributed by crowds on social media. This burst of social
media responses surges with the "second-screen" behavior and greatly enriches the
user experience when interacting with the event and people's awareness of an event.
Monitoring and analyzing this rich and continuous flow of user-generated content can
yield unprecedentedly valuable information about the event, since these responses
usually offer far richer and more powerful views of the event than mainstream news
can provide. Despite these benefits, social media also tends to be noisy,
chaotic, and overwhelming, posing challenges to users in seeking and distilling high
quality content from that noise.
In this dissertation, I explore ways to leverage social media as a source of information and to analyze events collectively based on their social media responses. I develop, implement, and evaluate EventRadar, an event analysis toolbox that can identify, enrich, and characterize events using massive amounts of social media responses. EventRadar contains three automated, scalable tools to handle three core event analysis tasks: Event Characterization, Event Recognition, and Event Enrichment. More specifically, I develop ET-LDA, a Bayesian model, and SocSent, a matrix factorization framework, for handling the Event Characterization task, i.e., characterizing an event in terms of its topics and its audience's response behavior (via ET-LDA) and the sentiments regarding its topics (via SocSent). I also develop DeMa, an unsupervised event detection algorithm, for handling the Event Recognition task, i.e., detecting trending events from a stream of noisy social media posts. Last, I develop CrowdX, a spatial crowdsourcing system, for handling the Event Enrichment task, i.e., gathering additional first-hand information (e.g., photos) from the field to enrich the given event's context.
Enabled by EventRadar, it becomes more feasible to uncover patterns that have not been
explored previously and to revalidate existing social theories with new evidence. As a
result, I am able to gain deep insights into how people respond to the events that they
are engaged in. The results reveal several key insights into people's varied response
behavior over an event's timeline, such as that the topical context of people's tweets does not
always correlate with the timeline of the event. In addition, I also explore the factors
that affect a person's engagement with real-world events on Twitter and find that
people engage in an event because they are interested in the topics pertaining to
that event; and while engaging, their engagement is largely affected by their friends'
behavior. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2014
|
113 |
Mining Signed Social Networks Using Unsupervised Learning Algorithms
January 2017
abstract: Due to the vast resources brought by social media services, social data mining has
received increasing attention in recent years. The availability of large amounts of
user-generated data presents data scientists with both opportunities and challenges. Opportunities arise from the additional data sources. The abundant link information
in social networks could provide another rich source for deriving implicit information
for social data mining. However, the vast majority of existing studies focus
on positive links between users, while negative links are also prevalent in real-world
social networks, such as distrust relations in Epinions and foe links in Slashdot.
Though recent studies show that negative links have some added value over positive
links, it is difficult to employ them directly because their characteristics differ from those of
positive interactions. Another challenge is that label information is rather limited
in social media as the labeling process requires human attention and may be very
expensive. Hence, alternative criteria are needed to guide the learning process for
many tasks such as feature selection and sentiment analysis.
To address the above-mentioned issues, I study two novel problems in signed social
network mining: (1) unsupervised feature selection in signed social networks; and
(2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework, SignedFS. In
particular, I model positive and negative links simultaneously for user preference
learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and
implicit sentiment signals from signed social networks into a coherent model, SignedSenti.
Empirical experiments on real-world datasets corroborate the effectiveness of
these two frameworks on the tasks of feature selection and sentiment analysis. / Dissertation/Thesis / Masters Thesis Computer Science 2017
|
114 |
Developing a Recurrent Neural Network with High Accuracy for Binary Sentiment Analysis
Cunanan, Kevin 01 January 2018
Various machine learning approaches have been applied to sentiment analysis in order to optimize accuracy, precision, and recall. Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs), however, account for the context of a sentence by using previous predictions as additional input for future sentence predictions. Our approach focused on developing an LSTM RNN that could perform binary sentiment analysis for positively and negatively labeled sentences. In collaboration with Mariam Salloum, I developed a collection of programs to classify individual sentences as either positive or negative. This paper additionally looks into machine learning, neural networks, data preprocessing, implementation, and resulting comparisons.
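How an LSTM carries context from one step to the next can be sketched with a single hidden unit in plain Python. This is a toy illustration of the standard LSTM gate equations, not the network developed in the thesis; the scalar input, the parameter names (`w_i`, `u_i`, `b_i`, and so on), and the output layer are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pre_activation(name, x, h_prev, params):
    # Weighted sum of the current input and the previous hidden state.
    return params["w_" + name] * x + params["u_" + name] * h_prev + params["b_" + name]


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM with a scalar input."""
    i = sigmoid(pre_activation("i", x, h_prev, params))    # input gate
    f = sigmoid(pre_activation("f", x, h_prev, params))    # forget gate
    o = sigmoid(pre_activation("o", x, h_prev, params))    # output gate
    g = math.tanh(pre_activation("g", x, h_prev, params))  # candidate value
    c = f * c_prev + i * g   # new cell state: keep some memory, add some input
    h = o * math.tanh(c)     # new hidden state, fed back in at the next step
    return h, c


def sentence_score(inputs, params):
    """Run the unit over a sequence and map the final state to (0, 1)."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
    # Hypothetical output layer: scores above 0.5 would be read as positive.
    return sigmoid(params["w_out"] * h + params["b_out"])
```

A real model would use vector-valued word embeddings as inputs, many hidden units, and weights learned from labeled sentences; only the recurrence is the same.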
|
115 |
Detecção não supervisionada de posicionamento em textos de tweets / Unsupervised stance detection in texts of tweets
Dias, Marcelo dos Santos January 2017
Stance detection is the task of automatically identifying whether the author of a text is in favor of, against, or neither in favor of nor against a given proposition or target. With the wide use of Twitter as a platform to express opinions and stances, the automated analysis of this content is of great value to companies, organizations, and public figures. In general, works that explore this task adopt supervised or semi-supervised approaches. The present work proposes and evaluates an unsupervised process for detecting stance in tweets that takes as input only the target and a set of tweets to label, and is based on a hybrid approach composed of two stages: (a) automatic labeling of tweets based on a set of heuristics and (b) complementary classification based on supervised machine learning. The proposal succeeds when applied to public figures, outperforming the state of the art. In addition, alternatives are evaluated with the aim of improving performance in other domains, revealing the possibility of employing strategies such as seed targets and seed profiles, depending on the characteristics of each domain.
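The two-stage idea can be sketched as follows: heuristics label the tweets they are sure about, and a classifier trained on those automatic labels handles the rest, so no human annotation is needed. The hashtag cues, the function names, and the choice of a small naive Bayes classifier are assumptions for illustration; the abstract does not state which heuristics or which learning algorithm the work uses.

```python
import math
from collections import Counter, defaultdict

# Hypothetical cue hashtags for an invented target named "alice".
FAVOR_CUES = {"#teamalice", "#votealice"}
AGAINST_CUES = {"#neveralice", "#stopalice"}


def heuristic_label(tweet):
    """Stage (a): label a tweet only when the cues are unambiguous."""
    tokens = set(tweet.lower().split())
    favor = bool(tokens & FAVOR_CUES)
    against = bool(tokens & AGAINST_CUES)
    if favor and not against:
        return "favor"
    if against and not favor:
        return "against"
    return None


def train_naive_bayes(labeled):
    """Count words per class from the automatically labeled tweets."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for tweet, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tweet.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def classify(tweet, model):
    """Stage (b): multinomial naive Bayes with add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tweet.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def detect_stance(tweets):
    seeds = [(t, heuristic_label(t)) for t in tweets]
    model = train_naive_bayes([(t, l) for t, l in seeds if l is not None])
    return [l if l is not None else classify(t, model) for t, l in seeds]


tweets = [
    "alice has a strong plan #votealice",
    "strong leadership from alice #teamalice",
    "alice lied about taxes #neveralice",
    "more lies and broken promises #stopalice",
    "such a strong plan for the country",
    "tired of the lies",
]
stances = detect_stance(tweets)
```

The last two tweets carry no cue hashtag, so they are labeled by the classifier from the vocabulary it learned in the first stage.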
|
117 |
Análise de sentimentos baseada em aspectos e atribuições de polaridade / Aspect-based sentiment analysis and polarity assignment
Kauer, Anderson Uilian January 2016
With the growing expansion of the Web, more and more users share their opinions about their experiences. These opinions are, in most cases, expressed as unstructured text. Sentiment Analysis (or Opinion Mining) is the research area dedicated to the computational study of the opinions and sentiments expressed in texts, typically classifying them according to their polarity (i.e., as positive or negative). As online sales and social networking sites become major sources of opinions, there is a growing demand for tools that automatically classify opinions and identify which aspect of the evaluated entity they refer to. In this work, we propose methods aimed at two key points in the treatment of such opinions: (i) aspect-based sentiment analysis and (ii) polarity assignment. For aspect-based sentiment analysis, we developed a method that identifies expressions mentioning aspects and entities in a text, using natural language processing tools combined with machine learning algorithms. For polarity assignment, we developed a method that uses 24 attributes extracted from the ranking generated by a search engine to build machine learning models. Furthermore, the method does not rely on linguistic resources and can be applied to noisy data. Experiments on real datasets show that, for both contributions, we achieved results close to those of the baselines even with a small number of attributes. Moreover, for polarity assignment, the results are comparable to those of state-of-the-art methods that use more complex techniques.
|
118 |
Mineração de opiniões em aspectos em fontes de opiniões fracamente estruturadas / Aspect-based opinion mining in weakly structured opinion sources
Sápiras, Leonardo Augusto January 2015
On the Web, there are posts about a wide variety of subjects, such as celebrity news, products, and services. Such content carries positive, negative, or neutral emotions. The population's sentiment about election candidates and their aspects in online media can be mined using Opinion Mining techniques. Solutions exist for highly structured opinion sources, such as reviews of products and services; the problem addressed here, however, is how to perform aspect-level opinion mining in weakly structured opinion sources. Besides evaluating concepts related to opinion mining, this work describes a case study that analyzes weakly structured opinion sources and proposes an approach to mine opinions at the aspect level, using newspaper readers' comments as opinion sources. The case study contributes (i) the design of an approach to identify aspect-level opinions about electoral entities in comments on political news, (ii) the application of a machine learning-based method to classify opinions about entities and their aspects into three classes (positive, negative, and neutral), and (iii) a visual summarization of opinions about entities and their aspects. Experiments are described that identify comments mentioning the aspects of health and education using co-occurrence, in which satisfactory results were obtained with the Expected Mutual Information Measure and phi-squared techniques. For sentence polarity classification, experiments are performed with two approaches: one that classifies sentences into three classes and another that performs binary classifications in two stages.
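Both co-occurrence measures named above can be computed from a 2x2 table that counts how often a candidate term and an aspect word appear together in comments. The sketch below uses the textbook definitions; the function names, the whitespace tokenization, and the use of the full four-cell sum for the Expected Mutual Information Measure are assumptions, since the abstract does not give the exact formulation used in the study.

```python
import math


def cooccurrence_counts(comments, term, aspect):
    """Return (n11, n10, n01, n00): both present, term only, aspect only, neither."""
    n11 = n10 = n01 = n00 = 0
    for comment in comments:
        tokens = set(comment.lower().split())
        has_term, has_aspect = term in tokens, aspect in tokens
        if has_term and has_aspect:
            n11 += 1
        elif has_term:
            n10 += 1
        elif has_aspect:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def phi_squared(n11, n10, n01, n00):
    """Squared phi coefficient: 0 means independent, 1 means perfectly associated."""
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0
    return (n11 * n00 - n10 * n01) ** 2 / denom


def emim(n11, n10, n01, n00):
    """Expected mutual information (in bits) over the four cells of the table."""
    n = n11 + n10 + n01 + n00
    if n == 0:
        return 0.0
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    total = 0.0
    for joint, row, col in cells:
        if joint > 0:  # empty cells contribute nothing to the sum
            total += (joint / n) * math.log2(joint * n / (row * col))
    return total


comments = [
    "hospital health queue",
    "health hospital doctors",
    "school education teachers",
    "taxes roads",
]
table = cooccurrence_counts(comments, "hospital", "health")
```

Terms with high scores against an aspect word such as "health" would then be candidates for recognizing comments about that aspect. Both measures are symmetric, so a strong negative association also scores high and would need a separate sign check.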
|
119 |
A text-mining based approach to capturing the NHS patient experience
Bahja, Mohammed January 2017
An important issue for healthcare service providers is to achieve high levels of patient satisfaction. Collecting patient feedback about their experience in hospital enables providers to analyse their performance in terms of the levels of satisfaction and to identify the strengths and limitations of their service delivery. A common method of collecting patient feedback is via online portals and the forums of the service provider, where the patients can rate and comment about the service received. A challenge in analysing patient experience collected via online portals is that the amount of data can be huge and hence, prohibitive to analyse manually. In this thesis, an automated approach to patient experience analysis via Sentiment Analysis, Topic Modelling, and Dependency Parsing methods is presented. The patient experience data collected from the National Health Service (NHS) online portal in the United Kingdom is analysed in the study to understand this experience. The study was carried out in three iterations: (1) In the first, the Sentiment Analysis method was applied, which identified whether a given patient feedback item was positive or negative. (2) The second iteration involved applying Topic Modelling methods to identify automatically themes and topics from the patient feedback. Further, the outcomes of the Sentiment Analysis study from the first iteration were utilised to identify the patient sentiment regarding the topic being discussed in a given comment. (3) In the third iteration of the study, Dependency Parsing methods were employed for each patient feedback item and the topics identified. A method was devised to summarise the reason for a particular sentiment about each of the identified topics. The outcomes of the study demonstrate that text-mining methods can be effectively utilised to identify patients’ sentiment in their feedback as well as to identify the themes and topics discussed in it. 
The approach presented in the study was proven capable of effectively automatically analysing the NHS patient feedback database. Specifically, it can provide an overview of the positive and negative sentiment rate, identify the frequently discussed topics and summarise individual patient feedback items. Moreover, an API visualisation tool is introduced to make the outcomes more accessible to the health care providers.
|
120 |
Tweeting opinions: How does Twitter data stack up against the polls and betting odds?
Karlsson, Beppe January 2018
With the rise of social media, people have gained a platform to express opinions and discuss current subjects with others. This thesis investigates whether a simple sentiment analysis (determining how positive a tweet about a given party is) can be used to predict the results of the Swedish general election, and compares the results to betting odds and opinion polls. The results show that while the idea is an interesting one, and the data can sometimes point in the right direction, it is far from a reliable source for predicting election outcomes.
|