Global ETD Search

121	Word Frequency as a Predictor of Word Intensity
Padilla López, Rebeca. January 2017
In this thesis we explore the intensity of adjectives and how it can be predicted from different word features. We investigate how to determine relative intensity among synonymous adjectives accurately, looking at features such as word frequency, number of senses and syllable length. Our study is inspired by life satisfaction and happiness surveys, and by the possibility that differences in intensity among the translated adjectives used in the questionnaires could explain the high degree of satisfaction that some countries report. We base our hypothesis on the theories of grammaticalization and semantic bleaching and on findings by other researchers about the relations between these word features and word intensity. We focus on Danish, English and French. Our study points to a statistically significant negative correlation between word frequency and word intensity.
word frequency; intensity; sentiment analysis; General Language Studies and Linguistics
122	Mineração de opiniões em aspectos em fontes de opiniões fracamente estruturadas / Aspect-based opinion mining in weakly structured opinion sources
Sápiras, Leonardo Augusto. January 2015
The Web hosts posts on a wide range of subjects, such as celebrity news, products and services. Such content carries positive, negative or neutral emotions. Public sentiment about election candidates and their aspects, as expressed in online media, can be mined with Opinion Mining techniques. Solutions exist for highly structured opinion sources, such as product and service reviews; the open problem is how to perform aspect-level opinion mining on weakly structured opinion sources. Besides evaluating concepts related to opinion mining, this work describes a case study that analyzes weakly structured opinion sources and proposes an approach for mining aspect-level opinions, using newspaper readers' comments as the opinion source. The case study contributes (i) the design of an approach for identifying aspect-level opinion about electoral entities in comments on political news, (ii) the application of a machine-learning-based method to classify opinion about entities and their aspects into three classes (positive, negative and neutral), and (iii) a visual summarization of opinion about entities and their aspects. Experiments are described that identify comments mentioning the aspects health and education using co-occurrence; satisfactory results were obtained with the Expected Mutual Information Measure and phi-squared techniques. For sentence polarity, experiments are performed with two classification approaches: one that classifies sentences into three classes, and another that performs binary classifications in two stages.
Data mining; Information systems; Electoral systems; Opinion mining; Sentiment analysis; Aspect identification; Opinion summarization
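As an illustration of the co-occurrence measures named in this abstract, the sketch below computes phi-squared and the Expected Mutual Information Measure from a 2x2 contingency table of term and aspect-seed occurrences. The function names, the tokenized toy comments and the choice of a single seed word per aspect are assumptions made for illustration; the abstract does not describe the thesis's actual implementation.

```python
import math


def contingency(comments, term, aspect_seed):
    """Count how often `term` and `aspect_seed` co-occur across tokenized comments.

    Returns (a, b, c, d): both present, term only, seed only, neither.
    """
    a = b = c = d = 0
    for tokens in comments:
        has_term = term in tokens
        has_seed = aspect_seed in tokens
        if has_term and has_seed:
            a += 1
        elif has_term:
            b += 1
        elif has_seed:
            c += 1
        else:
            d += 1
    return a, b, c, d


def phi_squared(a, b, c, d):
    """phi^2 = (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)); 0.0 if a margin is empty."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else (a * d - b * c) ** 2 / denom


def emim(a, b, c, d):
    """Expected mutual information (in bits) over the four cells of the table."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    total = 0.0
    for joint, row, col in cells:
        if joint:  # empty cells contribute nothing
            total += (joint / n) * math.log2(joint * n / (row * col))
    return total


# Hypothetical toy comments, already tokenized into sets of words.
comments = [
    {"hospital", "health"},
    {"hospital", "health"},
    {"school"},
    {"football"},
]
table = contingency(comments, "hospital", "health")
```

Terms whose scores against an aspect seed exceed a chosen cut-off could then be treated as indicators of that aspect; the cut-off itself would have to be tuned on real data.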
123	Intangible Costs of Data Breach Events
Sinanaj, Griselda. 17 October 2017
No description available.
330; Data breaches; Abnormal returns; Event study; Corporate reputation; Sentiment analysis; Wirtschaftswissenschaften (PPN621567140)
124	An Ontology Based Sentiment Analysis: A Case Study
Haider, Syed Zeeshan. January 2012
E-commerce has recently become popular, owing to the massive amount of information available on the internet. This has produced a very large number of reviews on websites such as www.amazon.com and www.ebay.com, where customers express opinions about their purchases. Analyzing customer behavior has become very important for organizations seeking new market trends and insights. For a potential customer, the sheer number of reviews makes it difficult to learn about a product, pick out the useful reviews and make a good decision. The reviews on these websites are heterogeneous, i.e. both structured and unstructured, and need to be stored in a consistent format. Since a good decision requires quality information within a limited amount of time, Yaakub et al. (2011) proposed an ontology that uses a multidimensional model to integrate customers' characteristics and their comments about products. Their approach first identifies the entities; the sentiments present in customer reviews of mobile phones are then transformed into an attribute table using a 7-point polarity scale (-3 to 3). The research proposed by Yaakub et al. (2011) is still at a developing stage. A limitation of their approach is that the proposed ontology is too general, and the authors themselves state that it should be tested on a larger group of products. Moreover, Yaakub et al. (2011) used very short and simple comments for the manual extraction of the features about which a sentiment is expressed, whereas comments on e-commerce websites are usually not that short and simple.
To fulfil the aim of this thesis project, a case study was conducted on www.amazon.com and www.ebay.com, and the ontology proposed by Yaakub et al. (2011) was refined for three categories of mobile phones: smart phones, wet and dirty mobile phones, and simple mobile phones. Sentiment analysis was then conducted first with the ontology proposed by Yaakub et al. (2011) and then with the refined ontologies for the three categories, in order to compare the results.
Opinion analysis; Sentiment analysis; Ontologies; Opinion mining; Computer Sciences; Datavetenskap (datalogi)
125	Analytics and Healthcare Costs (A Three Essay Dissertation)
Bouayad, Lina. 01 January 2015
Both literature and practice have looked at different strategies to reduce healthcare-associated costs. As an extension of this stream of research, the present three-paper dissertation addresses the issue of reducing elevated healthcare costs using analytics. The first paper looks at extending the benefits of auditing algorithms from mere detection of fraudulent providers to maximizing deterrence of inappropriate behavior. Using the structure of the physicians' network, a new auditing algorithm is developed. The algorithm is evaluated using an agent-based simulation and an analytical model. A case study is also included to illustrate the application of the algorithm in the warranty domain. The second paper relies on experimental data to build a personalized medical recommender system geared towards reinforcing price-sensitive prescription behavior. The study analyzes the impact of time pressure, procedure cost, and prescription prevalence/popularity on physicians' use of the system's recommendations. The third paper investigates the relationship between patients' compliance and healthcare costs. The study includes a survey of the literature along with a longitudinal analysis of patients' data to determine the factors leading to patients' non-compliance and ways to alleviate it.
Auditing Algorithms; Sentinel Effect; Time Pressure; Recommender Systems; Patient Compliance; Sentiment Analysis; Business
126	Contextual lexicon-based sentiment analysis for social media
Muhammad, Aminu. January 2016
Sentiment analysis concerns the computational study of opinions expressed in text. Social media domains provide a wealth of opinionated data, thus creating a greater need for sentiment analysis. Sentiment lexicons, which capture knowledge of term-sentiment associations, are commonly used to develop sentiment analysis systems. However, the nature of social media content calls for analysis methods and knowledge sources that are better able to adapt to changing vocabulary. Existing sentiment lexicon knowledge cannot usefully handle social media vocabulary, which is typically informal and changeable yet rich in sentiment. This, in turn, has implications for the analyser's ability to capture the context effectively and to interpret sentiment polarity from the lexicons. In this thesis we use SentiWordNet, a popular sentiment-rich lexicon with substantial vocabulary coverage, and explore how to adapt it for social media sentiment analysis. Firstly, the thesis identifies a set of strategies for incorporating the effect of modifiers on sentiment-bearing terms (local context). These modifiers include contextual valence shifters, non-lexical sentiment modifiers typical of social media, and discourse structures. Secondly, the thesis introduces an approach in which a domain-specific lexicon is generated using a distant supervision method and integrated with a general-purpose lexicon, using a weighted strategy, to form a hybrid (domain-adapted) lexicon. This has the dual purpose of enriching the term coverage of the general-purpose lexicon with non-standard but sentiment-rich terms and of adjusting the sentiment semantics of terms. Here, we identify two term-sentiment association metrics based on Term Frequency and Inverse Document Frequency that outperform the state-of-the-art Point-wise Mutual Information on social media data.
As distant supervision may not be readily applicable to some social media domains, we explore the cross-domain transferability of a hybrid lexicon. Thirdly, we introduce an approach for improving distant-supervised sentiment classification with knowledge from local context analysis, domain-adapted (hybrid) lexicons and emotion lexicons. Finally, we conduct a comprehensive evaluation of all identified approaches using six sentiment-rich social media datasets.
302.23
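To make the idea of a weighted hybrid lexicon with local-context modifiers more concrete, the following sketch blends a general-purpose score and a domain-specific score per term and then applies simple negation and intensification. Everything here is assumed for illustration: the linear blending formula, the modifier word lists, the 1.5 multiplier and the toy lexicon values are not taken from the thesis or from SentiWordNet.

```python
# Hypothetical modifier lists; a real system would use far richer resources.
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}


def hybrid_score(term, general, domain, weight=0.5):
    """Blend two lexicon scores; `weight` is the share given to the domain lexicon.

    A term found in only one lexicon keeps that lexicon's score; unknown terms score 0.
    """
    g = general.get(term)
    d = domain.get(term)
    if g is None and d is None:
        return 0.0
    if g is None:
        return d
    if d is None:
        return g
    return weight * d + (1 - weight) * g


def score_text(tokens, general, domain, weight=0.5):
    """Sum hybrid scores, letting modifiers act on the next sentiment-bearing term."""
    total = 0.0
    flip, boost = False, 1.0
    for tok in tokens:
        if tok in NEGATORS:
            flip = True
            continue
        if tok in INTENSIFIERS:
            boost *= INTENSIFIERS[tok]
            continue
        score = hybrid_score(tok, general, domain, weight)
        if score != 0.0:
            total += (-score if flip else score) * boost
            flip, boost = False, 1.0  # modifiers are consumed by this term
    return total


# Toy lexicons: "sick" is negative in general use but positive in the assumed domain.
general_lexicon = {"good": 0.6, "sick": -0.5}
domain_lexicon = {"sick": 0.8, "lit": 0.7}
```

Raising `weight` moves the blended score of a term such as "sick" towards its domain-specific polarity, which is the kind of adjustment the hybrid lexicon is meant to provide.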
127	Análisis estático y dinámico de opiniones en twitter [Static and dynamic analysis of opinions on Twitter]
Bravo Márquez, Felipe. January 2013
Master of Science, Computer Science specialization / Social media, and microblogging platforms in particular, have become established as a space for consuming and producing information. Twitter has become one of the most popular platforms of this kind, and today has millions of users who publish millions of personal messages, or "tweets", every day. A significant share of these messages are personal opinions, whose richness and volume offer a great opportunity for studying public opinion. To work with this high volume of digital opinions, a set of computational tools known as sentiment analysis or opinion mining methods is used. The usefulness of assessing public opinion by applying sentiment analysis to digital opinions is controversial in the scientific community. While several studies claim that this approach captures public opinion much as traditional instruments such as polls do, others claim that this power is overrated. In this context, we study the static and dynamic behavior of digital opinions in order to understand their nature and determine the limitations of predicting how they evolve over time. The first stage addresses the problem of automatically identifying tweets that express an opinion, and then inferring whether that opinion has a positive or negative connotation. A methodology is proposed for improving sentiment classification on Twitter using features based on different dimensions of sentiment. Aspects such as opinion intensity, emotion and polarity are combined, drawing on various existing sentiment analysis methods and resources. The research shows that combining different opinion dimensions significantly improves two Twitter sentiment classification tasks: subjectivity detection and polarity detection.
The second part of the analysis explores the temporal properties of opinions on Twitter by analyzing opinion time series. The main idea is to determine whether opinion time series can be used to build reliable predictive models. Twitter messages associated with a defined group of topics are collected over time. Opinion indicators are then computed using sentiment analysis methods and aggregated over time to build opinion time series. The study relies on ARMA/ARIMA and GARCH models to model the mean and the volatility of the series. An in-depth analysis of the statistical properties of the time series finds that they exhibit seasonality and volatility. Since volatility is related to uncertainty, it is argued that these series should not be used for long-term forecasting. The experimental results lead to the conclusion that opinions are multidimensional objects whose different dimensions can complement one another to improve sentiment classification. In addition, opinion time series must satisfy certain statistical properties before reliable forecasts can be made from them. Given that there is not yet enough evidence to validate the supposed predictive power of digital opinions, our results indicate that a more rigorous validation of the static and dynamic models built from these opinions makes it possible to better establish the scope of opinion mining.
Internet - Social aspects; Social networks - Chile; Data mining; Twitter; Opinion mining; Sentiment analysis
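The step of turning per-message opinion indicators into an opinion time series can be sketched as follows. This is only the aggregation and a basic diagnostic (lag autocorrelation), under the assumption that each message already carries a numeric polarity score; it does not fit the ARMA/ARIMA or GARCH models used in the thesis, and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean


def daily_series(scored_messages):
    """Aggregate (day, polarity_score) pairs into a chronologically sorted daily mean series."""
    buckets = defaultdict(list)
    for day, score in scored_messages:
        buckets[day].append(score)
    return [(day, fmean(scores)) for day, scores in sorted(buckets.items())]


def autocorrelation(values, lag=1):
    """Sample autocorrelation at the given lag; 0.0 for constant or too-short series."""
    n = len(values)
    if lag >= n:
        return 0.0
    mean = fmean(values)
    variance_sum = sum((x - mean) ** 2 for x in values)
    if variance_sum == 0:
        return 0.0
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


# Hypothetical scored messages for one topic.
scored = [
    (date(2013, 1, 2), 1.0),
    (date(2013, 1, 1), 1.0),
    (date(2013, 1, 1), -1.0),
]
series = daily_series(scored)
```

A series built this way would then be checked for the statistical properties the abstract mentions before any forecasting model is fitted to it.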
128	Continuous dimensional emotion tracking in music
Imbrasaite, Vaiva. January 2015
The size of easily accessible libraries of digital music recordings is growing every day, and people need new and more intuitive ways of managing them, searching through them and discovering new music. Musical emotion is a method of classification that people use without thinking, and it could therefore be used to enrich music libraries and make them more user-friendly, to evaluate new pieces, or even to discover meaningful features for automatic composition. The field of Emotion in Music is not new: a lot of work has been done in musicology, psychology and other fields. However, automatic emotion prediction in music is still in its infancy and often lacks that transfer of knowledge from the surrounding fields. This dissertation explores automatic continuous dimensional emotion prediction in music and shows how various findings from other areas of Emotion and Music and Affective Computing can be translated and used for this task. There are four main contributions. Firstly, I describe a study that I conducted which focused on the evaluation metrics used to present the results of continuous emotion prediction. So far, the field lacks consensus on which metrics to use, making the comparison of different approaches nearly impossible. In this study, I investigated people's intuitively preferred evaluation metric and, on the basis of the results, suggested some guidelines for analyzing the results of continuous emotion recognition algorithms. I discovered that root-mean-squared error (RMSE) is significantly preferable to the other metrics explored in the one-dimensional case, and that it has preference ratings similar to those of the correlation coefficient in the two-dimensional case. Secondly, I investigated how various findings from the field of Emotion in Music can be used when building feature vectors for machine learning solutions to the problem. I suggest some novel feature vector representation techniques, testing them on several datasets and several machine learning models and showing the advantage they can bring. Some of the suggested feature representations can reduce RMSE by up to 19% compared with the standard feature representation, and can yield up to a 10-fold improvement in the non-squared correlation coefficient. Thirdly, I describe Continuous Conditional Random Fields and Continuous Conditional Neural Fields (CCNF) and introduce their use for the problem of continuous dimensional emotion recognition in music, comparing them with Support Vector Regression. These two models incorporate some of the temporal information that the standard bag-of-frames approaches lack, and are therefore capable of improving the results. CCNF can reduce RMSE by up to 20% compared with Support Vector Regression, and can increase squared correlation for the valence axis by up to 40%. Finally, I describe a novel multi-modal approach to continuous dimensional music emotion recognition. The field has so far focused solely on acoustic analysis of songs, while in this dissertation I show how the separation of vocals and music and the analysis of lyrics can be used to improve the performance of such systems. The separation of music and vocals can improve the results by up to 10%, with a stronger impact on arousal, compared with a system that uses only acoustic analysis of the whole signal, and the addition of lyrics analysis can provide a similar improvement to the results of the valence model.
786.7
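The two evaluation metrics discussed in this abstract, RMSE and the correlation coefficient, can be computed as in the sketch below, together with the relative reduction used when a result is stated as "reduces RMSE by up to N%". The function names and the sample values are illustrative assumptions and are not results from the dissertation.

```python
import math


def rmse(predicted, actual):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(predicted, actual):
    """Non-squared Pearson correlation coefficient; 0.0 if either sequence is constant."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        return 0.0
    return cov / math.sqrt(var_p * var_a)


def relative_reduction(baseline_error, new_error):
    """Fraction by which `new_error` is lower than `baseline_error`."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical arousal annotations and predictions for four consecutive frames.
annotated = [0.1, 0.3, 0.5, 0.7]
predicted = [0.2, 0.4, 0.6, 0.8]
```

In this toy case the prediction tracks the annotation perfectly in shape (correlation of 1) while still carrying a constant offset that only RMSE penalizes, which is one reason the choice of metric matters when approaches are compared.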
129	Sentiment Analysis on Multi-view Social Data
Niu, Teng. January 2016
With the proliferation of social networks, people increasingly share their opinions about news, social events and products on the Web. There is growing interest in understanding users' attitudes or sentiment from the large repository of opinion-rich data on the Web, which can benefit many commercial and political applications. Early research concentrated on documents such as users' comments on purchased products. Recent work shows that visual appearance also conveys rich human affect that can be predicted. While great effort has been devoted to single media, either text or image, few attempts have been made at the joint analysis of multi-view data, which is becoming a prevalent form in social media. For example, alongside textual messages posted on Twitter, users are likely to upload images and videos that may convey their affective states. One common obstacle is the lack of sufficient manually annotated instances for model learning and performance evaluation. To promote research on this problem, we introduce a multi-view sentiment analysis dataset (MVSA) consisting of a set of manually annotated image-text pairs collected from Twitter. The dataset can serve as a valuable benchmark for both single-view and multi-view sentiment analysis. In this thesis, we further conduct a comprehensive study of the computational analysis of sentiment from multi-view data. State-of-the-art approaches for single-view (image or text) and multi-view (image and text) data are introduced and compared through extensive experiments on our constructed dataset and other public datasets. More importantly, the effectiveness of the correlation between different views is also studied using widely used fusion strategies and advanced multi-view feature extraction methods.
Sentiment analysis; social media; multi-view data; textual feature; visual feature; joint feature learning
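The abstract refers to widely used fusion strategies without naming them. Two common choices in multi-view learning are early fusion (concatenating per-view features before classification) and late fusion (combining per-view classifier outputs). The sketch below shows both in their simplest form; it is an assumption that these correspond to the strategies evaluated in the thesis, and the function names, weights and probability values are hypothetical.

```python
def early_fusion(text_features, image_features):
    """Early fusion: concatenate the two feature vectors into one joint vector."""
    return list(text_features) + list(image_features)


def late_fusion(text_probs, image_probs, text_weight=0.5):
    """Late fusion: weighted average of per-view class probabilities.

    Returns the winning label and the fused probability for every label.
    """
    labels = set(text_probs) | set(image_probs)
    fused = {
        label: text_weight * text_probs.get(label, 0.0)
        + (1 - text_weight) * image_probs.get(label, 0.0)
        for label in labels
    }
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical per-view classifier outputs for one image-text pair.
text_probs = {"positive": 0.7, "negative": 0.3}
image_probs = {"positive": 0.4, "negative": 0.6}
label_equal, _ = late_fusion(text_probs, image_probs, text_weight=0.5)
label_image_heavy, _ = late_fusion(text_probs, image_probs, text_weight=0.2)
```

The example pair is one where the two views disagree, so the fused label depends on how much weight each view receives.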
130	Expansão de recursos para análise de sentimentos usando aprendizado semi-supervisionado / Extending sentiment analysis resources using semi-supervised learning
Henrico Bertini Brum. 23 March 2018
The high volume of data available on the Internet can be an excellent source of new resources for studying several Natural Language Processing tasks, such as Sentiment Analysis. Unfortunately, annotating new corpora is costly, involving financial investment and lengthy revision processes. Our work proposes a semi-supervised annotation approach, that is, the automatic annotation of a large unlabeled corpus starting from a manually annotated dataset. To this end, we introduce TweetSentBR, a corpus of tweets in the TV show domain annotated in three classes (positive, neutral and negative) and partially reviewed by up to seven annotators. The corpus is an important linguistic resource for Brazilian Portuguese and is among the largest annotated corpora in the literature for polarity classification. Beyond the manual annotation, we implemented a semi-supervised learning framework that uses the labeled data and iteratively extends it with unlabeled data. The TweetSentBR corpus, containing 15,000 annotated tweets, was thereby expanded about eightfold. For the expansion, we trained classification models using six polarity classifiers and evaluated different parameters and representation schemes in order to obtain a reliable corpus. We ran experiments generating an extended corpus for each classifier, for both 3-point and binary classification. We evaluated the generated corpora on a held-out set, comparing the F-Measure obtained when training on the manually annotated corpus with that obtained when training on the semi-automatically annotated corpora. The semi-supervised corpus with the best results for 3-point classification achieved an average F-Measure of 62.14%, exceeding the average obtained with the manually annotated corpus (61.02%). In binary classification, the best extended corpus achieved an average F-Measure of 83.11%, exceeding the average obtained with the manually annotated corpus (79.80%). Furthermore, we simulated our expansion on annotated corpora from the literature, measuring how accurate the semi-automatically assigned labels are. Our best result was in the expansion of a product review corpus, which achieved an F1-Measure of 93.15% on binary data. Finally, we compared a corpus from the literature that was labeled by distant supervision with our semi-supervised framework, and the latter outperformed the former in cross-domain binary polarity classification.
Corpus annotation; Semi-supervised learning; Sentiment analysis
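The iterative expansion described in this abstract resembles a self-training loop: train on the labeled data, label the unlabeled pool, keep only confident predictions, and repeat. The sketch below shows that loop with a deliberately tiny word-count classifier standing in for a real polarity classifier. The confidence threshold, the stopping rule, the classifier and the toy tweets are all assumptions; the abstract does not specify how the actual framework selects or filters new examples.

```python
from collections import Counter


def train(labeled):
    """Toy stand-in classifier: per-class word counts."""
    model = {}
    for tokens, label in labeled:
        model.setdefault(label, Counter()).update(tokens)
    return model


def predict(model, tokens):
    """Return (label, confidence); confidence is the winning class's share of word votes."""
    votes = {label: sum(counts[t] for t in tokens) for label, counts in model.items()}
    total = sum(votes.values())
    if total == 0:
        return None, 0.0  # no known words, so no prediction
    label = max(votes, key=votes.get)
    return label, votes[label] / total


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the labeled set with confident predictions until none remain or rounds run out."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, uncertain = [], []
        for tokens in pool:
            label, confidence = predict(model, tokens)
            if confidence >= threshold:
                confident.append((tokens, label))
            else:
                uncertain.append(tokens)
        if not confident:
            break  # nothing new to add, so further rounds would not change the model
        labeled.extend(confident)
        pool = uncertain
    return labeled, pool


# Hypothetical seed annotations and unlabeled tweets, already tokenized.
seed = [(["great", "show"], "pos"), (["awful", "show"], "neg")]
unlabeled = [["great", "episode"], ["awful", "episode"], ["episode"]]
expanded, remaining = self_train(seed, unlabeled)
```

In practice the expanded corpus would be judged as the abstract describes, by training on it and measuring F-Measure on a held-out, manually annotated set.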
