201 |
Applications In Sentiment Analysis And Machine Learning For Identifying Public Health Variables Across Social Media
Clark, Eric Michael, 01 January 2019
Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. We mined data from several public Twitter endpoints to identify content relevant to healthcare providers and public health regulatory professionals. We began by compiling content related to electronic nicotine delivery systems (or e-cigarettes), as these had become popular alternatives to tobacco products. There was an apparent need to remove high-frequency tweeting entities, called bots, that would spam messages and advertisements and fabricate testimonials. Algorithms were constructed using natural language processing and machine learning to separate human responses from automated accounts with a high degree of accuracy. We found that the average number of hyperlinks per tweet, the average character dissimilarity between each individual's tweets, and the rate of introduction of unique words were valuable attributes for identifying automated accounts. We performed a 10-fold cross-validation and measured the performance of each set of tweet features at various bin sizes, the best of which performed with 97% accuracy. These methods were used to isolate automated content related to the advertising of electronic cigarettes. A rich taxonomy of automated entities, including robots, cyborgs, and spammers, each with different measurable linguistic features, was categorized.
Electronic cigarette related posts were classified as automated or organic and content was investigated with a hedonometric sentiment analysis. The overwhelming majority (≈ 80%) were automated, many of which were commercial in nature. Others used false testimonials that were sent directly to individuals as a personalized form of targeted marketing. Many tweets advertised nicotine vaporizer fluid (or e-liquid) in various “kid-friendly” flavors including 'Fudge Brownie', 'Hot Chocolate', 'Circus Cotton Candy' along with every imaginable flavor of fruit, which were long ago banned for traditional tobacco products. Others offered free trials, as well as incentives to retweet and spread the post among their own network. Free prize giveaways were also hosted whose raffle tickets were issued for sharing their tweet. Due to the large youth presence on the public social media platform, this was evidence that the marketing of electronic cigarettes needed considerable regulation. Twitter has since officially banned all electronic cigarette advertising on their platform.
Social media has the capacity to provide the healthcare industry with valuable feedback from patients who reveal and express their medical decision-making process, as well as self-reported quality of life indicators both during and after treatment. We have studied several active cancer patient populations discussing their experiences with the disease as well as survivorship. We experimented with a Convolutional Neural Network (CNN) as well as logistic regression to classify tweets as patient related. This led to a sample of 845 breast cancer survivor accounts to study over 16 months. We found positive sentiments regarding patient treatment, raising support, and spreading awareness. A large portion of negative sentiments were shared regarding political legislation that could result in loss of coverage of their healthcare. We refer to these online public testimonies as “Invisible Patient Reported Outcomes” (iPROs), because they carry relevant indicators, yet are difficult to capture by conventional means of self-reporting. Our methods can be readily applied across disciplines to obtain insights into the opinions of a particular group. Capturing iPROs and public sentiments from online communication can help inform healthcare professionals and regulators, leading to more connected and personalized treatment regimens. Social listening can provide valuable insights into public health surveillance strategies.
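The account-level attributes named in this abstract (hyperlinks per tweet, character dissimilarity between an account's tweets, and the introduction of unique words) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the URL pattern, the use of `difflib.SequenceMatcher` as the dissimilarity measure, and the distinct-to-total word ratio as a stand-in for the rate of introduction of unique words are all assumptions.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Assumed pattern for hyperlinks; real tweets may need a broader one.
URL_PATTERN = re.compile(r"https?://\S+")


def average_links_per_tweet(tweets):
    """Mean number of hyperlinks per tweet for one account."""
    return sum(len(URL_PATTERN.findall(t)) for t in tweets) / len(tweets)


def mean_pairwise_dissimilarity(tweets):
    """Mean of (1 - similarity ratio) over all pairs of an account's tweets."""
    pairs = list(combinations(tweets, 2))
    if not pairs:
        return 0.0
    total = sum(1.0 - SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


def unique_word_rate(tweets):
    """Distinct words divided by total words (a simple proxy, assumed here)."""
    words = [w.lower() for t in tweets for w in t.split()]
    return len(set(words)) / len(words) if words else 0.0
```

An account that repeats one promotional message scores high on links per tweet and low on the other two measures, which is the pattern the abstract associates with automated accounts. The resulting feature vectors could then be passed to any classifier evaluated with 10-fold cross-validation.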
|
202 |
Effects of Investor Sentiment Using Social Media on Corporate Financial Distress
Hoteit, Tarek, 01 January 2015
The mainstream quantitative models in the finance literature were ineffective in detecting possible bankruptcies during the 2007 to 2009 financial crisis. During the same period, various researchers suggested that sentiments in social media can predict future events. The purpose of the study was to examine the relationship between investor sentiment in social media and the financial distress of firms. Grounded in the social amplification of risk framework, which shows the media as an amplification channel for risk events, the central hypothesis of the study was that investor sentiments in social media could predict the level of financial distress of firms. Third quarter 2014 financial data and 66,038 public postings on the social media website Twitter were collected for 5,787 publicly held firms in the United States for this study. The Spearman rank correlation was applied, using the Altman Z-Score to measure financial distress levels in corporate firms and the Stanford natural language processing algorithm to detect sentiment levels in social media. The findings from the study suggested a non-significant relationship between investor sentiments in social media and corporate financial distress and, hence, did not support the research hypothesis. However, the model developed in this study for analyzing investor sentiments and corporate distress in firms is both original and extensible for future research, and is also accessible as a low-cost solution for financial market sentiment analysis.
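The two measures named in this abstract can be sketched in a few lines. The abstract does not say which variant of the Altman Z-Score was used, so the code below assumes the original formulation for publicly traded firms; the function and parameter names are hypothetical, and the rank correlation is written out by hand (with average ranks for ties) rather than taken from a statistics library.

```python
def altman_z(working_capital, retained_earnings, ebit, market_equity,
             sales, total_assets, total_liabilities):
    """Altman Z-Score, original formulation (assumed variant)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In a study of this shape, `x` would hold one sentiment score per firm and `y` the matching Z-Scores; a coefficient near zero corresponds to the non-significant relationship reported above.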
|
203 |
口碑情感對於募資專案之影響 / The Influence of eWOM Sentiment on the Success of Crowdfunding Projects
林漢文, Unknown Date
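"Crowdfunding" lets the general public pool small financial contributions, harnessing the power of the crowd to support an individual or organization so that a goal or project can be carried out. The emergence of crowdfunding platforms has accelerated its development: from the well-known Kickstarter abroad to the domestic Flyingv, the crowdfunding wave has swept through the traditional lending ecosystem at home and abroad. The factors behind a project's success have therefore become an important question. Previous literature on crowdfunding projects mostly discusses factors such as the funding amount and the number of updates, and pays less attention to investors' reviews of the product or to word of mouth. This study therefore proposes a broader integrated framework that applies sentiment analysis to online reviews as one of the important factors in project success, and tests it empirically on Kickstarter projects. The results show that the volume and sentiment of word of mouth have different effects in different project categories: they have a significant effect on project success in the Game, Technology and Design categories, but no significant effect for Music, Theater and Dance projects.
Abstract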
Crowdfunding is defined as a process or activity that openly solicits small amounts of money from a group of persons or organizations to make a project succeed. The appearance of crowdfunding platforms in recent years has accelerated the popularity of crowdfunding. From Kickstarter to Flyingv, this crowdfunding trend has changed the traditional borrowing ecology. However, not all crowdfunding projects are successful. A substantial number of proposed projects fail because they are unable to raise the target amount. Therefore, it is interesting to investigate factors that may affect the success of a fundraising project.
Previous literature has reported several success factors for crowdfunding, such as the target amount, the number of updates, and so on. However, few studies have investigated the effect of project reviews. It is clear that word of mouth plays an important role in consumer decisions, and it is reasonable to believe that project reviews, as a kind of word of mouth, will have an effect on investors’ decisions. Hence, this study adopts sentiment analysis to examine how the sentiment of project reviews, along with other factors, may affect eventual project success. Data collected from Kickstarter.com were used to evaluate our research model. Our findings indicate that the number and sentiment of project reviews did have an impact on fundraising success, but only in certain categories such as game, design and technology, which seem to have objective evaluation criteria. Their effect was not significant in categories such as music, theater, and dance, in which investors’ preferences may be very subjective.
|
204 |
Novel document representations based on labels and sequential information
Kim, Seungyeon, 21 September 2015
A wide variety of text analysis applications are based on statistical machine learning techniques. The success of those applications is critically affected by how we represent a document. Learning an efficient document representation has two major challenges: sparsity and sequentiality. The sparsity often causes high estimation error, and text's sequential nature, interdependency between words, causes even more complication.
This thesis presents novel document representations to overcome the two challenges. First, I employ label characteristics to estimate a compact document representation. Because label attributes implicitly describe the geometry of the dense subspace that has substantial impact, I can effectively resolve the sparsity issue by focusing only on the compact subspace. Second, by modeling a document as a joint or conditional distribution between words and their sequential information, I can efficiently reflect the sequential nature of text in my document representations. Lastly, the thesis concludes with a document representation that employs both labels and sequential information in a unified formulation.
The following four criteria are used to evaluate the quality of representations: how close a representation is to its original data, how strongly representations can be distinguished from each other, how easily a human can interpret a representation, and how much computational effort a representation requires.
While pursuing those good representation criteria, I was able to obtain document representations that are closer to the original data, stronger in discrimination, and easier to be understood than traditional document representations. Efficient computation algorithms make the proposed approaches largely scalable. This thesis examines emotion prediction, temporal emotion analysis, modeling documents with edit histories, locally coherent topic modeling, and text categorization tasks for possible applications.
|
205 |
Application of common sense computing for the development of a novel knowledge-based opinion mining engine
Erik, Cambria, January 2011
The ways people express their opinions and sentiments have radically changed in the past few years thanks to the advent of social networks, web communities, blogs, wikis and other online collaborative media. The distillation of knowledge from this huge amount of unstructured information can be a key factor for marketers who want to create an image or identity in the minds of their customers for their product, brand, or organisation. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions, in fact, involves a deep understanding of natural language text by machines, from which we are still very far. Hitherto, online information retrieval has been mainly based on algorithms relying on the textual representation of web pages. Such algorithms are very good at retrieving texts, splitting them into parts, checking the spelling and counting their words. But when it comes to interpreting sentences and extracting meaningful information, their capabilities are known to be very limited. Existing approaches to opinion mining and sentiment analysis, in particular, can be grouped into three main categories: keyword spotting, in which text is classified into categories based on the presence of fairly unambiguous affect words; lexical affinity, which assigns arbitrary words a probabilistic affinity for a particular emotion; and statistical methods, which calculate the valence of affective keywords and word co-occurrence frequencies on the basis of a large training corpus. Early works aimed to classify entire documents as containing overall positive or negative polarity, or rating scores of reviews. Such systems were mainly based on supervised approaches relying on manually labelled samples, such as movie or product reviews where the opinionist’s overall positive or negative attitude was explicitly indicated.
However, opinions and sentiments do not occur only at document level, nor are they limited to a single valence or target. Contrary or complementary attitudes toward the same topic or multiple topics can be present across the span of a document. In more recent works, text analysis granularity has been taken down to segment and sentence level, e.g., by using the presence of opinion-bearing lexical items (single words or n-grams) to detect subjective sentences, or by exploiting association rule mining for a feature-based analysis of product reviews. These approaches, however, are still far from being able to infer the cognitive and affective information associated with natural language, as they mainly rely on knowledge bases that are still too limited to efficiently process text at sentence level. In this thesis, common sense computing techniques are further developed and applied to bridge the semantic gap between word-level natural language data and the concept-level opinions conveyed by these. In particular, the ensemble application of graph mining and multi-dimensionality reduction techniques on two common sense knowledge bases was exploited to develop a novel intelligent engine for open-domain opinion mining and sentiment analysis. The proposed approach, termed sentic computing, performs a clause-level semantic analysis of text, which allows the inference of both the conceptual and emotional information associated with natural language opinions and, hence, a more efficient passage from (unstructured) textual information to (structured) machine-processable data. The engine was tested on three different resources, namely a Twitter hashtag repository, a LiveJournal database and a PatientOpinion dataset, and its performance was compared both with results obtained using standard sentiment analysis techniques and with results obtained using different state-of-the-art knowledge bases such as Princeton’s WordNet, MIT’s ConceptNet and Microsoft’s Probase.
Unlike most currently available opinion mining services, the developed engine does not base its analysis on a limited set of affect words and their co-occurrence frequencies, but rather on common sense concepts and the cognitive and affective valence conveyed by these. This allows the engine to be domain-independent and, hence, to be embedded in any opinion mining system for the development of intelligent applications in multiple fields such as Social Web, HCI and e-health. Looking ahead, the combined novel use of different knowledge bases and of common sense reasoning techniques for opinion mining proposed in this work will eventually pave the way for the development of more bio-inspired approaches to the design of natural language processing systems capable of handling knowledge, retrieving it when necessary, making analogies and learning from experience.
|
206 |
K lingvistické struktuře emocionálního významu v češtině / On the Linguistic Structure of Emotional Meaning in Czech
Veselovská, Kateřina, January 2015
Title: On the Linguistic Structure of Emotional Meaning in Czech
Author: Mgr. Kateřina Veselovská
Department: Institute of Formal and Applied Linguistics
Supervisor: Prof. PhDr. Eva Hajičová, DrSc., Institute of Formal and Applied Linguistics
Keywords: emotional meaning, linguistic structure, sentiment analysis, opinion mining, evaluative language
Abstract: This thesis has two main goals. First, we provide an analysis of the language means which together form the emotional meaning of written utterances in Czech. Second, we employ the findings concerning emotional language in computational applications. We provide a systematic overview of lexical, morphosyntactic, semantic and pragmatic aspects of emotional meaning in Czech utterances. Also, we propose two formal representations of emotional structures within the framework of the Prague Dependency Treebank and Construction Grammar. Regarding the computational applications, we focus on sentiment analysis, i.e. automatic extraction of emotions from text. We describe the creation of manually annotated emotional data resources in Czech and perform two main sentiment analysis tasks, polarity classification and opinion target identification, on Czech data. In both of these tasks, we reach state-of-the-art results.
|
207 |
Social media sentiment analysis for firm's revenue prediction
Dimadi, Ioanna, January 2018
The advent of the Internet and its social media platforms has affected people’s daily lives. More and more people use these platforms as a tool to communicate, exchange opinions and share information with others. However, the platforms have been used not only for socializing but also for expressing people’s product preferences. The wide spread of social networking sites has enabled companies to take advantage of them as an important way of approaching their target audience. This thesis focuses on studying the influence of social media platforms on the revenue of a single organization, Nike, that uses them actively. Facebook and Twitter, two widely used social media platforms, were investigated by gathering tweets and comments produced by consumers’ online discussions on the brand’s hosted pages. These unstructured social media data were collected from 26 official Nike pages, 13 fan pages from each platform, and their sentiment was analyzed. The comments were classified using the Valence Aware Dictionary and Sentiment Reasoner (VADER), a lexicon-based approach designed for social media analysis. After gathering Nike’s revenue over five years, the degree to which it could be affected by the classified data was examined using multiple stepwise linear regression analysis. The findings showed that the fraction of positive to total comments for both Facebook and Twitter explained 84.6% of the revenue’s variance. Fitting these data to the multiple regression model, Nike’s revenue could be forecast with a root mean square error of around 287 billion.
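The pipeline described in this abstract (classify comments, compute the positive-to-total fraction, regress revenue on it, report the root mean square error) can be sketched as below. This is a simplified illustration under stated assumptions: the scores are taken to be VADER compound scores, the ±0.05 cut-offs are the thresholds conventionally recommended for VADER rather than values confirmed by the thesis, and a single-predictor least-squares fit stands in for the multiple stepwise regression. All function names are hypothetical.

```python
def label_from_compound(score, threshold=0.05):
    """Map a compound sentiment score to a label (assumed cut-offs)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def positive_fraction(compound_scores):
    """Share of comments labelled positive: positive / total."""
    labels = [label_from_compound(s) for s in compound_scores]
    return labels.count("positive") / len(labels)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return (squared / len(actual)) ** 0.5
```

With one positive fraction per reporting period as `x` and the matching revenue figures as `y`, `fit_line` gives the coefficients and `rmse` measures the forecast error in the same units as the revenue.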
|
208 |
Ontology Based Framework for Conceptualizing Human Affective States and Their Influences
Abaalkhail, Rana, 12 November 2018
The study of human affective states and their influences has been a research interest in psychology for some time. Fortunately, the presence of an affective computing paradigm allows us to use theories and findings from the discipline of psychology in the representation and development of human affective applications.
However, because of the complexity of the subject, it is possible to misunderstand concepts that are shared via human and/or computer communications. With the appearance of technological innovations in our lives, for instance the Semantic Web and the Web Ontology Language (OWL), there is a stronger need for computers to better understand human affective states and their influences. An ontology can be used to represent human affective states and their influences in a machine-understandable format. Indeed, ontologies provide powerful tools to make sense of data.
Our thesis proposes HASIO, a Human Affective States and their Influences Ontology, designed based on existing psychological theories. HASIO was developed to represent the knowledge that is necessary to model affective states and their influences in a computerized format. It describes the human affective states (Emotion, Mood and Sentiment) and their influences (Personality, Need and Subjective well-being) and conceptualizes their models and recognition methods. HASIO also represents the relationships between affective states and the factors that influence them. We surveyed and analyzed existing ontologies regarding human affective states and their influences to realize the significance and profit of developing our proposed ontology (HASIO).
We follow the Methontology approach, a comprehensive engineering methodology for ontology building, to design and build HASIO.
An important aspect in determining the ontology scope is Competency Questions (CQs). We configure HASIO CQs by analyzing the resources from psychology theories, available lexicons and existing ontologies.
In this thesis, we present the development, modularization and evaluation of HASIO. HASIO can profit from the modularization process, which divides the whole ontology into self-contained modules that are easy to reuse and maintain. The ontology is validated through a question answering system (HASIOQA), a task-based evaluation system. We design and develop a natural language interface system for this purpose. Moreover, the proposed ontology was evaluated through the Ontology Pitfall Scanner for verification and correctness against several criteria.
Furthermore, HASIO was used in sentiment analysis on different Twitter datasets. We designed and developed a tweet polarity calculation algorithm. Additionally, we compare the results of our ontology with those of a machine learning technique. We demonstrate and highlight the advantage of using an ontology in sentiment analysis.
|
209 |
Sentiment Analysis With Convolutional Neural Networks : Classifying sentiment in Swedish reviews
Svensson, Kristoffer, January 2017
Today many companies exist and market their products and services on social media, and may therefore receive reviews and opinions from their end users directly on these platforms. Reading every text by hand can be time-consuming, so analysing the sentiment of all texts gives companies an overview of how positive or negative users are about a specific subject. Sentiment analysis is a feature that Beanloop AB is interested in implementing in its future projects, and the research problem of this thesis was to investigate how deep learning could be used for this task. This was done by conducting an experiment with deep learning and neural networks. Several convolutional neural network models were implemented with different settings to find a combination of settings that gave the highest accuracy on the given test dataset. There were two kinds of models: one classified texts as positive or negative, and the other classified them as positive, negative or neutral. The training dataset and the test dataset contained data from two recommendation sites, www.reco.se and se.trustpilot.com. The final results show that when classifying three categories (positive, negative and neutral) the models had trouble reaching an accuracy of 85%, where only one model reached 80% accuracy at best on the test dataset. However, when classifying only two categories (positive and negative) the models showed very good results, with every model reaching almost 95% accuracy.
|
210 |
Sentiment analysis of Swedish reviews and transfer learning using Convolutional Neural Networks
Sundström, Johan, January 2018
Sentiment analysis is a field within machine learning that focuses on determining the contextual polarity of subjective information. It is a technique that can be used to analyze the "voice of the customer" and has been applied with success in English to opinionated information such as customer reviews, political opinions and social media data. A major problem with machine learning models is that they are domain dependent and will therefore not perform well in other domains. Transfer learning, or domain adaptation, is a research field that studies a model's ability to transfer knowledge across domains. In the extreme case a model will train on data from one domain, the source domain, and try to make accurate predictions on data from another domain, the target domain. The deep machine learning model known as the Convolutional Neural Network (CNN) has in recent years gained much attention due to its performance in computer vision, both for in-domain classification and for transfer learning. It has also performed well on natural language processing problems but has not been investigated to the same extent for transfer learning within this area. The purpose of this thesis has been to investigate how well suited the CNN is for cross-domain sentiment analysis of Swedish reviews. The research has been conducted by investigating how the model performs when trained with data from different domains with varying amounts of source and target data. Additionally, the impact on the model’s transferability of using different text representations has been studied. This study has shown that a CNN without pre-trained word embeddings is not that well suited for transfer learning, since it performs worse than a traditional logistic regression model. Substituting 20% of the source training data with target data can in many of the test cases boost performance by 7-8%, both for the logistic regression and for the CNN model.
Using pre-trained word embeddings produced by a word2vec model increases the CNN's transferability as well as its in-domain performance, and this model outperforms both the logistic regression model and the CNN model without pre-trained word embeddings in the majority of test cases.
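The data-mixing step mentioned in this abstract, substituting a share of the source training data with target-domain data, can be sketched as follows. The abstract gives only the 20% figure; the function name, the fixed training-set size, the random sampling and the seed are assumptions made for illustration.

```python
import random


def mix_source_with_target(source, target, target_share=0.2, seed=0):
    """Swap a share of source examples for target-domain examples.

    The training-set size stays equal to len(source); sampling is random
    without replacement (an assumed design, not the thesis's procedure).
    """
    rng = random.Random(seed)
    n_target = min(round(len(source) * target_share), len(target))
    kept_source = rng.sample(source, len(source) - n_target)
    added_target = rng.sample(target, n_target)
    mixed = kept_source + added_target
    rng.shuffle(mixed)
    return mixed
```

Keeping the total size fixed means any change in accuracy comes from the domain mix rather than from simply having more training data, which matches the abstract's wording of substituting rather than adding.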
|