Global ETD Search

161	Knowledge acquisition from user reviews for interactive question answering Konstantinova, Natalia January 2013 Nowadays, the effective management of information is extremely important for all spheres of our lives, and applications such as search engines and question answering systems help users to find the information that they need. However, even when assisted by these applications, people sometimes struggle to find what they want. For example, when choosing a product, customers can be confused by the need to consider many features before they can reach a decision. Interactive question answering (IQA) systems can help customers in this process by answering questions about products and initiating a dialogue with customers when their needs are not clearly defined. The focus of this thesis is how to design an interactive question answering system that will assist users in choosing the product they are looking for, in an optimal way, when a large number of similar products are available. Such an IQA system will be based on selecting a set of characteristics (also referred to as product features in this thesis) that describe the relevant product, and on narrowing the search space. We believe that the order in which these characteristics are presented during an IQA session is of high importance. Therefore, they need to be ranked so that the dialogue selects the product efficiently. The research question investigated in this thesis is whether product characteristics mentioned in user reviews are important for a person who is likely to purchase a product and can therefore be used when designing an IQA system. We focus our attention on products such as mobile phones; however, the proposed techniques can be adapted for other types of products if the data is available.
Methods from natural language processing (NLP) fields such as coreference resolution, relation extraction and opinion mining are combined to produce various rankings of phone features. The research presented in this thesis employs two corpora of texts related to mobile phones that were collected specifically for this thesis: a corpus of Wikipedia articles about mobile phones and a corpus of mobile phone reviews published on the Epinions.com website. Parts of these corpora were manually annotated with coreference relations, mobile phone features and relations between mentions of the phone and its features. The annotation is used to develop a coreference resolution module as well as a machine learning-based relation extractor. Rule-based methods for identifying coreference chains that describe the phone are designed and thoroughly evaluated against the annotated gold standard. Machine learning is used to find links between mentions of the phone (identified by coreference resolution) and phone features; it determines whether a given phone feature belongs to the phone mentioned in the same sentence. In order to find the best rankings, this thesis investigates several settings. One of the hypotheses tested here is that the relatively low results of the proposed baseline are caused by noise introduced by sentences which are not directly related to the phone and phone feature. To test this hypothesis, only sentences which contained mentions of the mobile phone and a phone feature linked to it were processed to produce rankings of the phone features. Selection of the relevant sentences is based on the results of coreference resolution and relation extraction. Another hypothesis is that opinionated sentences are a good source for ranking the phone features. In order to investigate this, a sentiment classification system is also employed to distinguish between features mentioned in positive and negative contexts.
The detailed evaluation and error analysis of the methods proposed form an important part of this research and ensure that the results provided in this thesis are reliable. 006.3
162	應用情感分析於輿情之研究-以台灣2016總統選舉為例 / A Study of using sentiment analysis for emotion in Taiwan's presidential election of 2016 陳昭元, Chen, Chao-Yuan Unknown Date From the 2014 Taiwanese local elections to the 2016 presidential election, the internet has had a growing influence on election campaigns, and candidates can track public needs in real time through the topics that are widely discussed online. Text sentiment analysis usually relies on supervised or unsupervised methods. Supervised methods can reach high accuracy, but they cannot anticipate unknown trends and they require costly manual labeling of articles. This study proposes a Chinese sentiment analysis method for online political news that combines unsupervised and supervised learning: unsupervised methods first label the news articles, and supervised methods then build a classification model whose accuracy is verified. For topic labeling, the TFIDF-Kmeans topic model performed poorly because the number of documents far exceeds the number of topic words, which makes the TFIDF matrix too sparse. The NPMI-Concor topic model classified better, but the number of topic words it assigned to each topic was unbalanced. Because LDA lets all documents share all topics, the LDA topic model outperformed both TFIDF-Kmeans and NPMI-Concor in word clustering and topic classification, reaching a classification accuracy of 97%, so LDA was adopted for topic labeling. For sentiment labeling, the sentiment lexicon expanded in this study judged word polarity better than NTUSD, and ChineseWordnet and SentiWordNet were used to determine the sentiment strength of words, which made the sentiment scores computed for online comments more accurate. The sentiment indices of the texts were also found to reflect opinion-poll figures, so they were used to build a classification model of polling trends. In the experiments, overall accuracy reached 95% for classifying the topics of concern and 85% for classifying polling trends. A comprehensive visualization report was also built to show the public's positive and negative opinions and to give candidates competitive intelligence for their campaigns. Sentiment Analysis Text Classification SVM
163	Mathematical Modeling of Public Opinion using Traditional and Social Media Cody, Emily 01 January 2016 With the growth of the internet, data from text sources has become increasingly available to researchers in the form of online newspapers, journals, and blogs. This data presents a unique opportunity to analyze human opinions and behaviors without soliciting the public explicitly. In this research, I utilize newspaper articles and the social media service Twitter to infer self-reported public opinions and awareness of climate change. Climate change is one of the most important and heavily debated issues of our time, and analyzing large-scale text surrounding this issue reveals insights into self-reported public opinion. First, I inquire about public discourse on both climate change and energy system vulnerability following two large hurricanes. I apply topic modeling techniques to a corpus of articles about each hurricane in order to determine how these topics were reported on in the post-event news media. Next, I perform sentiment analysis on a large collection of data from Twitter using a previously developed tool called the "hedonometer". I use this sentiment scoring technique to investigate how the Twitter community reports feeling about climate change. Finally, I generalize the sentiment analysis technique to many other topics of global importance, and compare the results with more traditional public opinion polling methods. I determine that since traditional public opinion polls have limited reach and high associated costs, text data from Twitter may be the future of public opinion polling. environmental communications human behavior opinion polling sentiment analysis social media topic modeling Applied Mathematics Climate Social and Behavioral Sciences
164	應用情感分析於媒體新聞傾向之研究-以中央社為例 / Applying sentiment analysis to the tendency of media news: a case study of central news agency 吳信維, Wu, Xin-Wei Unknown Date This research combines association rules with a new-word mining algorithm to expand the lexicon and thereby improve the accuracy of word and sentence segmentation. Using an unsupervised sentiment analysis method, it builds topic models and sentiment-orientation labels from news texts about the KMT and the DPP collected from the Central News Agency. Supervised learning is then used to build classification models and verify the results. The lexicon used for segmentation is expanded with an n-gram with a-priori algorithm. A total of 32007 candidate words were discovered, of which 28838 are words with real meaning; the success rate is reported as 88%. Two clustering methods are compared for building the topic model: TFIDF-Kmeans and LDA. With TFIDF-Kmeans, the number of texts is far larger than the number of topic words, so the TFIDF matrix is too sparse and clustering is poor. With LDA, whose model lets multiple documents share multiple topics, the accuracy of topic classification exceeds 80%. The research therefore concludes that LDA performs better for clustering topic words when the texts cover multiple topics. By combining different time intervals, the research presents the sentiment tendencies of Central News Agency news in the three months before and after each of the last five presidential elections in Taiwan, and it examines how the sentiment of each category in the topic models changes over those periods. The sentiment index of the texts generally peaks around polling day, and the results for the last three presidential elections show that the sentiment values of party-related news level off after the election. The statistics on positive and negative sentiment and the analysis of overall sentiment tendencies show that, regardless of which party is in power, Central News Agency news about the KMT and the DPP is positive and stable in content and generally does not favor either party. Sentiment analysis LDA N-gram A-priori
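The abstract for entry 164 does not spell out how its "n-gram with a-priori" step works, so the following is only a minimal sketch of one common reading: character n-grams are counted level by level, and a longer n-gram is considered only if both of its shorter sub-grams were already frequent (the a-priori pruning idea). The function name `frequent_ngrams` and the parameters `max_n` and `min_count` are assumptions made for illustration, not names from the thesis.

```python
from collections import Counter


def frequent_ngrams(sentences, max_n=3, min_count=2):
    """Return {ngram: count} for character n-grams (n >= 2) seen at least min_count times.

    Hypothetical helper: candidates are grown one character at a time and
    pruned with the a-priori property (every sub-gram of a frequent gram
    must itself be frequent).
    """
    frequent = {}
    # Level 1: single characters that are frequent enough to be extended.
    char_counts = Counter(ch for s in sentences for ch in s)
    prev = {g for g, c in char_counts.items() if c >= min_count}
    for n in range(2, max_n + 1):
        counts = Counter()
        for s in sentences:
            for i in range(len(s) - n + 1):
                gram = s[i:i + n]
                # A-priori pruning: both (n-1)-length sub-grams must be frequent.
                if gram[:-1] in prev and gram[1:] in prev:
                    counts[gram] += 1
        level = {g: c for g, c in counts.items() if c >= min_count}
        frequent.update(level)
        prev = set(level)
        if not prev:
            break  # nothing left to extend
    return frequent


# Toy input; real use would pass unsegmented news sentences.
candidates = frequent_ngrams(["abcx", "abcy", "zabc"], max_n=3, min_count=2)
print(candidates)
```

In a pipeline like the one described, the surviving n-grams would only be candidate words; deciding which of them carry real meaning is a separate step that this sketch does not cover.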
165	Contribuições da relação de oposição adjetival para o mapeamento de sentimentos em plataformas online de ensino Haas, Daniela Deitos 17 March 2015 / Milton Valente / The aim of this dissertation was to give a semantic description of opposition between adjectives in the domain of sentiments, in the context of distance education. The purpose was to enrich an emotion lexicon that will serve as the database for a sentiment analyzer that automatically identifies the sentiments expressed by students in the Moodle virtual learning environment. One justification for building a sentiment analyzer for distance education is the belief that one factor in the success of distance education (EaD) is the teacher's or tutor's ability to identify quickly how students are feeling in the environment. Because students' statements are scattered across the various tools that make up the virtual environment, identifying them and responding quickly to students is hampered, which can influence dropout from distance courses and disciplines. This study is interdisciplinary, anchored in Cognitive Linguistics (Cruse, 1986; 2000) at its interface with Natural Language Processing (NLP), drawing on theories of Computational Lexical Semantics in the area of Sentiment Analysis (Pang and Lee, 2008; Liu, 2012). Because it is interdisciplinary, the methodology covers three complementary domains: linguistic, linguistic-computational and computational (Dias-da-Silva, 1996; 1998; 2003). In the linguistic domain, the study examined emotion in light of Scherer's componential psychological approach (1994; 2000; 2005; 2013), the Geneva Emotion Wheel (Scherer, 2005) and the linguistic phenomenon of opposition (Lyons, 1977; Cruse, 1986; 2000; Murphy, 2003). For the linguistic-computational domain, a formalizable description of the adjectives was proposed on the basis of the opposition theory studied and the Geneva Emotion Wheel. The computational domain will be carried out by a team of computer scientists from Unisinos, partners in the project “MAS-EaD: Mapeamento automático de sentimentos na EaD: a construção de um léxico de emoção” (automatic sentiment mapping in distance education: building an emotion lexicon), funded by FAPERGS. The results show that the literature presents two types of opposition, complementarity and antonymy, and that only cases of antonymy were found in the corpus. The opposition relation is thus the main relation for Sentiment Analysis, since it identifies opposite sentiments. In addition, the opposition relation proved important for organizing the sentiment polarities of Scherer's Geneva Emotion Wheel. Distance education Sentiment analysis Geneva emotion wheel Opposition
166	SentiHealth-Cancer: uma ferramenta de análise de sentimento para ajudar a detectar o humor de pacientes de câncer em uma rede social online / SentiHealth-Cancer: a sentiment analysis tool to help detecting mood of cancer patients in online social network Rodrigues, Ramon Gouveia 26 April 2016 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Cancer is a critical disease that affects millions of people and families around the world. In 2012, about 14.1 million new cases of cancer occurred globally. For reasons such as the severity of some cases, the side effects of some treatments and the death of other patients, cancer patients tend to be affected by serious emotional disorders such as depression. A behavioral tool that assists in detecting people's mood can therefore contribute to the monitoring of patients and family members during treatment. The objective of this work is to develop a Sentiment Analysis tool, named SentiHealth-Cancer (SHC), to assist in detecting the emotional state of members of Brazilian virtual communities that support cancer patients. We conducted a comparative study of the proposed tool and four general-purpose Sentiment Analysis tools. For this, we collected 789 messages from 8 Facebook communities and considered 2,574 evaluations by volunteers of the real sentiments expressed in these messages. The performance of the tools was tested in each community, with evaluations from psychologists and non-psychologists and, where possible, with texts in Portuguese and translated into English. The results showed that, overall, the proposed method outperforms the other tools on both Portuguese and English texts. For example, its accuracy on all messages (56.64%) is a significant increase of 11.78%, in relative terms, over the highest accuracy achieved by the other tools (50.67%). Sentiment analysis Opinion mining Cancer Taxonomy
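The abstract for entry 166 reports an 11.78% increase without saying whether it is absolute or relative. A quick check with the two reported accuracies suggests it is a relative gain; the helper name `relative_gain` is introduced here only for illustration.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


shc_accuracy = 56.64  # reported accuracy of SentiHealth-Cancer, in percent
best_other = 50.67    # reported best accuracy among the other tools, in percent

# Absolute gain in percentage points.
print(round(shc_accuracy - best_other, 2))  # 5.97
# Relative gain in percent, which matches the figure quoted in the abstract.
print(round(relative_gain(shc_accuracy, best_other), 2))  # 11.78
```

Read this way, the tool is about 6 percentage points more accurate than the best alternative, which corresponds to the quoted 11.78% when expressed relative to the 50.67% baseline.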
167	Uma proposta de representação linguístico-computacional da negação com vistas à análise de sentimentos em contexto de ensino e aprendizagem on-line Belau, Francini Scipioni 11 January 2017 / Gvdasa - Inteligência Educacional / This work brings together the fields of distance education, linguistics, and natural language processing (NLP). It sets out to answer the following guiding questions: (i) how does the negation of emotion manifest itself on the surface of the language? and (ii) which computational rules express the negation of emotion? The methodology follows Dias-da-Silva (2006), who organizes work in NLP into three complementary domains of investigation: (i) linguistic, (ii) linguistic-computational, and (iii) computational. In the first domain, the linguistic one, the phenomenon of negation and its use are described. In the linguistic-computational domain, the observed patterns are represented so as to guide specialists in encoding these rules in a computational language. To propose the linguistic-computational description of the ways negation is expressed, using a corpus built in a distance-learning context from students' daily reports and forum posts, we draw on the theory presented by Maria Helena de Moura Neves (2011). The computational stage, which covers the implementation of the system, belongs to the computer scientist and is not addressed in this work; it will be carried out by a partner research group in collaboration with the company GVDasa. In total, 11 linguistic-computational rules were created that account for the linguistic properties identified in answering research question (i). The rules are intended to help a computational system locate negation phenomena in texts and check for inversions of polarity and emotion. Distance education Learning analytics Sentiment analysis Negation
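The 11 rules from entry 167 are not listed in the abstract, so the following is only a generic illustration of what a polarity-inversion rule can look like once encoded: a sentiment word has its sign flipped when a negation marker appears shortly before it. The word lists `NEGATORS` and `LEXICON`, the function `score_with_negation` and the `window` parameter are all hypothetical and are not taken from the thesis.

```python
# Hypothetical negation markers and polarity lexicon (Portuguese examples).
NEGATORS = {"não", "nunca", "jamais"}
LEXICON = {"feliz": 1, "gostei": 1, "triste": -1, "confuso": -1}


def score_with_negation(tokens, window=2):
    """Sum word polarities, flipping a word's sign if a negator occurs
    within `window` tokens before it."""
    total = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if polarity == 0:
            continue  # not a sentiment word
        preceding = tokens[max(0, i - window):i]
        if any(t in NEGATORS for t in preceding):
            polarity = -polarity  # inversion of polarity under negation
        total += polarity
    return total


print(score_with_negation("eu não gostei da aula".split()))  # -1
print(score_with_negation("não estou triste".split()))       # 1
```

A fixed token window is the simplest possible scope for negation; rules grounded in a linguistic description, as the abstract proposes, would presumably define that scope more precisely.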
168	Applications In Sentiment Analysis And Machine Learning For Identifying Public Health Variables Across Social Media Clark, Eric Michael 01 January 2019 Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. We mined data from several public Twitter endpoints to identify content relevant to healthcare providers and public health regulatory professionals. We began by compiling content related to electronic nicotine delivery systems (or e-cigarettes), as these had become popular alternatives to tobacco products. There was an apparent need to remove high-frequency tweeting entities, called bots, that would spam messages and advertisements and fabricate testimonials. Algorithms were constructed using natural language processing and machine learning to separate human responses from automated accounts with high accuracy. We found that the average number of hyperlinks per tweet, the average character dissimilarity between each individual's posts, and the rate of introduction of unique words were valuable attributes for identifying automated accounts. We performed 10-fold cross-validation and measured the performance of each set of tweet features at various bin sizes; the best performed with 97% accuracy. These methods were used to isolate automated content related to the advertising of electronic cigarettes. A rich taxonomy of automated entities, including robots, cyborgs, and spammers, each with different measurable linguistic features, was categorized. Electronic cigarette related posts were classified as automated or organic, and their content was investigated with a hedonometric sentiment analysis. The overwhelming majority (≈ 80%) were automated, many of which were commercial in nature. Others used false testimonials that were sent directly to individuals as a personalized form of targeted marketing.
Many tweets advertised nicotine vaporizer fluid (or e-liquid) in various “kid-friendly” flavors including 'Fudge Brownie', 'Hot Chocolate' and 'Circus Cotton Candy', along with every imaginable flavor of fruit; such flavors were long ago banned for traditional tobacco products. Others offered free trials, as well as incentives to retweet and spread the post among the user's own network. Free prize giveaways were also hosted, with raffle tickets issued for sharing the tweet. Due to the large youth presence on the public social media platform, this was evidence that the marketing of electronic cigarettes needed considerable regulation. Twitter has since officially banned all electronic cigarette advertising on its platform. Social media has the capacity to provide the healthcare industry with valuable feedback from patients who reveal and express their medical decision-making process, as well as self-reported quality of life indicators both during and after treatment. We have studied several active cancer patient populations discussing their experiences with the disease as well as survivorship. We experimented with a Convolutional Neural Network (CNN) as well as logistic regression to classify tweets as patient related. This led to a sample of 845 breast cancer survivor accounts to study over 16 months. We found positive sentiments regarding patient treatment, raising support, and spreading awareness. A large portion of negative sentiments were shared regarding political legislation that could result in loss of healthcare coverage. We refer to these online public testimonies as “Invisible Patient Reported Outcomes” (iPROs), because they carry relevant indicators yet are difficult to capture by conventional means of self-reporting. Our methods can be readily applied across disciplines to obtain insights into the opinions of a particular group.
Capturing iPROs and public sentiments from online communication can help inform healthcare professionals and regulators, leading to more connected and personalized treatment regimens. Social listening can provide valuable insights into public health surveillance strategies. Computational Linguistics Data Science Machine Learning Public Health Monitoring Sentiment Analysis Social Media Computer Sciences Social and Behavioral Sciences
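The three account-level attributes named in entry 168 (hyperlinks per tweet, dissimilarity between an account's posts, and the rate at which new words are introduced) can be approximated with the Python standard library. This is a rough sketch under stated assumptions: the abstract does not give the exact definitions, so the use of `difflib.SequenceMatcher` for dissimilarity, the unique-words-over-total-words ratio, and the function name `account_features` are illustrative choices, not the author's method.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

URL_RE = re.compile(r"https?://\S+")


def account_features(tweets):
    """Compute three simple per-account features from a list of tweet texts."""
    if not tweets:
        raise ValueError("at least one tweet is required")

    # Average number of hyperlinks per tweet.
    avg_links = sum(len(URL_RE.findall(t)) for t in tweets) / len(tweets)

    # Average pairwise character dissimilarity (1 - similarity ratio).
    pairs = list(combinations(tweets, 2))
    if pairs:
        avg_dissim = sum(
            1 - SequenceMatcher(None, a, b).ratio() for a, b in pairs
        ) / len(pairs)
    else:
        avg_dissim = 0.0

    # Share of word tokens that are new to the account (URLs removed).
    seen, total_words = set(), 0
    for t in tweets:
        words = URL_RE.sub("", t).lower().split()
        total_words += len(words)
        seen.update(words)
    unique_rate = len(seen) / total_words if total_words else 0.0

    return {
        "avg_links": avg_links,
        "avg_dissimilarity": avg_dissim,
        "unique_word_rate": unique_rate,
    }


bot_like = ["Buy now http://a.example", "Buy now http://a.example"]
human_like = ["good morning everyone", "lunch was great today"]
print(account_features(bot_like))
print(account_features(human_like))
```

On the toy data, the repetitive link-heavy account scores high on links and low on dissimilarity and word novelty, which is the kind of separation a classifier evaluated with cross-validation could then exploit.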
169	Effects of Investor Sentiment Using Social Media on Corporate Financial Distress Hoteit, Tarek 01 January 2015 The mainstream quantitative models in the finance literature were ineffective in detecting possible bankruptcies during the 2007 to 2009 financial crisis. During the same period, various researchers suggested that sentiments in social media can predict future events. The purpose of the study was to examine the relationship between investor sentiment in social media and the financial distress of firms. Grounded in the social amplification of risk framework, which shows the media as an amplifying channel for risk events, the central hypothesis of the study was that investor sentiments in social media could predict the level of financial distress of firms. Third quarter 2014 financial data and 66,038 public postings on the social media website Twitter were collected for 5,787 publicly held firms in the United States. The Spearman rank correlation was applied, using the Altman Z-Score to measure financial distress levels in firms and the Stanford natural language processing algorithm to detect sentiment levels in social media. The findings suggested a non-significant relationship between investor sentiments in social media and corporate financial distress and, hence, did not support the research hypothesis. However, the model developed in this study for analyzing investor sentiments and corporate distress is both original and extensible for future research, and it is also accessible as a low-cost solution for financial market sentiment analysis. bankruptcy prediction algorithm corporate bankruptcy financial distress investor sentiment sentiment analysis social media Business Databases and Information Systems Finance and Financial Management
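The two measures combined in entry 169 can be sketched in a few lines. The abstract does not say which variant of the Altman Z-Score was used, so the code below assumes the original 1968 formula for publicly traded manufacturing firms; the Spearman coefficient is computed as the Pearson correlation of average ranks. Function and argument names are illustrative.

```python
def altman_z(working_capital, retained_earnings, ebit, market_equity,
             sales, total_assets, total_liabilities):
    """Original Altman Z-Score (1968 variant, assumed here)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Made-up numbers, for illustration only.
z = altman_z(working_capital=20, retained_earnings=30, ebit=10,
             market_equity=50, sales=100, total_assets=100,
             total_liabilities=50)
print(round(z, 2))  # 2.59
print(spearman([0.1, 0.4, 0.2, 0.9], [1.5, 2.8, 2.0, 3.1]))
```

In a study of this shape, `xs` would hold one sentiment score per firm and `ys` the matching Z-Scores; a coefficient near zero would be consistent with the non-significant relationship the abstract reports.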
170	口碑情感對於募資專案之影響 / The Influence of eWOM Sentiment on the Success of Crowdfunding Projects 林漢文 Unknown Date Crowdfunding is defined as a process in which the general public, by contributing small amounts of money, pools its collective strength to support an individual or organization so that a goal or project can be carried out. The appearance of crowdfunding platforms in recent years has accelerated the growth of crowdfunding. From the well-known overseas platform Kickstarter to the domestic platform Flyingv, this trend has changed the traditional lending ecosystem. However, not all crowdfunding projects succeed; a substantial share of proposed projects fail because they cannot raise the target amount. It is therefore worth investigating the factors that may affect the success of a fundraising project. Previous literature has reported several success factors for crowdfunding, such as the target amount and the number of updates, but few studies have examined the effect of project reviews. Word of mouth clearly plays an important role in consumer decisions, and it is reasonable to believe that project reviews, as a kind of word of mouth, affect investors' decisions. This study therefore proposes a broader integrated framework and applies sentiment analysis to online reviews to examine how the sentiment of project reviews, along with other factors, affects eventual project success. Data collected from Kickstarter.com were used to evaluate the research model. The findings indicate that the number and sentiment of project reviews did affect fundraising success, but only in certain categories such as Game, Design and Technology, which seem to have objective evaluation criteria. Their effect was not significant in categories such as Music, Theater and Dance, in which investors' preferences may be very subjective. sentiment analysis success factor crowdfunding

Search results