Global ETD Search

1	COVID-19: Анализ эмоциональной окраски сообщений в социальных сетях (на материале сети «Twitter») : магистерская диссертация / COVID-19: Social network sentiment analysis (based on the material of "Twitter" messages): master's thesis
Denisova, P. A. January 2021
This work studies sentiment analysis of social network texts, using tweets from Twitter as an example. The research material comprised 818,224 messages collected for 17 keywords, of which 89,025 tweets contained the words "COVID-19" and "Coronavirus". The first part addresses general theoretical and methodological issues: it introduces the concept of sentiment analysis and reviews various approaches to classifying the sentiment of texts. Particular attention is given to the Naive Bayes classifier, which shows high accuracy in text classification tasks. The features of sentiment analysis in social networks during epidemics and disease outbreaks are studied, and the procedure and algorithm for analyzing the sentiment of a text are described. Much attention is paid to sentiment analysis in Python using the TextBlob library. In addition, a SaaS (software as a service) tool is selected that allows real-time sentiment analysis and, in contrast to programming in Python, does not require extensive experience in machine learning and natural language processing.
The second part begins with sampling, i.e., defining the keywords used to search for and export the required tweets. For this purpose, the Coronavirus Corpus is used, a corpus designed to reflect the social, cultural and economic consequences of the coronavirus (COVID-19) in 2020 and beyond. The usage dynamics of topic-related words during 2020 are analyzed, and a parallel is drawn between their frequency of use and the events taking place at the time. Next, tweets are retrieved for the selected keywords, and sentiment analysis of the messages is carried out on the resulting data using TextBlob, a Python library for processing textual data, and the Brand24 online service. A comparison of the two tools shows that they produce similar results. The study helps to understand public sentiment about the COVID-19 outbreak quickly and in real time, thereby contributing to an understanding of developing events. The work can also serve as a model for determining the emotional state of Internet users in various situations.
COVID-19, PANDEMIC, CORONAVIRUS, TWITTER, SENTIMENT ANALYSIS, TEXTBLOB, NAIVE BAYES CLASSIFIER, BRAND24, MASTER'S THESIS
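The lexicon-based sentiment scoring that the abstract above attributes to tools such as TextBlob can be illustrated with a minimal sketch. This is not TextBlob's implementation and not the thesis author's code: the tiny `LEXICON`, the negation rule, and the `threshold` value are all hypothetical, chosen only to show how a polarity score in [-1, 1] might be turned into a positive, negative, or neutral label.

```python
import re

# Hypothetical toy lexicon (assumption); real tools ship much larger, curated ones.
LEXICON = {
    "good": 0.7, "great": 0.8, "hope": 0.5, "safe": 0.5,
    "bad": -0.7, "fear": -0.6, "death": -0.8, "crisis": -0.6,
}
NEGATIONS = {"not", "no", "never"}


def polarity(text):
    """Return the mean lexicon score of matched words, in [-1, 1].

    A negation word flips the sign of the word that immediately follows it.
    Texts with no lexicon words score 0.0.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = []
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        if tok in LEXICON:
            score = LEXICON[tok]
            scores.append(-score if negate else score)
        negate = False  # negation only reaches the next token
    return sum(scores) / len(scores) if scores else 0.0


def label(text, threshold=0.1):
    """Map a polarity score to a coarse sentiment label (threshold is an assumption)."""
    p = polarity(text)
    if p > threshold:
        return "positive"
    if p < -threshold:
        return "negative"
    return "neutral"
```

For example, `label("Great news, the vaccine is safe")` falls in the positive class, while a tweet with no lexicon words is labeled neutral.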
2	Understanding Sales Performance Using Natural Language Processing - An experimental study evaluating rule-based algorithms in a B2B setting
Smedberg, Angelica January 2023
Natural Language Processing (NLP) is a branch of data science that marries artificial intelligence with linguistics. Essentially, it tries to program computers to understand human language, both spoken and written. Over the past decade, researchers have applied novel algorithms to gain a better understanding of human sentiment. While this is no easy feat, considerable improvements have allowed organizations, politicians, governments, and other institutions to capture the attitudes and opinions of the public. It has been particularly useful for companies that want to check the pulse of a new product or see what the positive or negative sentiments are toward their services. NLP has even become useful in boosting sales performance and improving training. Over the years, there have been countless studies on sales performance, both from a psychological perspective, where characteristics of salespersons are explored, and from a data science/AI (Artificial Intelligence) perspective, where text is analyzed to forecast sales (Pai & Liu, 2018) and to coach sales agents using AI trainers (Luo et al., 2021). However, few studies have discussed how NLP models can help characterize sales performance using actual sales transcripts. Thus, there is a need to explore to what extent NLP models can inform B2B businesses of the characteristics embodied within their salesforce. This study aims to fill that gap in the literature.
Through a partnership with a medium-sized tech company based in California, USA, this study conducted an experiment to answer two questions. To what extent can we characterize sales performance based on real-life sales communication? And in what ways can conversational data inform the sales team at a California-based mid-sized tech company about how top performers communicate with customers? In total, over 5000 sentences containing over 110 000 words were collected and analyzed using two separate rule-based sentiment analysis techniques: TextBlob, developed by Steven Loria (2013), and Valence Aware Dictionary and sEntiment Reasoner (VADER), developed by CJ Hutto and Eric Gilbert (2014). A Naïve Bayes classifier was then trained and tested on the sentiment output of each of the two rule-based techniques. While both models obtained high accuracy, above 90%, it was concluded that an oversampled VADER approach yields the best results. Additionally, manual review of the output showed that VADER tends to classify positive and negative sentences more correctly than TextBlob, making it the better model for the dataset used.
NLP, Sentiment Analysis, Rule-based algorithms, TextBlob, VADER, Naïve Bayes, Machine Learning, Information Systems
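The pipeline described in the abstract above (rule-based labels, oversampling, then a Naïve Bayes classifier) can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the `oversample` helper, the `NaiveBayes` class, whitespace tokenization, and Laplace smoothing are all hypothetical choices, and the labels are assumed to come from a rule-based scorer such as TextBlob or VADER.

```python
import math
import random
from collections import Counter, defaultdict


def oversample(samples, seed=0):
    """Randomly duplicate minority-class samples until every class is equally large."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, lab in samples:
        by_label[lab].append((text, lab))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, samples):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        for text, lab in samples:
            self.class_counts[lab] += 1
            self.word_counts[lab].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_logp = None, -math.inf
        for lab, n in self.class_counts.items():
            logp = math.log(n / total)  # class prior
            denom = sum(self.word_counts[lab].values()) + len(self.vocab)
            for w in text.lower().split():
                if w in self.vocab:  # words unseen in training are ignored
                    logp += math.log((self.word_counts[lab][w] + 1) / denom)
            if logp > best_logp:
                best, best_logp = lab, logp
        return best


# Hypothetical sentences with labels assumed to come from a rule-based scorer.
train = [
    ("great call thanks", "pos"),
    ("happy to help", "pos"),
    ("great product demo", "pos"),
    ("bad delay sorry", "neg"),
]
balanced = oversample(train)
model = NaiveBayes().fit(balanced)
```

Oversampling is applied to the training data only; in practice the held-out test split would be left at its natural class balance so that accuracy figures are not inflated.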
3	Data Analysis of Discussions, Regarding Common Vulnerabilities and Exposures, and their Sentiment on Social Media / Dataanalys av diskussioner, gällande vanliga säkerhetssårbarheter och exponeringar, och deras sentiment på sociala medier
Rahmati, Mustafa, Grujicic, Danijel January 2022
As common vulnerabilities and exposures are detected, they are also discussed on various social platforms. The problem is that only a few of the posts made about them get enough attention. This leads to a lack of awareness of potential and critical threats against systems. It is therefore important to look for patterns that make certain vulnerabilities more or less discussed. To do so, a framework was built for collecting discussions from Reddit about cybersecurity and, more specifically, about vulnerabilities and exposures known as CVEs. In addition, some of the desired data was collected from Twitter. Thereafter, the sentiments of the collected posts were calculated to find patterns between popular subreddits and the attitudes shown in them. This was done with three methods: Flair, TextBlob and Vader. The results showed, for instance, that general discussions about information security were considered more positive than discussions of common vulnerabilities and exposures. Another result showed that the spread of CVEs that have a partial impact is higher on Reddit and is increasing almost exponentially. CVSS scores showed that a CVE with a CVSS score of around 7 is more likely to appear. Many CVEs on Reddit were also discussed both before and after they were disclosed. The implication of this work might be that more and more people use Reddit to discuss specific types of CVEs in a suitable subreddit, and become aware of common vulnerabilities and exposures, in order to prevent future threats.
Social media, Reddit, Twitter, sentiment analysis, computer science, information technology, CVE, information security, CVSS score, Flair, Vader, TextBlob, API, data collection, web scraper, data analysis, natural language processing, NLP, information retrieval, Computer and Information Sciences, Data- och informationsvetenskap
