1

Sentiment Analysis and Time-series Analysis for the COVID-19 vaccine Tweets

Sandaka, Gowtham Kumar, Gaekwade, Bala Namratha January 2021 (has links)
Background: The implicit nature of social media information brings many advantages to realistic sentiment-analysis applications. Sentiment analysis is the process of extracting opinions and emotions from data, and as a research topic, sentiment analysis of Twitter data has received much attention in recent years. In this study, we built a model that classifies the sentiments expressed in a dataset of public tweets in order to raise awareness of the public's concerns. Objectives: The main goal of this thesis is to develop a model that performs sentiment analysis on Twitter data about the COVID-19 vaccine, determines the polarity of each tweet, and shows the distribution of the sentiments across three classes: positive, negative, and neutral. A literature study and an experiment were set up to identify a suitable approach for developing such a model. A time-series analysis was also performed to obtain daily sentiments over the timeline and to relate daily trends to events on particular dates. Method: A systematic literature review was performed to identify the most suitable approach for sentiment analysis of COVID-19 vaccine tweets. Based on its results, an experimental model was developed that computes the distribution of sentiments in the analyzed data and the daily sentiments over the timeline. Results: The literature study identified VADER as the most suitable approach for the sentiment analysis. The KDE distribution was determined for each sentiment class obtained from the VADER sentiment analyzer, and daily sentiments over the timeline were generated for the trend analysis of the Twitter data on the COVID-19 vaccine. Conclusion: Through the analysis of the results, this research identifies the approach best suited for sentiment analysis of Twitter data with respect to the selected dataset.
The VADER model yields the best polarity scores for the sentiment analysis of the selected Twitter dataset. The time-series analysis shows how the daily sentiments and daily tweet counts fluctuate, the seasonal-decomposition results indicate how the world is reacting to the current COVID-19 situation, and the daily trend analysis elaborates on people's everyday sentiments.
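The VADER classification step described above can be sketched in miniature. The following is an illustrative stdlib-only mock of VADER's scoring convention (summed word valences normalized with alpha = 15 and cut at +/-0.05), not the thesis code or the real vaderSentiment library; the tiny lexicon and example sentences are invented.

```python
# Illustrative stdlib-only sketch (NOT the real vaderSentiment library).
# It mimics VADER's convention: sum word valences, normalise with alpha=15,
# then cut the compound score at +/-0.05. Lexicon and tweets are invented.
import math

LEXICON = {"love": 3.2, "great": 3.1, "good": 1.9, "safe": 1.5,
           "bad": -2.5, "terrible": -3.4, "scared": -2.2, "risky": -1.1}

def compound_score(text):
    # Sum the valence of each known word, ignoring punctuation and case.
    total = sum(LEXICON.get(w.strip(".,!?").lower(), 0.0)
                for w in text.split())
    # VADER-style normalisation into (-1, 1) with alpha = 15.
    return total / math.sqrt(total * total + 15)

def classify(text):
    c = compound_score(text)
    if c >= 0.05:       # VADER's conventional positive cutoff
        return "positive"
    if c <= -0.05:      # ... and negative cutoff
        return "negative"
    return "neutral"
```

Per-day averages of such compound scores are what a timeline analysis like the one above would then plot.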
2

Predicting Bitcoin price fluctuation with Twitter sentiment analysis / Förutspå Bitcoin prisändringar med hjälp av semantisk analys på Twitter data

Stenqvist, Evita, Lönnö, Jacob January 2017 (has links)
Programmatically deriving sentiment has been the topic of many a thesis: its applications range from analyzing 140-character tweets to 400-word Hemingway sentences, and its methods range from naive rule-based checks to deeply layered neural networks. Unsurprisingly, sentiment analysis has been used to gain useful insight across industries, most notably in digital marketing and financial analysis. Bitcoin, an advancement seemingly more exciting to the mainstream, has tripled in Google-search volume since the beginning of this year alone, not unlike its exchange rate. The decentralized cryptocurrency is arguably, by design, a pure free-market commodity, and as such public perception bears heavily on Bitcoin's monetary valuation. This thesis looks toward these public perceptions by analyzing 2.27 million Bitcoin-related tweets for sentiment fluctuations that could indicate a price change in the near future. A naive method attributes a rise or fall solely to the severity of the aggregated change in Twitter sentiment over periods ranging from 5 minutes to 4 hours, and then shifts these predictions forward in time by 1, 2, 3, or 4 periods to obtain the corresponding BTC interval. The evaluation of the prediction model showed that aggregating tweet sentiments over a 30-minute period with 4 shifts forward and a sentiment-change threshold of 2.2% yielded 79% accuracy.
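The naive prediction rule described above can be sketched as follows. This is a simplified reading of the thesis's method, not its actual code, and the function and variable names are invented for illustration.

```python
# Hedged sketch of the naive rule: aggregate tweet sentiment per 30-minute
# bucket and, when the bucket-to-bucket change exceeds a threshold (2.2% in
# the thesis), predict a BTC move `shift` buckets later (4 x 30 min = 2 h).
def naive_predictions(bucket_sentiments, threshold=0.022, shift=4):
    """bucket_sentiments: mean sentiment per 30-min bucket, in time order.
    Returns {future_bucket_index: 'up' | 'down'}."""
    preds = {}
    for i in range(1, len(bucket_sentiments)):
        prev, cur = bucket_sentiments[i - 1], bucket_sentiments[i]
        if prev == 0:
            continue  # relative change undefined for a zero baseline
        change = (cur - prev) / abs(prev)
        if change > threshold:
            preds[i + shift] = "up"    # sentiment jump -> price rise later
        elif change < -threshold:
            preds[i + shift] = "down"  # sentiment drop -> price fall later
    return preds
```

Accuracy would then be measured by comparing each prediction against the sign of the actual BTC price change in the indicated bucket.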
3

Sentiment Analysis of Twitter Data Using Machine Learning and Deep Learning Methods

Manda, Kundan Reddy January 2019 (has links)
Background: Twitter, Facebook, WordPress, and similar platforms act as major sources of information exchange in today's world. Tweets on Twitter mainly express public opinion on a product, event, or topic and thus constitute large volumes of unprocessed data. Synthesizing and analyzing these data is important but difficult due to the size of the dataset. Sentiment analysis is a suitable method for this task because it does not examine every tweet in detail but instead classifies the sentiment of each tweet as positive, negative, or neutral. Sentiment analysis is normally performed in one of three ways: a machine-learning-based approach, a sentiment-lexicon-based approach, or a hybrid approach. The machine-learning-based approach uses machine learning and deep learning algorithms to analyze the data, whereas the sentiment-lexicon-based approach uses lexicons containing vocabularies of positive and negative words. The hybrid approach combines machine learning and sentiment lexicons for classification. Objectives: The primary objectives of this research are to identify the algorithms and the metrics for evaluating the performance of machine learning classifiers, and to compare the metrics of the identified algorithms as the size of the dataset varies, in order to find the best-suited algorithm for sentiment analysis. Method: The method chosen to address the research questions is an experiment, in which the identified algorithms are evaluated with the selected metrics. Results: The identified machine learning algorithms are Naïve Bayes, Random Forest, and XGBoost, and the deep learning algorithm is a CNN-LSTM. The algorithms are evaluated and compared with respect to the metrics precision, accuracy, F1 score, and recall. The CNN-LSTM model is best suited for sentiment analysis on Twitter data with respect to the selected size of the dataset.
Conclusion: Through the analysis of the results, the aim of this research is achieved: identifying the best-suited algorithm for sentiment analysis on Twitter data with respect to the selected dataset. The CNN-LSTM model attains the highest accuracy, 88%, among the selected algorithms for the sentiment analysis of Twitter data on the selected dataset.
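The four evaluation metrics named above can be computed per class from true and predicted labels, as in this small self-contained sketch; the label values in it are invented for illustration.

```python
# Self-contained sketch of the four metrics used to compare the classifiers:
# accuracy over all labels, plus precision, recall, and F1 for one class.
def metrics(y_true, y_pred, positive):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred)
             if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p != positive)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For a three-class problem like positive/negative/neutral tweets, precision, recall, and F1 would be computed per class and then averaged.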
4

A Hyperlink and Sentiment Analysis of the 2016 Presidential Election: Intermedia Issue Agenda and Attribute Agenda Setting in Online Contexts

Joa, Youngnyo 02 August 2017 (has links)
No description available.
5

Pattern Exploration from Citizen Geospatial Data

Ke Liu (5930729) 17 January 2019 (has links)
Due to advances in location-acquisition techniques, citizen geospatial data has emerged as an opportunity for research, development, innovation, and business. A variety of research studies society and citizens by exploring patterns in geospatial data. In this thesis, we investigate patterns of population and human sentiment using GPS trajectory data and geo-tagged tweets. Kernel density estimation and emerging hot spot analysis are first used to show the population distribution across space and time. A flow extraction model based on density difference is then proposed for detecting and visualizing human movement. Case studies of a volleyball game in West Lafayette and of traffic in Puerto Rico verify the effectiveness of this method: flow maps can track clustering behaviors, and direction maps drawn from the orientation of the vectors can precisely identify the locations of events. This thesis also analyzes patterns of human sentiment. The polarity of each tweet is represented by a numeric value derived from linguistic rules, and the sentiments of four US college cities are analyzed according to their distribution across citizens, time, and space. The results suggest that social media can be used to understand patterns of public sentiment and well-being.
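The kernel density estimation step mentioned above can be illustrated with a plain 1-D Gaussian KDE. The thesis works with 2-D geospatial data (and presumably GIS tooling), so this is only a conceptual sketch with invented names.

```python
# Conceptual sketch of kernel density estimation: each sample contributes a
# Gaussian bump, and the density at x is the average of all bumps there.
import math

def gaussian_kde(samples, bandwidth):
    def density(x):
        n = len(samples)
        norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density
```

A 2-D version replaces the squared term with the squared Euclidean distance over both coordinates; the density-difference flow model above would then subtract two such surfaces computed at consecutive time steps.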
6

Forecast dengue fever cases using time series models with exogenous covariates: climate, effective reproduction number, and twitter data

Vieira, Julio Cesar de Azevedo 17 April 2018 (has links)
Dengue fever is an infectious disease affecting subtropical countries. Local health departments use the number of notified cases to monitor and predict epidemics. This work focuses on modeling the weekly incidence of dengue fever in four cities of the state of Rio de Janeiro: Rio de Janeiro, São Gonçalo, Campos dos Goytacazes, and Petrópolis. Time series models are often used to predict the number of cases in the next cycles (weeks or months); in particular, SARIMA (Seasonal Auto-Regressive Integrated Moving Average) models have been shown to perform well in distinct settings. Alternative models also include climate covariates to improve the quality of the forecasts. However, models that use only historical and climate data may not have sufficient information to capture the change from a non-epidemic to an epidemic regime. Two reasons are that there is a delay in the notification of cases and that there may not have been epidemics in the previous years. Based on the InfoDengue monitoring system, we argue that including the effective reproduction number of mosquitoes (RT) and the number of tweets referring to dengue may improve the quality of forecasts in the short (1 week) to long (8 weeks) range.
We show that time series models including RT and climate information often outperform SARIMA models in terms of root mean squared predictive error (RMSE). Inclusion of the Twitter variable did not improve the RMSE.
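The error measure used to compare the models, the RMSE of out-of-sample forecasts against observed case counts, can be sketched as follows; the helper name and the numbers in the test are mine, not the thesis code or dengue data.

```python
# Root mean squared error of h-step-ahead forecasts vs. observed counts,
# the yardstick used above to compare SARIMA against covariate models.
import math

def rmse(observed, predicted):
    assert len(observed) == len(predicted)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))
```

Computing this once per model over the same forecast horizon is what makes the SARIMA-vs-covariate comparison in the conclusion well defined.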
7

Application of Deep Learning in Intelligent Transportation Systems

Dabiri, Sina 01 February 2019 (has links)
The rapid growth of population and the permanent increase in the number of vehicles engender several issues in transportation systems, which in turn call for an intelligent and cost-effective approach to resolve the problems efficiently. A cost-effective way to improve and optimize transportation-related problems is to unlock the hidden knowledge in the ever-increasing spatiotemporal and crowdsourced information collected from sources such as mobile-phone sensors (e.g., GPS sensors) and social media networks (e.g., Twitter). Data mining and machine learning techniques are the major tools for analyzing the collected data and extracting useful knowledge about traffic conditions and mobility behaviors. Deep learning is an advanced branch of machine learning that has enjoyed a lot of success in computer vision and natural language processing in recent years, yet deep learning techniques have been applied to only a small number of transportation applications, such as traffic flow and speed prediction. Accordingly, my main objective in this dissertation is to develop state-of-the-art deep learning architectures for transport-related applications that have not yet been treated by deep learning in much detail: (1) travel mode detection, (2) vehicle classification, and (3) traffic information systems. To this end, an efficient representation for spatiotemporal and crowdsourced data (e.g., GPS trajectories) must also be designed so that it is not only compatible with deep learning architectures but also carries sufficient information for solving the task at hand. Furthermore, since the good performance of a deep learning algorithm is primarily contingent on access to a large volume of training samples, efficient data collection and labeling strategies are developed for the different data types and applications.
Finally, the performance of the proposed representations and models is evaluated by comparison with several state-of-the-art techniques from the literature. The experimental results clearly and consistently demonstrate the superiority of the proposed deep-learning-based framework for each application. / PHD / The rapid growth of population and the permanent increase in the number of vehicles engender several issues in transportation systems, which in turn call for an intelligent and cost-effective approach to resolve the problems efficiently. Furthermore, recent advances in positioning tools (e.g., GPS sensors) and the ever-growing popularity of social media networks have enabled the generation of massive spatiotemporal and crowdsourced data. This dissertation leverages advances in artificial intelligence to unlock the rich knowledge in the recorded data and, in turn, optimize transportation systems in a cost-effective way. In particular, it proposes end-to-end frameworks based on deep learning models, an advanced branch of artificial intelligence, together with spatiotemporal and crowdsourced datasets (e.g., GPS trajectories and social media) for improving three transportation problems. (1) Travel mode detection: identifying users' transportation modes (e.g., walk, bike, bus, car, and train) when traveling around the traffic network. (2) Vehicle classification: identifying a vehicle's type (e.g., passenger car or truck) while it moves in a traffic network. (3) A traffic information system based on social media networks: detecting traffic events (e.g., crashes) and capturing traffic information (e.g., traffic congestion) in real time from users' tweets. The experimental results clearly and consistently demonstrate the superiority of the proposed deep-learning-based framework for each application.
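The trajectory representation discussed above, turning raw GPS points into kinematic channels a deep network can consume, can be sketched as follows. The haversine formula is standard; the function names and sample values are invented, and this is not the author's actual pipeline.

```python
# Hedged sketch: derive per-segment speed and acceleration channels from a
# raw GPS trajectory, the kind of input representation a CNN could consume.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def kinematic_channels(points):
    """points: [(lat, lon, unix_ts), ...] in time order.
    Returns (speeds in m/s, accelerations in m/s^2)."""
    speeds, stamps = [], []
    for (la1, lo1, t1), (la2, lo2, t2) in zip(points, points[1:]):
        dt = max(t2 - t1, 1e-9)  # guard against duplicate timestamps
        speeds.append(haversine_m(la1, lo1, la2, lo2) / dt)
        stamps.append(t2)
    accels = [(speeds[i + 1] - speeds[i])
              / max(stamps[i + 1] - stamps[i], 1e-9)
              for i in range(len(speeds) - 1)]
    return speeds, accels
```

Stacking such channels over fixed-length trajectory segments yields the tensor-like input that travel-mode classifiers typically train on.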
8

AUTOMATED OPTIMAL FORECASTING OF UNIVARIATE MONITORING PROCESSES : Employing a novel optimal forecast methodology to define four classes of forecast approaches and testing them on real-life monitoring processes

Razroev, Stanislav January 2019 (has links)
This work explores practical one-step-ahead forecasting of structurally changing data, the unstable behaviour that real-life data connected to human activity often exhibit. This setting can be characterized as a monitoring process. Forecast models, methods, and approaches range from simple and computationally "cheap" to very sophisticated and computationally "expensive". Moreover, different forecast methods handle different data patterns and structural changes differently: for particular data types or data intervals, some forecast methods are better than others, and which ones is usually not known beforehand. This raises a question: "Can one design a forecast procedure that effectively and optimally switches between various forecast methods, adapting their usage to changes in the incoming data flow?" The thesis answers this question by introducing an optimality concept that allows optimal switching between simultaneously executed forecast methods, thus "tailoring" the forecast methods to the changes in the data. It is also shown how another forecast approach, combinational forecasting, in which forecast methods are combined using a weighted average, can be utilized by the optimality principle and can therefore benefit from it. Thus, four classes of forecast results can be considered and compared: basic forecast methods, basic optimality, combinational forecasting, and combinational optimality. The thesis shows that most of the time the optimality results are no worse than, or better than, the best of the forecast methods the optimality is based on. Optimality also reduces the scatter of the multitude of forecast suggestions to a single number or only a few numbers, in a controllable fashion. Optimality additionally gives a lower bound for optimal forecasting: the hypothetically best achievable forecast result.
The main conclusion is that the optimality approach makes the traditional way of treating monitoring processes, searching for the single best forecast method for some structurally changing data, more or less obsolete. This search can of course still be pursued, but it is best done within the optimality approach as one of its innate components. All this makes the proposed optimality approach to forecasting a valid representative of the broader ensemble approach, which likewise motivated the development of the now-popular ensemble-learning concept within the machine-learning framework.
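The switching idea can be sketched as follows: run several forecast methods in parallel and, at each step, emit the forecast of the method with the smallest accumulated error so far. This is a simplified illustration with invented toy methods; the thesis's actual optimality criterion may differ.

```python
# Hedged sketch of method switching: track each method's running absolute
# error and, at every step, trust the method that has erred least so far.
def switching_forecast(series, methods, warmup=1):
    """methods: list of functions mapping a history list -> one-step forecast.
    Returns the switched one-step-ahead forecasts for series[warmup:]."""
    errors = [0.0] * len(methods)  # running absolute error per method
    out = []
    for t in range(warmup, len(series)):
        history = series[:t]
        candidates = [m(history) for m in methods]
        best = min(range(len(methods)), key=lambda i: errors[i])
        out.append(candidates[best])
        # After the true value arrives, update every method's error tally.
        for i, c in enumerate(candidates):
            errors[i] += abs(series[t] - c)
    return out

naive_last = lambda h: h[-1]          # persistence forecast
mean_all = lambda h: sum(h) / len(h)  # global-mean forecast
```

Combinational forecasting would instead weight the candidates by (for example) inverse running error; applying the same selection principle on top of such combinations gives the "combinational optimality" class compared above.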
