Stock Price Movement Prediction Using Sentiment Analysis and Machine Learning

Stock price prediction is of strong interest to both researchers and investors, but it remains a challenging task. Recently, sentiment analysis and machine learning have been adopted for stock price movement prediction. In particular, retail investors’ sentiment on online forums has shown its power to influence the stock market. In this paper, a novel system was built to predict stock price movement for the following trading day. The system includes a web scraper, an enhanced sentiment analyzer, a machine learning engine, an evaluation module, and a recommendation module. Based on the acquired data and the models’ performance, the system automatically selects the best prediction model from four state-of-the-art machine learning models: Long Short-Term Memory, Support Vector Machine, Random Forest, and Extreme Gradient Boosting (XGBoost). Moreover, stock market lexicons were created using natural language processing and large-scale text mining of the Yahoo Finance Conversation boards. Experiments on the top 30 stocks on Yahoo users’ watchlists and a randomly selected NASDAQ stock were performed to examine the performance of the system and the proposed methods. The experimental results show that incorporating sentiment analysis can improve prediction for stocks with a large daily discussion volume. The Long Short-Term Memory model outperformed the other machine learning models when using both price and sentiment analysis as inputs. In addition, the XGBoost model achieved the highest accuracy using the price-only feature on low-volume stocks. Finally, the models using the enhanced sentiment analyzer outperformed those using the VADER sentiment analyzer by 1.96%.
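
The automatic model selection described in the abstract can be illustrated with a minimal sketch. It assumes that "performance" means accuracy on held-out validation data, which the abstract does not specify. The three rule-based predictors, the feature layout (previous-day return, mean sentiment score), and the toy data are hypothetical stand-ins for the four trained models and the scraped data; they are not the thesis's implementation.

```python
from typing import Callable, Mapping, Sequence, Tuple

# A predictor maps one feature row to a label: 1 = price up next day, 0 = down.
Predictor = Callable[[Sequence[float]], int]


def accuracy(model: Predictor,
             rows: Sequence[Sequence[float]],
             labels: Sequence[int]) -> float:
    """Fraction of validation rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(rows, labels) if model(x) == y)
    return correct / len(labels)


def select_best_model(candidates: Mapping[str, Predictor],
                      rows: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> Tuple[str, float]:
    """Score every candidate on the same validation set and return the best one."""
    scores = {name: accuracy(model, rows, labels)
              for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical stand-ins for trained models (assumption: not from the thesis).
def momentum(x: Sequence[float]) -> int:
    return 1 if x[0] > 0 else 0   # follow the previous day's return


def sentiment_rule(x: Sequence[float]) -> int:
    return 1 if x[1] > 0 else 0   # follow the mean sentiment score


def always_up(x: Sequence[float]) -> int:
    return 1                      # naive baseline


# Toy validation data: (previous-day return, mean sentiment score).
val_rows = [(0.01, 0.3), (-0.02, 0.5), (0.03, -0.4), (-0.01, -0.2)]
val_labels = [1, 1, 0, 0]

candidates = {
    "momentum": momentum,
    "sentiment_rule": sentiment_rule,
    "always_up": always_up,
}
best_name, best_acc = select_best_model(candidates, val_rows, val_labels)
print(best_name, best_acc)
```

In a fuller version, each candidate would be a model trained per stock, and the selection would be repeated as new price and sentiment data are acquired, so a different model could be chosen for high-volume and low-volume stocks.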

Identifier: oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-3897
Date: 01 June 2021
Creators: Wang, Jenny Zheng
Publisher: DigitalCommons@CalPoly
Source Sets: California Polytechnic State University
Detected Language: English
Type: text
Format: application/pdf
Source: Master's Theses
