Global ETD Search

Return to search

Deep reinforcement learning approach to portfolio management / Deep reinforcement learning metod för portföljförvaltning

This thesis evaluates the use of a Deep Reinforcement Learning (DRL) approach to portfolio management on the Swedish stock market. The idea is to construct a portfolio that is adjusted daily using the DRL algorithm Proximal policy optimization (PPO) with a multi perceptron neural network. The input to the neural network was historical data in the form of open, high, and low price data. The portfolio is evaluated by its performance against the OMX Stockholm 30 index (OMXS30). Furthermore, three different approaches for optimization are going to be studied, in that three different reward functions are going to be used. These functions are Sharp ratio, cumulative reward (Daily return) and Value at risk reward (which is a daily return with a value at risk penalty). The historival data that is going to be used is from the period 2010-01-01 to 2015-12-31 and the DRL approach is then tested on two different time periods which represents different marked conditions, 2016-01-01 to 2018-12-31 and 2019-01-01 to 2021-12-31. The results show that in the first test period all three methods (corresponding to the three different reward functions) outperform the OMXS30 benchmark in returns and sharp ratio, while in the second test period none of the methods outperform the OMXS30 index. / Målet med det här arbetet var att utvärdera användningen av "Deep reinforcement learning" (DRL) metod för portföljförvaltning på den svenska aktiemarknaden. Idén är att konstruera en portfölj som justeras dagligen med hjälp av DRL algoritmen "Proximal policy optimization" (PPO) med ett neuralt nätverk med flera perceptroner. Inmatningen till det neurala nätverket var historiska data i form av öppnings, lägsta och högsta priser. Portföljen utvärderades utifrån dess prestation mot OMX Stockholm 30 index (OMXS30). Dessutom studerades tre olika tillvägagångssätt för optimering, genom att använda tre olika belöningsfunktioner. Dessa funktioner var Sharp ratio, kumulativ belöning (Daglig avkastning) och Value at risk-belöning (som är en daglig avkastning minus Value at risk-belöning). Den historiska data som användes var från perioden 2010-01-01 till 2015-12-31 och DRL-metoden testades sedan på två olika tidsperioder som representerar olika marknadsförhållanden, 2016-01-01 till 2018-12-31 och 2019-01-01 till 2021-12-31. Resultatet visar att i den första testperioden så överträffade alla tre metoder (vilket motsvarar de tre olika belöningsfunktionerna) OMXS30 indexet i avkastning och sharp ratio, medan i den andra testperioden så överträffade ingen av metoderna OMXS30 indexet.

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-345005

Deep reinforcement learning

Proximal policy optimization

reward function

portfolio management

Deep reinforcement learning

Proximal policy optimization

Identifer	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-345005
Date	January 2023
Creators	Jama, Fuaad
Publisher	KTH, Matematik (Avd.)
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess

Page generated in 0.0022 seconds

Deep reinforcement learning approach to portfolio management / Deep reinforcement learning metod för portföljförvaltning

Description

Links & Downloads

Tags

Additional Fields