Global ETD Search

1	A Comparative Study of Reinforcement-based and Semi-classical Learning in Sensor Fusion Bodén, Johan January 2021 (has links) Reinforcement learning has proven itself very useful in certain areas, such as games. However, the approach has been seen as quite limited. Reinforcement-based learning has for instance not been commonly used for classification tasks as it is receiving feedback on how well it did for an action performed on a specific input. This slows the performance convergence rate as compared to other classification approaches which has the input and the corresponding output to train on. Nevertheless, this thesis aims to investigate whether reinforcement-based learning could successfully be employed on a classification task. Moreover, as sensor fusion is an expanding field which can for instance assist autonomous vehicles in understanding its surroundings, it is also interesting to see how sensor fusion, i.e., fusion between lidar and RGB images, could increase the performance in a classification task. In this thesis, a reinforcement-based learning approach is compared to a semi-classical approach. As an example of a reinforcement learning model, a deep Q-learning network was chosen, and a support vector machine classifier built on top of a deep neural network, was chosen as an example of a semi-classical model. In this work, these frameworks are compared with and without sensor fusion to see whether fusion improves their performance. Experiments show that the evaluated reinforcement-based learning approach underperforms in terms of metrics but mainly due to its slow learning process, in comparison to the semi-classical approach. However, on the other hand using reinforcement-based learning to carry out a classification task could still in some cases be advantageous, as it still performs fairly well in terms of the metrics presented in this work, e.g. F1-score, or for instance imbalanced datasets. As for the impact of sensor fusion, a notable improvement can be seen, e.g. when training the deep Q-learning model for 50 episodes, the F1-score increased with 0.1329; especially, when taking into account that the most of the lidar data used in the fusion is lost since this work projects the 3D lidar data onto the same 2D plane as the RGB images. machine learning reinforcement learning deep Q-learning network classical learning supervised learning support vector machine deep neural network sensor fusion Computer and Information Sciences Data- och informationsvetenskap
2	Deep Reinforcement Learning in Cart Pole and Pong Kuurne Uussilta, Dennis, Olsson, Viktor January 2020 (has links) In this project, we aim to reproduce previous resultsachieved with Deep Reinforcement Learning. We present theMarkov Decision Process model as well as the algorithms Q-learning and Deep Q-learning Network (DQN). We implement aDQN agent, first in an environment called CartPole, and later inthe game Pong.Our agent was able to solve the CartPole environment in lessthan 300 episodes. We assess the impact some of the parametershad on the agents performance. The performance of the agentis particularly sensitive to the learning rate and seeminglyproportional to the dimension of the neural network. The DQNagent implemented in Pong was unable to learn, performing atthe same level as an agent picking actions at random, despiteintroducing various modifications to the algorithm. We discusspossible sources of error, including the RAM used as input,possibly not containing sufficient information. Furthermore, wediscuss the possibility of needing additional modifications to thealgorithm in order to achieve convergence, as it is not guaranteedfor DQN. / Målet med detta projekt är att reproducera tidigare resultat som uppnåtts med Deep Reinforcement Learning. Vi presenterar Markov Decision Process-modellen samt algoritmerna Q-learning och Deep Q-learning Network (DQN). Vi implementerar en DQN agent, först i miljön CartPole, sedan i spelet Pong. Vår agent lyckades lösa CartPole på mindre än 300 episoder. Vi gör en bedömning av vissa parametrars påverkan på agentens prestanda. Agentens prestanda är särskilt känslig för värdet på ”learning rate” och verkar vara proportionell mot dimensionen av det neurala nätverket. DQN-agenten som implementerades i Pong var oförmögen att lära sig och spelade på samma nivå som en agent som agerar slumpmässigt, trots att vi introducerade diverse modifikationer. Vi diskuterar möjliga felkällor, bland annat att RAM, som används som indata till agenten, eventuellt saknar tillräcklig information. Dessutom diskuterar vi att ytterligare modifikationer kan vara nödvändiga för uppnå konvergens eftersom detta inte är garanterat för DQN. / Kandidatexjobb i elektroteknik 2020, KTH, Stockholm Artificial Intelligence Machine Learning Rein-forcement Learning Deep Q-learning Network CartPole Pong Elektroteknik och elektronik

1

Page generated in 0.0923 seconds