• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Performance Evaluation of Imitation Learning Algorithms with Human Experts

Båvenstrand, Erik, Berggren, Jakob January 2019 (has links)
The purpose of this thesis was to compare the performance of three different imitation learning algorithms with human experts, with limited expert time. The central question was, ”How should one implement imitation learning in a simulated car racing environment, using human experts, to achieve the best performance when access to the experts is limited?”. We limited the work to only consider the three algorithms Behavior Cloning, DAGGER, and HG-DAGGER and limited the implementation to the car racing simulator TORCS. The agents consisted of the same type of feedforward neural network that utilized sensor data provided by TORCS. Through comparison in the performance of the different algorithms on a different amount of expert time, we can conclude that HGDAGGER performed the best. In this case, performance is regarded as a distance covered given set time. Its performance also seemed to scale well with more expert time, which the others did not. This result confirmed previously published results when comparing these algorithms. / Målet med detta examensarbete var att jämföra prestandan av tre olika algoritmer inom området imitationinlärning med mänskliga experter, där experttiden är begränsad. Arbetets frågeställning var, ”Hur ska man implementera imitationsinlärning i en bilsimulator, för att få bäst prestanda, med mänskliga experter där experttiden är begränsad?”. Vi begränsade arbetet till att endast omfatta de tre algoritmerna, Behavior Cloning, DAGGER och HG-DAGGER, och begränsade implementationsmiljön till bilsimulatorn TORCS. Alla agenterna bestod av samma sorts feedforward neuralt nätverk som använde sig av sensordata från TROCS. Genom jämförelse i prestanda på olika mängder experttid kan vi dra slutsatsen att HG-DAGGER gav bäst resultat. I detta fall motsvarar prestanda körsträcka, givet en viss tid. Dess prestanda verkar även utvecklas väl med ytterligare experttid, vilket de övriga inte gjorde. Detta resultat bekräftar tidigare publicerade resultat om jämförelse av de tre olika algoritmerna.

Page generated in 0.0274 seconds