A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. 14 August 2014.
Reinforcement learning is a branch of machine learning wherein an agent is given rewards which it uses
to guide its learning. These rewards are often manually specified in terms of a reward function. The
agent performs a certain action in a particular state and is given a reward accordingly (where a state is
a configuration of the environment). A problem arises when the reward function is either difficult or
impossible to manually specify. Apprenticeship learning via inverse reinforcement learning can be used
in these cases in order to ascertain a reward function, given a set of expert trajectories. The research
presented in this document used apprenticeship learning in order to ascertain a reward function in a
musical context. The agent then optimized its performance in terms of this reward function. This was
accomplished by presenting the learning agents with pieces of music composed by the author. These
were the expert trajectories from which each learning agent discovered a reward function. This reward
function allowed the agent to attempt to discover an optimal strategy for maximizing the reward it receives.
Three learning agents were created. Two were drum-beat-generating agents and one was a melody-composing
agent. The first two agents were used to recreate expert drum-beats as well as generate new
drum-beats. The melody agent was used to generate new melodies given a set of expert melodies. The
results show that apprenticeship learning can be used both to recreate expert musical pieces and to
generate new musical pieces which are similar to the expert musical pieces. Further, the results using
the melody agent indicate that the agent has learned to generate new melodies in a given key, without
having been given explicit information about key signatures.
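As a rough illustration of how a reward function can be ascertained from expert trajectories, the sketch below computes discounted feature expectations, the quantity at the centre of the standard formulation of apprenticeship learning via inverse reinforcement learning. It is not the dissertation's implementation. The drum-beat encoding (one 0/1 flag per drum at each step), the baseline trajectory, the discount factor, and all function names are assumptions made for this example, as is the choice of a reward that is linear in the features.

```python
from typing import List, Sequence

# Hypothetical encoding: a state is one time step of a drum pattern,
# with a 0/1 flag for each of (kick, snare, hi-hat).
State = Sequence[int]
Trajectory = Sequence[State]


def feature_expectations(trajectories: Sequence[Trajectory],
                         gamma: float = 0.5) -> List[float]:
    """Average discounted sum of state features over a set of trajectories."""
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for trajectory in trajectories:
        for t, state in enumerate(trajectory):
            for i, feature in enumerate(state):
                mu[i] += (gamma ** t) * feature
    return [m / len(trajectories) for m in mu]


def reward(weights: Sequence[float], state: State) -> float:
    """Linear reward: dot product of the weights with the state features."""
    return sum(w * f for w, f in zip(weights, state))


# Two made-up "expert" drum-beats and one made-up baseline beat.
expert = [
    [(1, 0, 1), (0, 0, 1), (0, 1, 1), (0, 0, 1)],
    [(1, 0, 1), (0, 0, 1), (0, 1, 1), (1, 0, 1)],
]
baseline = [
    [(0, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)],
]

mu_expert = feature_expectations(expert)
mu_baseline = feature_expectations(baseline)

# First-iteration weight estimate: the direction that separates the
# expert's feature expectations from the baseline's.
weights = [e - b for e, b in zip(mu_expert, mu_baseline)]
```

In a full apprenticeship learning loop, an agent would be trained to maximize this estimated reward, its own feature expectations would be computed, and the weights would be re-estimated until the agent's feature expectations are close to the expert's. The sketch covers only the first weight estimate.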
Identifier | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:wits/oai:wiredspace.wits.ac.za:10539/16861 |
Date | 04 February 2015 |
Creators | Messer, Orry |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |