Return to search

Model Checked Reinforcement Learning For Multi-Agent Planning

Autonomous systems, or agents as they sometimes are called can be anything from drones, self-driving cars, or autonomous construction equipment. The systems are often given tasks of accomplishing missions in a group or more. This may require that they can work within the same area without colliding or disturbing other agents' tasks. There are several tools for planning and designing such systems, one of them being UPPAAL STRATEGO. Multi-agent planning (MAP) is about planning actions in optimal ways such that the agents can accomplish their mission efficiently. A method of doing this named MCRL, utilizes Q learning as the algorithm for  finding an optimal plan. These plans then need to be verified to ensure that they can accomplish what a user intended within the allowed time, something that UPPAAL STRATEGO can do. This is because a Q-learning algorithm does not have a correctness guarantee. Using this method alleviates the state-explosion problem that exists with an increasing number of agents. Using UPPAAL STRATEGO it is also possible to acquire the best and worst-case execution time (BCET and WCET) and their corresponding traces. This thesis aims to obtain the BCET and WCET and their corresponding traces in the model.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:mdh-64359
Date January 2023
CreatorsWetterholm, Erik
PublisherMälardalens universitet, Akademin för innovation, design och teknik
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0015 seconds