Return to search

Learning to Search for Targets : A Deep Reinforcement Learning Approach to Visual Search in Unseen Environments / Inlärd sökning efter mål

Visual search is the perceptual task of locating a target in a visual environment. Due to applications in areas like search and rescue, surveillance, and home assistance, it is of great interest to automate visual search. An autonomous system can potentially search more efficiently than a manually controlled one and has the advantages of reduced risk and cost of labor. In many environments, there is structure that can be utilized to find targets quicker. However, manually designing search algorithms that properly utilize structure to search efficiently is not trivial. Different environments may exhibit vastly different characteristics, and visual cues may be difficult to pick up. A learning system has the advantage of being applicable to any environment where there is a sufficient number of samples to learn from. In this thesis, we investigate how an agent that learns to search can be implemented with deep reinforcement learning. Our approach jointly learns control of visual attention, recognition, and localization from a set of sample search scenarios. A recurrent convolutional neural network takes an image of the visible region and the agent's position as input. Its outputs indicate whether a target is visible and control where the agent looks next. The recurrent step serves as a memory that lets the agent utilize features of the explored environment when searching. We compare two memory architectures: an LSTM, and a spatial memory that remembers structured visual information. Through experimentation in three simulated environments, we find that the spatial memory architecture achieves superior search performance. It also searches more efficiently than a set of baselines that do not utilize the appearance of the environment and achieves similar performance to that of a human searcher. Finally, the spatial memory scales to larger search spaces and is better at generalizing from a limited number of training samples.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-186537
Date January 2022
CreatorsLundin, Oskar
PublisherLinköpings universitet, Statistik och maskininlärning, FOI
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0018 seconds