Global ETD Search

Return to search

Learning to Search for Targets : A Deep Reinforcement Learning Approach to Visual Search in Unseen Environments / Inlärd sökning efter mål

Visual search is the perceptual task of locating a target in a visual environment. Due to applications in areas like search and rescue, surveillance, and home assistance, it is of great interest to automate visual search. An autonomous system can potentially search more efficiently than a manually controlled one and has the advantages of reduced risk and cost of labor. In many environments, there is structure that can be utilized to find targets quicker. However, manually designing search algorithms that properly utilize structure to search efficiently is not trivial. Different environments may exhibit vastly different characteristics, and visual cues may be difficult to pick up. A learning system has the advantage of being applicable to any environment where there is a sufficient number of samples to learn from. In this thesis, we investigate how an agent that learns to search can be implemented with deep reinforcement learning. Our approach jointly learns control of visual attention, recognition, and localization from a set of sample search scenarios. A recurrent convolutional neural network takes an image of the visible region and the agent's position as input. Its outputs indicate whether a target is visible and control where the agent looks next. The recurrent step serves as a memory that lets the agent utilize features of the explored environment when searching. We compare two memory architectures: an LSTM, and a spatial memory that remembers structured visual information. Through experimentation in three simulated environments, we find that the spatial memory architecture achieves superior search performance. It also searches more efficiently than a set of baselines that do not utilize the appearance of the environment and achieves similar performance to that of a human searcher. Finally, the spatial memory scales to larger search spaces and is better at generalizing from a limited number of training samples.

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-186537

visual search

reinforcement learning

förstärkningsinlärning

djupinlärning

datorseende

autonoma system

Identifer	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-186537
Date	January 2022
Creators	Lundin, Oskar
Publisher	Linköpings universitet, Statistik och maskininlärning, FOI
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess

Page generated in 0.0018 seconds

Learning to Search for Targets : A Deep Reinforcement Learning Approach to Visual Search in Unseen Environments / Inlärd sökning efter mål

Description

Links & Downloads

Tags

Additional Fields