Return to search

Deep Learning for Point Detection in Images

The main result of this thesis is a deep learning model named BearNet, which can be trained to detect an arbitrary amount of objects as a set of points. The model is trained using the Weighted Hausdorff distance as loss function. BearNet has been applied and tested on two problems from the industry. These are: From an intensity image, detect two pocket points of an EU-pallet which an autonomous forklift could utilize when determining where to insert its forks. From a depth image, detect the start, bend and end points of a straw attached to a juice package, in order to help determine if the straw has been attached correctly. In the development process of BearNet I took inspiration from the designs of U-Net, UNet++ and a high resolution network named HRNet. Further, I used a dataset containing RGB-images from a surveillance camera located inside a mall, on which the aim was to detect head positions of all pedestrians. In an attempt to reproduce a result from another study, I found that the mall dataset suffers from training set contamination when a model is trained, validated, and tested on it with random sampling. Hence, I propose that the mall dataset is evaluated with a sequential data split strategy, to limit the problem. I found that the BearNet architecture is well suited for both the EU-pallet and straw datasets, and that it can be successfully used on either RGB,  intensity or depth images. On the EU-pallet and straw datasets, BearNet consistently produces point estimates within five and six pixels of ground truth, respectively. I also show that the straw dataset only constitutes a small subset of all the challenges that exist in the problem domain related to the attachment of a straw to a juice package, and that one therefore cannot train a robust deep learning model on it. As an example of this, models trained on the straw dataset cannot correctly handle samples in which there is no straw visible.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-166644
Date January 2020
CreatorsRunow, Björn
PublisherLinköpings universitet, Datorseende
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0122 seconds