The main result of this thesis is a deep learning model named BearNet, which can be trained to detect an arbitrary amount of objects as a set of points. The model is trained using the Weighted Hausdorff distance as loss function. BearNet has been applied and tested on two problems from the industry. These are: From an intensity image, detect two pocket points of an EU-pallet which an autonomous forklift could utilize when determining where to insert its forks. From a depth image, detect the start, bend and end points of a straw attached to a juice package, in order to help determine if the straw has been attached correctly. In the development process of BearNet I took inspiration from the designs of U-Net, UNet++ and a high resolution network named HRNet. Further, I used a dataset containing RGB-images from a surveillance camera located inside a mall, on which the aim was to detect head positions of all pedestrians. In an attempt to reproduce a result from another study, I found that the mall dataset suffers from training set contamination when a model is trained, validated, and tested on it with random sampling. Hence, I propose that the mall dataset is evaluated with a sequential data split strategy, to limit the problem. I found that the BearNet architecture is well suited for both the EU-pallet and straw datasets, and that it can be successfully used on either RGB, intensity or depth images. On the EU-pallet and straw datasets, BearNet consistently produces point estimates within five and six pixels of ground truth, respectively. I also show that the straw dataset only constitutes a small subset of all the challenges that exist in the problem domain related to the attachment of a straw to a juice package, and that one therefore cannot train a robust deep learning model on it. As an example of this, models trained on the straw dataset cannot correctly handle samples in which there is no straw visible.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-166644 |
Date | January 2020 |
Creators | Runow, Björn |
Publisher | Linköpings universitet, Datorseende |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0122 seconds