Instance Segmentation on depth images using Swin Transformer for improved accuracy on indoor images / Instans-segmentering på bilder med djupinformation för förbättrad prestanda på inomhusbilder

Simultaneous Localisation and Mapping (SLAM) is a fundamental open problem in autonomous mobile robotics. One of the most actively researched recent techniques for enhancing SLAM methods is instance segmentation, which simultaneously solves the problems of object detection and semantic segmentation. In this thesis, we implement an instance segmentation system using Swin Transformer combined with two state-of-the-art instance segmentation methods, namely Cascade Mask RCNN and Mask RCNN. We show that depth information improves the average precision (AP) by approximately 7%. We also show that the Swin Transformer backbone model can work well with depth images, and that Cascade Mask RCNN outperforms Mask RCNN. However, these results should be interpreted with caution because of the small size of the NYU-Depth V2 dataset. Most instance segmentation research uses the COCO dataset, which has about a hundred times more images than NYU-Depth V2 but does not include depth information.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-184179
Date January 2022
Creators Hagberg, Alfred; Musse, Mustaf Abdullahi
Publisher Linköpings universitet, Artificiell intelligens och integrerade datorsystem
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
Relation arXiv.org
