Depth-Aware Deep Learning Networks for Object Detection and Image Segmentation

The rise of convolutional neural networks (CNNs) in the context of computer vision
has occurred in tandem with the advancement of depth sensing technology.
Depth cameras produce two-dimensional arrays that store, at each pixel,
the distance from the sensor to the objects and surfaces in a scene. Aligning such
a depth map with a regular color image yields a so-called RGBD image. Inspired by prior models
in the literature, this work develops a suite of RGBD CNN models to tackle
the challenging tasks of object detection, instance segmentation, and semantic
segmentation. Prominent architectures for object detection and image segmentation
are modified to use dual backbones that take RGB and
depth images as input, combining features from both modalities through novel
fusion modules. For each task, the models developed are competitive with state-of-the-art RGBD architectures. In particular, the proposed RGBD object detection
approach achieves 53.5% mAP on the SUN RGBD 19-class object detection
benchmark, while the proposed RGBD semantic segmentation architecture yields
69.4% accuracy on the SUN RGBD 37-class semantic segmentation
benchmark. An original 13-class RGBD instance segmentation benchmark is introduced for the SUN RGBD dataset, for which the proposed model achieves 38.4%
mAP. Additionally, an original depth-aware panoptic segmentation model is developed, trained, and tested on new benchmarks created for the NYUDv2 and
SUN RGBD datasets. These benchmarks offer researchers a baseline for the task
of RGBD panoptic segmentation on these datasets, where the novel depth-aware
model outperforms a comparable RGB counterpart.

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/42619
Date: 01 September 2021
Creators: Dickens, James
Contributors: Payeur, Pierre
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d'Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf
Rights: Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
