Depth Estimation Using Adaptive Bins via Global Attention at High Resolution

We address the problem of estimating a high-quality dense depth map from a
single RGB input image. We start out with a baseline encoder-decoder convolutional
neural network architecture and pose the question of how the global processing of
information can help improve overall depth estimation. To this end, we propose a
transformer-based architecture block that divides the depth range into bins whose
center values are estimated adaptively per image. The final depth values are estimated
as linear combinations of the bin centers. We call our new building block AdaBins.
Our results show a decisive improvement over the state-of-the-art on several popular
depth datasets across all metrics. We also validate the effectiveness of the proposed
block with an ablation study.
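
The step of computing depth as a linear combination of bin centers can be sketched as follows. This is a minimal illustration, not the thesis implementation. It assumes the network outputs one set of positive bin widths per image and one score per bin for each pixel, and that the scores are turned into weights with a softmax. The function names (bin_centers, softmax, pixel_depth) and the depth range limits d_min and d_max are hypothetical.

```python
import math

def bin_centers(widths, d_min, d_max):
    """Convert positive bin widths into bin centers that tile [d_min, d_max].

    The widths are normalized to sum to 1 so the bins always cover the
    full depth range, whatever their raw scale (an assumption of this sketch).
    """
    total = sum(widths)
    centers = []
    left = d_min  # left edge of the current bin
    for w in widths:
        span = (d_max - d_min) * w / total
        centers.append(left + span / 2.0)
        left += span
    return centers

def softmax(scores):
    """Numerically stable softmax over a list of per-bin scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def pixel_depth(scores, centers):
    """Depth of one pixel: probability-weighted sum of the bin centers."""
    probs = softmax(scores)
    return sum(p * c for p, c in zip(probs, centers))

# Three bins over a 0 to 8 depth range; the last bin is twice as wide.
centers = bin_centers([1.0, 1.0, 2.0], d_min=0.0, d_max=8.0)
depth = pixel_depth([0.0, 0.0, 0.0], centers)  # equal weight on every bin
```

Because the weights are non-negative and sum to 1, the predicted depth always lies between the smallest and largest bin center, and it varies smoothly with the scores instead of snapping to a single bin.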

Monocular Depth Estimation

3D reconstruction

Transformers

3D scene understanding

adaptive binning

Convolutional Neural Networks

Identifier	oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/668894
Date	21 April 2021
Creators	Bhat, Shariq
Contributors	Wonka, Peter; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Hadwiger, Markus; Ghanem, Bernard
Source Sets	King Abdullah University of Science and Technology
Language	English
Detected Language	English
Type	Thesis
