
CountNet3D: A 3D Computer Vision Approach to Infer Counts of Occluded Objects with Quantified Uncertainty

3D scene understanding is an important problem that has experienced great progress in recent years, in large part due to the development of state-of-the-art methods for 3D object detection. However, the performance of 3D object detectors can suffer in scenarios where extreme occlusion of objects is present, or the number of object classes is large. In this paper, we study the problem of inferring 3D counts from densely packed scenes with heterogeneous objects. This problem has applications to important tasks such as inventory management or automatic crop yield estimation. We propose a novel regression-based method, CountNet3D, that uses mature 2D object detectors for fine-grained classification and localization, and a PointNet backbone for geometric embedding. The network processes fused data from images and point clouds for end-to-end learning of counts. We perform experiments on a novel synthetic dataset for inventory management in retail, which we construct and make publicly available to the community. We also evaluate on a proprietary dataset of real-world scenes that we collected. In addition, we run experiments to quantify the uncertainty of the models and evaluate the confidence of our predictions. Our results show that regression-based 3D counting methods systematically outperform detection-based methods, and reveal that directly learning from raw point clouds greatly assists count estimation under extreme occlusion.
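The fusion described above can be illustrated with a minimal sketch: a PointNet-style embedding applies a shared per-point transform followed by a symmetric max-pool (so the result is invariant to point ordering), and the pooled geometric features are concatenated with pooled 2D-detector features before a regression head outputs a scalar count. All weights, dimensions, and function names below are hypothetical placeholders, not the thesis's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def pointnet_embed(points, W):
    """Sketch of a PointNet-style embedding.

    points: (N, 3) raw point cloud; W: (3, D) shared single-layer
    per-point weights (a real PointNet stacks several such layers).
    """
    feats = np.maximum(points @ W, 0.0)   # shared per-point MLP with ReLU
    return feats.max(axis=0)              # symmetric max-pool -> (D,) vector

def predict_count(points, det_feats, W, w_head, b_head):
    """Fuse geometric and 2D-detector features, regress a count.

    det_feats: (K,) pooled features from a 2D detector (hypothetical).
    """
    geom = pointnet_embed(points, W)
    fused = np.concatenate([geom, det_feats])
    return float(fused @ w_head + b_head)  # scalar count regression

# Toy usage with random weights (untrained, for shape illustration only).
D, K = 8, 4
W = rng.normal(size=(3, D))
w_head = rng.normal(size=(D + K,))
points = rng.normal(size=(100, 3))
det_feats = rng.normal(size=(K,))
count = predict_count(points, det_feats, W, w_head, 0.0)
```

The max-pool is what makes the embedding order-invariant over the point cloud, which is why regression on raw points remains stable even when occlusion removes or reorders large parts of the scene.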

Identifier oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-11156
Date 30 August 2023
Creators Nelson, Stephen W.
Publisher BYU ScholarsArchive
Source Sets Brigham Young University
Detected Language English
Type text
Format application/pdf
Source Theses and Dissertations
Rights https://lib.byu.edu/about/copyright/
