
Object Detection Using Multiple Level Annotations

Object detection is a fundamental problem in computer vision. Fully supervised object detection (FSOD) methods have achieved impressive results on large-scale detection benchmarks. However, FSOD approaches require a very large number of instance-level annotations, which are time-consuming to collect. In contrast, weakly supervised object detection (WSOD) exploits easily collected image-level labels but suffers from relatively inferior detection performance.
This thesis studies hybrid learning methods for object detection. We train an object detector on a dataset that combines instance-level and image-level labels. Extensive experiments on the challenging PASCAL VOC 2007 and 2012 benchmarks demonstrate the effectiveness of our method, which offers a trade-off between collecting fewer annotations and building a more accurate object detector. Our method also serves as a strong baseline that bridges the wide gap between FSOD and WSOD performance.
Based on the hybrid learning framework, we further study object detection from a novel perspective that takes annotation budget constraints into account. Given a fixed budget, we propose a strategy for building a diverse and informative dataset that can be used to optimally train a robust detector. We investigate both optimization-based and learning-based methods for selecting which images to annotate and which level of annotation (strong or weak) to use for each.
By combining an optimal image/annotation selection scheme with hybrid supervised learning, we show that one can achieve the performance of a strongly supervised detector on PASCAL VOC 2007 while saving 12.8% of its original annotation budget. Furthermore, when 100% of the budget is used, our approach surpasses this performance by 2.0 mAP percentage points.

Identifier: oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/631958
Date: 04 1900
Creators: Xu, Mengmeng
Contributors: Ghanem, Bernard; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Al-Naffouri, Tareq Y.; Thabet, Ali Kassem
Source Sets: King Abdullah University of Science and Technology
Language: English
Detected Language: English
Type: Thesis
