In this thesis, we present ContextNet, a novel, general object detection framework that incorporates context cues into the detection pipeline. Current deep learning methods for object detection use state-of-the-art image recognition networks to classify a given region of interest (ROI) into predefined classes and regress a bounding box around it, without using any information about the surrounding scene. ContextNet is based on the intuitive idea that cues about the general scene (e.g., a kitchen or a library) change the priors about the presence or absence of certain object classes. We provide a general means of integrating this notion into the decision about a given ROI by running a network pretrained on scene recognition datasets in parallel with a pretrained network that extracts object-level features for the corresponding ROI. Through comprehensive experiments on PASCAL VOC 2007, we demonstrate the effectiveness of our design choices. The resulting system outperforms the baseline on most object classes and reaches 57.5 mAP (mean Average Precision) on the PASCAL VOC 2007 test set, compared with 55.6 mAP for the baseline. / MS / The object detection problem is to find objects of interest in a given image and draw labeled boxes around them. Since the emergence of deep learning in recent years, object detection methods have come to rely on deep learning techniques. In these methods, detection is based solely on features extracted from several thousand regions of the given image. We propose a novel framework for incorporating scene information into the detection process. For example, if we know the image was taken in a kitchen, the probability of seeing a cow or an airplane decreases and the probability of seeing plates and people increases. Our new detection network uses this intuition to improve detection accuracy. Using extensive experiments, we show that the proposed methods outperform the baseline for almost all object types.
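The kitchen example above can be illustrated with a small sketch. This is not the ContextNet architecture, which the abstract describes as combining learned scene-level and object-level network features; it is only a hand-written re-weighting of per-class ROI probabilities by a scene-conditioned prior, meant to show the intuition. The function name, the class list, and all numbers are hypothetical.

```python
def reweight_with_scene_prior(roi_probs, scene_prior):
    """Multiply per-class ROI probabilities by a scene prior and renormalize.

    Illustrative only: a simple Bayes-style re-weighting, not the thesis's
    learned fusion of scene and object features.
    """
    if roi_probs.keys() != scene_prior.keys():
        raise ValueError("roi_probs and scene_prior must cover the same classes")
    unnormalized = {c: roi_probs[c] * scene_prior[c] for c in roi_probs}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("all re-weighted scores are zero")
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical per-class probabilities for one ROI, ignoring the scene.
roi_probs = {"cow": 0.5, "person": 0.3, "plate": 0.2}
# Hypothetical prior over the same classes given that the scene is a kitchen.
kitchen_prior = {"cow": 0.05, "person": 0.5, "plate": 0.45}

adjusted = reweight_with_scene_prior(roi_probs, kitchen_prior)
for name, p in sorted(adjusted.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.3f}")
```

With these made-up numbers, the ROI that looked most like a cow in isolation is re-ranked toward "person" once the kitchen prior is applied, which is the effect the abstract describes in words.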
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/88387 |
Date | 15 September 2017 |
Creators | Arefiyan Khalilabad, Seyyed Mostafa |
Contributors | Electrical and Computer Engineering, Abbott, A. Lynn, Tokekar, Pratap, Ramakrishnan, Naren |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |