With the emergence of Convolutional Neural Network (CNN) models, the precision of image classification has improved significantly in recent years. The Regional CNN (RCNN) model was proposed to solve object detection tasks by combining a Region Proposal Network with a CNN. This model improves detection accuracy but suffers from slow inference because of its multi-stage structure. The Single Shot Detector (SSD) network was later proposed to further improve object detection benchmarks in both accuracy and speed. However, the SSD model still suffers from a high miss rate on small targets, since datasets are usually dominated by medium and large objects, which do not share the same features as small ones.
On the other hand, geometric analysis of dataset images can provide additional information before model training. In this thesis, we propose several SSD-based models whose feature extraction layer parameters are adjusted using geometric analysis of the KITTI and Caltech Pedestrian datasets. This analysis extends SSD’s capability for small-object detection. To further improve detection accuracy, we propose a two-stream network, which uses one stream to detect medium to large objects and another stream specifically for small objects. This two-stream model achieves competitive performance compared with other algorithms on the KITTI and Caltech Pedestrian benchmarks. These results are presented and analysed in this thesis.
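The abstract does not say how the geometric analysis is carried out. As a rough illustration only, the sketch below assumes it involves collecting the relative size and aspect ratio of ground-truth boxes and using their distribution to choose default box scales for the detector's feature layers. The function names `box_geometry` and `suggest_scales`, the quantile rule, and the sample boxes are all hypothetical and are not taken from the thesis.

```python
import math


def box_geometry(boxes, image_size):
    """Return (scale, aspect_ratio) for each valid box.

    boxes: iterable of (x1, y1, x2, y2) in pixels.
    image_size: (width, height) in pixels.
    scale is sqrt(box area / image area); aspect_ratio is width / height.
    Degenerate boxes (zero or negative width or height) are skipped.
    """
    img_w, img_h = image_size
    stats = []
    for x1, y1, x2, y2 in boxes:
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0:
            continue
        stats.append((math.sqrt((w * h) / (img_w * img_h)), w / h))
    return stats


def suggest_scales(scales, num_layers):
    """Pick one scale per layer at evenly spaced quantiles of the observed scales.

    Assumption: one default-box scale per feature layer, chosen so that each
    layer covers a roughly equal share of the training boxes.
    """
    if not scales:
        raise ValueError("need at least one scale")
    ordered = sorted(scales)
    n = len(ordered)
    picks = []
    for k in range(num_layers):
        q = (k + 0.5) / num_layers          # midpoint quantile of the k-th bin
        picks.append(ordered[min(n - 1, int(q * n))])
    return picks


if __name__ == "__main__":
    # Made-up boxes on a made-up 1000x500 image, for illustration only.
    sample = [(0, 0, 10, 20), (0, 0, 100, 200), (50, 50, 90, 150)]
    geometry = box_geometry(sample, (1000, 500))
    print(suggest_scales([s for s, _ in geometry], num_layers=2))
```

If a dataset is dominated by small relative scales, the quantile rule places more of the default scales in the small range, which is one plausible way such an analysis could bias an SSD-style detector toward small objects.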
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/38873 |
Date | 06 March 2019 |
Creators | Wang, Binghao |
Contributors | Laganière, Robert |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |