Object detection is a classical computer vision task, encountered in many practical applications such as robotics and autonomous driving. The latter involves serious consequences of failure and a multitude of challenging demands, including high computational efficiency and detection accuracy. Distant objects are notably difficult to detect accurately due to their small scale in the image, consisting of only a few pixels. This is especially problematic in autonomous driving, as objects should be detected at the earliest possible stage to facilitate handling of hazardous situations. Previous work has addressed small objects via use of feature pyramids and super-resolution techniques, but the efficiency of such methods is limited as computational cost increases with image resolution. Therefore, a trade-off must be made between accuracy and cost. Opportunely though, a common characteristic of driving scenarios is the predominance of distant objects in the centre of the image. Thus, the full-frame image can be downsampled to reduce computational cost, and a crop can be extracted from the image centre to preserve resolution for distant vehicles. In this way, short- and long-range images are generated. This thesis investigates the fusion of such images in a convolutional neural network, particularly the fusion level, fusion operation, and spatial alignment. A novel framework — DetSLR — is proposed for the task and examined via the aforementioned aspects. Through adoption of the framework for the well-established SSD detector and MobileNetV2 feature extractor, it is shown that the framework significantly improves upon the original detector without incurring additional cost. The fusion level is shown to have great impact on the performance of the framework, favouring high-level fusion, while only insignificant differences exist between investigated fusion operations. Finally, spatial alignment of features is demonstrated to be a crucial component of the framework.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-167073 |
Date | January 2020 |
Creators | Luusua, Emil |
Publisher | Linköpings universitet, Datorseende |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0017 seconds