A Novel Semantic Feature Fusion-based Pedestrian Detection System to Support Autonomous Vehicles

Intelligent transportation systems (ITS) are widely used to improve the safety and efficiency of transportation. Pedestrians, as essential participants in ITS, are far more vulnerable in a traffic collision than the passengers inside a vehicle. To protect all traffic participants and improve transportation efficiency, autonomous vehicles must detect pedestrians accurately and in a timely manner.

Deep learning-based pedestrian detection methods have advanced significantly since powerful GPUs became available. Many researchers are working to improve the accuracy of pedestrian detection using Convolutional Neural Network (CNN)-based detectors.

In this thesis, we propose a one-stage anchor-free pedestrian detector named Bi-Center Network (BCNet), which is aided by the semantic features of pedestrians' visible parts. The BCNet framework has two main modules. The feature extraction module produces concatenated feature maps extracted from different layers of ResNet. The detection module has four parallel branches that produce the full-body center keypoint heatmap, the visible-part center keypoint heatmap, the heights, and the offsets, respectively. The final bounding boxes are derived from the high-response points on the fused center keypoint heatmap together with the corresponding predicted heights and offsets.
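
The conversion from high-response points to bounding boxes can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `decode_boxes`, the downsampling `stride`, the fixed `aspect_ratio` used to derive width from height, the score `threshold`, and the convention that heights are expressed in feature-map cells are all assumptions made for illustration.

```python
def decode_boxes(heatmap, heights, offsets,
                 stride=4, aspect_ratio=0.41, threshold=0.5):
    """Turn high-response heatmap cells into (x1, y1, x2, y2, score) boxes.

    heatmap, heights: 2-D lists indexed as [row][col].
    offsets: 2-D list of (dy, dx) sub-cell corrections per cell.
    All default values are illustrative assumptions, not values from the thesis.
    """
    boxes = []
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score < threshold:
                continue  # keep only high-response center points
            dy, dx = offsets[r][c]
            # Map the refined center from feature-map cells to image pixels.
            cy = (r + dy) * stride
            cx = (c + dx) * stride
            # Assumed: height is predicted in cells; width follows a fixed ratio.
            h = heights[r][c] * stride
            w = h * aspect_ratio
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score))
    return boxes
```

In practice a detector of this kind would usually apply non-maximum suppression to the resulting boxes; that step is omitted here.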

The fused center keypoint heatmap combines the semantic features of the full body and the visible part of each pedestrian. We therefore conduct ablation studies to examine the effectiveness of feature fusion and how visibility features benefit the detector's performance, using two types of approaches: introducing two weighting hyper-parameters and applying three different attention mechanisms.
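
As a rough illustration of the first approach, the sketch below fuses the two heatmaps as a weighted sum. The abstract does not give the fusion formula, so the linear form, the function name `fuse_heatmaps`, and the default weights `w_full` and `w_visible` are assumptions, not the method used in the thesis.

```python
def fuse_heatmaps(full_body, visible_part, w_full=1.0, w_visible=0.5):
    """Element-wise weighted sum of two equally sized 2-D heatmaps.

    w_full and w_visible stand in for the two weighting hyper-parameters;
    their default values here are arbitrary placeholders.
    """
    return [
        [w_full * f + w_visible * v for f, v in zip(full_row, vis_row)]
        for full_row, vis_row in zip(full_body, visible_part)
    ]
```

With such a scheme, a cell that responds on both the full-body and visible-part heatmaps ends up with a stronger fused response than a cell that responds on only one of them.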

Our BCNet achieves 9.82% MR^-2 (log-average miss rate; lower is better) on the Reasonable setup of the CityPersons dataset, compared with 12.14% MR^-2 for the baseline model, a reduction of 2.32 percentage points. The experimental results indicate that pedestrian detection performance can be significantly improved because the visibility semantics prompt stronger responses on the heatmap. We compare BCNet with state-of-the-art models on the CityPersons and ETH datasets; the comparison shows that our detector is effective and achieves promising performance.

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/42213
Date: 27 May 2021
Creators: Sha, Mingzhi
Contributors: Boukerche, Azzedine
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d’Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf