Perception for autonomous drive systems is the most essential function for safe and reliable driving. LiDAR sensors can be used for perception and are vying for being crowned as an essential element in this task. In this thesis, we present a novel real-time solution for detection and tracking of moving objects which utilizes deep learning based 3D object detection. Moreover, we present a joint solution which utilizes the predictability of Kalman Filters to infer object properties and semantics to the object detection algorithm, resulting in a closed loop of object detection and object tracking.On one hand, we present YOLO++, a 3D object detection network on point clouds only. A network that expands YOLOv3, the latest contribution to standard real-time object detection for three-channel images. Our object detection solution is fast. It processes images at 20 frames per second. Our experiments on the KITTI benchmark suite show that we achieve state-of-the-art efficiency but with a mediocre accuracy for car detection, which is comparable to the result of Tiny-YOLOv3 on the COCO dataset. The main advantage with YOLO++ is that it allows for fast detection of objects with rotated bounding boxes, something which Tiny-YOLOv3 can not do. YOLO++ also performs regression of the bounding box in all directions, allowing for 3D bounding boxes to be extracted from a bird's eye view perspective. On the other hand, we present a Multi-threaded Object Tracking (MTKF) solution for multiple object tracking. Each unique observation is associated to a thread with a novel concurrent data association process. Each of the threads contain an Extended Kalman Filter that is used for predicting and estimating an associated object's state over time. Furthermore, a LiDAR odometry algorithm was used to obtain absolute information about the movement of objects, since the movement of objects are inherently relative to the sensor perceiving them. We obtain 33 state updates per second with an equal amount of threads to the number of cores in our main workstation.Even if the joint solution has not been tested on a system with enough computational power, it is ready for deployment. Using YOLO++ in combination with MTKF, our real-time constraint of 10 frames per second is satisfied by a large margin. Finally, we show that our system can take advantage of the predicted semantic information from the Kalman Filters in order to enhance the inference process in our object detection architecture.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-160180 |
Date | January 2019 |
Creators | Söderlund, Henrik |
Publisher | Umeå universitet, Institutionen för tillämpad fysik och elektronik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0032 seconds