1. Towards a Robust and Efficient Deep Neural Network for LiDAR Point Cloud Perception
Zhou, Zixiang (01 January 2023)
In recent years, LiDAR has emerged as a crucial perception tool for robotics and autonomous vehicles. However, most LiDAR perception methods are adapted from 2D image-based deep learning methods, which are not well-suited to the unique geometric structure of LiDAR point cloud data. This domain gap poses challenges for fast-growing LiDAR perception tasks. This dissertation investigates deep network structures tailored to LiDAR point cloud data in order to design a more efficient and robust LiDAR perception framework. Our approach to this challenge is twofold. First, we recognize that LiDAR point cloud data has an imbalanced and sparse distribution in 3D space, which is not effectively captured by traditional voxel-based convolution methods that treat the 3D map uniformly. To address this issue, we develop more efficient feature extraction methods by either counteracting the imbalanced feature distribution or incorporating global contextual information with a transformer decoder. Second, beyond the gap between the 2D and 3D domains, we acknowledge that different LiDAR perception tasks have unique requirements and therefore separate network designs, resulting in significant network redundancy. To address this, we improve the efficiency of the network design by developing a unified multi-task network that shares the feature-extracting stage and performs different tasks using task-specific heads. More importantly, we enhance the accuracy of the individual tasks by leveraging the multi-task learning framework to enable mutual improvements. We propose different models based on these motivations and evaluate them on several large-scale LiDAR point cloud perception datasets, achieving state-of-the-art performance. Lastly, we summarize the key findings of this dissertation and propose future research directions.
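The shared-backbone, multi-head design the abstract describes can be sketched schematically. This is not the dissertation's actual architecture; every class and function name below is invented for illustration, and the "backbone" is a trivial stand-in for a real sparse-convolution or transformer feature extractor.

```python
# Schematic sketch of a unified multi-task network: one shared
# feature-extraction stage feeding several task-specific heads,
# so the expensive backbone pass is computed only once.

class SharedBackbone:
    """Stand-in for the shared point-cloud feature extractor."""
    def extract(self, points):
        # A real backbone would voxelize the cloud and run sparse
        # convolutions or transformer blocks; here we just summarize it.
        n = len(points)
        cx = sum(p[0] for p in points) / n
        cy = sum(p[1] for p in points) / n
        cz = sum(p[2] for p in points) / n
        return {"count": n, "centroid": (cx, cy, cz)}

class DetectionHead:
    def predict(self, features):
        return {"task": "detection", "num_points": features["count"]}

class SegmentationHead:
    def predict(self, features):
        return {"task": "segmentation", "centroid": features["centroid"]}

class MultiTaskNet:
    """Shares one backbone across all heads, avoiding per-task redundancy."""
    def __init__(self, heads):
        self.backbone = SharedBackbone()
        self.heads = heads

    def forward(self, points):
        features = self.backbone.extract(points)  # computed once
        return [head.predict(features) for head in self.heads]

net = MultiTaskNet([DetectionHead(), SegmentationHead()])
outputs = net.forward([(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)])
```

The design choice this illustrates is the one the abstract argues for: per-task cost is reduced to a cheap head, and all tasks see the same learned features.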
2. Tracking Human Movement Indoors Using Terrestrial Lidar
Karki, Shashank (03 June 2024)
Recent developments in surveying and mapping technologies have greatly enhanced our ability to model and analyze both outdoor and indoor environments. This research advances the traditional concept of digital twins—static representations of physical spaces—by integrating real-time data on human occupancy and movement to develop a dynamic digital twin. Utilizing the newly constructed mixed-use building at Virginia Tech as a case study, this research leverages 11 terrestrial lidar sensors to develop a dynamic digital model that continuously captures human activities within public spaces of the building.
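Two of the detection pipelines evaluated below operate on 2D raster images projected from the 3D lidar data. A minimal, hypothetical sketch of that projection step is a bird's-eye-view occupancy grid; the grid extents and cell size here are invented and not taken from the thesis.

```python
# Hypothetical sketch: project a 3D lidar point cloud onto a 2D
# occupancy raster (bird's-eye view), the kind of image a 2D
# detector can then consume.

def rasterize_bev(points, x_range=(0.0, 10.0), y_range=(0.0, 10.0), cell=1.0):
    """Bin (x, y, z) points into a rows x cols occupancy-count grid."""
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((y_range[1] - y_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, y, _z in points:
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            col = int((x - x_range[0]) / cell)
            row = int((y - y_range[0]) / cell)
            grid[row][col] += 1  # count points per cell; a real pipeline
                                 # might store max height or intensity instead
    return grid

grid = rasterize_bev([(0.5, 0.5, 1.8), (0.6, 0.4, 1.7), (9.5, 9.5, 0.2)])
```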
Three distinct object detection methodologies were evaluated: deep learning models, OpenCV-based techniques, and Blickfeld's lidar perception software, Percept. The deep learning and OpenCV techniques analyzed projected 2D raster images, while Percept utilized real-time 3D point clouds to detect and track human movement. The deep learning approach, specifically the YOLOv5 model, demonstrated high accuracy with an F1 score of 0.879. In contrast, OpenCV methods, while less computationally demanding, showed lower accuracy and higher rates of false detections. Percept, operating on real-time 3D lidar streams, performed well but was susceptible to errors due to temporal misalignment.
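For readers unfamiliar with the F1 score reported above, it is the harmonic mean of precision and recall computed from true-positive, false-positive, and false-negative counts. The counts in this sketch are invented for illustration and are not the thesis's actual detection tallies.

```python
# F1 score from detection counts: precision measures how many
# detections were correct, recall how many true people were found.

def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Invented counts: 90 correct detections, 10 false alarms, 15 misses.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=15)  # f1 ≈ 0.878
```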
This study underscores the potential and challenges of employing advanced lidar-based technologies to create more comprehensive and dynamic models of indoor spaces. These models significantly enhance our understanding of how buildings serve their users, offering insights that could improve building design and functionality.

Master of Science (general audience abstract): Americans spend an average of 87% of their time indoors, yet mapping these spaces has been a challenge. Traditional methods like satellite imaging and drones do not work well indoors, and camera-based models can be invasive and limiting. By contrast, lidar technology can create detailed maps of indoor spaces while also protecting people's privacy, something especially important in buildings like schools.
Currently, most technology creates static digital maps of places, called digital twins, but these do not show how people actually use these spaces. My study aims to take this a step further by developing a dynamic digital twin. This enhanced model shows the physical space and incorporates real-time information about where and how people move within it.
For my research, I used lidar data collected from 11 sensors in a mixed-use building at Virginia Tech to create detailed images that track movement. I applied advanced computer techniques, including machine learning and computer vision, to detect human movement within the study space. Specifically, I used methods such as YOLOv5 for deep learning and OpenCV for movement detection to find and track people's movements inside the building.
I also compared my techniques with a known software called Percept by Blickfeld, which detects moving objects in real-time from lidar data. To evaluate how well my methods worked, I measured them using traditional and innovative statistical metrics against a standard set of manually tagged images. This way, I could see how accurately my system could track indoor dynamics, offering a richer, more dynamic view of how indoor spaces are used.
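Scoring detections against manually tagged images typically relies on intersection-over-union (IoU): a detection counts as a true positive when its overlap with a ground-truth box exceeds a threshold (0.5 is a common choice). The box format and values below are assumptions for illustration, not the thesis's evaluation code.

```python
# IoU between two axis-aligned boxes in (x_min, y_min, x_max, y_max) form.

def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

score = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union 7
```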
3. Maximizing the performance of point cloud 4D panoptic segmentation using AutoML technique
Ma, Teng (January 2022)
Environment perception is crucial to autonomous driving. Panoptic segmentation and object tracking are two challenging tasks, and their combination, 4D panoptic segmentation, has recently drawn researchers' attention. In this work, we implement 4D panoptic LiDAR segmentation (4D-PLS) on Volvo datasets and provide a pipeline for data preparation, model building, and model optimization. The main contributions of this work are: (1) building the Volvo datasets; (2) adopting a 4D-PLS model improved by Hyperparameter Optimization (HPO). We annotate point cloud data collected from Volvo CE and take a supervised learning approach, employing a Deep Neural Network (DNN) to extract features from the point cloud data. Building on the 4D-PLS model, we employ Bayesian Optimization to find the best hyperparameters for our data and improve model performance within a small training budget.
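The budgeted HPO loop described above can be sketched in miniature. Real Bayesian optimization fits a surrogate model (for example, a Gaussian process) to past trials and proposes promising candidates; this simplified stand-in uses random search inside the same kind of fixed-budget loop. The search space and toy objective are invented for illustration.

```python
# Minimal sketch of budgeted hyperparameter optimization: sample
# configurations, evaluate an objective, and keep the best one.
import random

def tune(objective, space, budget, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(budget):  # a small, fixed training budget
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective: peaks when lr = 0.01 and weight_decay = 0.
def objective(p):
    return -abs(p["lr"] - 0.01) - abs(p["weight_decay"])

space = {"lr": (1e-4, 1e-1), "weight_decay": (0.0, 1e-2)}
best, score = tune(objective, space, budget=50)
```

Bayesian optimization improves on this by spending each trial of the budget where the surrogate predicts the most gain, which is why it suits the small-budget setting the abstract describes.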