Global ETD Search

1	Pointwise and Instance Segmentation for 3D Point Cloud Gujar, Sanket 11 April 2019 The camera is the cheapest option, and one that runs in real time, for detecting or segmenting the environment of an autonomous vehicle, but it does not provide depth information and is unreliable at night, in bad weather, and during tunnel flash-outs. The risk of an accident is higher for autonomous cars that rely on a camera in such situations. The industry has relied on LiDAR for the past decade to solve this problem and to capture depth information about the environment, but LiDAR also has its shortcomings. Industry methods commonly use projection methods to create a projection image and run a detection and localization network for inference, but LiDAR sees obscurants in bad weather and is sensitive enough to detect snow, which makes projection-based methods less robust. We propose a novel pointwise and instance segmentation deep learning architecture for point clouds, focused on self-driving applications. The model depends only on LiDAR data, making it light-invariant and overcoming the shortcomings of the camera in the perception stack. The pipeline takes advantage of both global and local/edge features of points in point clouds to generate high-level features. We also propose Pointer-Capsnet, an extension of CapsNet for small 3D point clouds. 3D deep learning point cloud
2	Compositional and Low-shot Understanding of 3D Objects Li, Yuchen 12 April 2022 Despite the significant progress in 3D vision in recent years, collecting large amounts of high-quality 3D data remains a challenge. Hence, developing solutions to extract 3D object information efficiently is a significant problem. We aim for an effective shape classification algorithm to facilitate accurate recognition and efficient search of sizeable 3D model databases. This thesis makes two contributions in this space: a) a novel meta-learning approach for 3D object recognition and b) a new compositional 3D recognition task and dataset. For 3D recognition, we propose a few-shot semi-supervised meta-learning model based on the PointNet++ representation with a prototypical random walk loss. In particular, we develop a random walk semi-supervised loss that enables fast learning from a few labeled examples by enforcing global consistency over the data manifold and magnetizing unlabeled points around their class prototypes. On the compositional recognition front, we create a large-scale, richly annotated stylized dataset called 3D CoMPaT. This dataset primarily focuses on stylizing 3D shapes at the part level with compatible materials. We introduce Grounded CoMPaT Recognition as the task of collectively recognizing and grounding compositions of materials on parts of 3D objects. Point Cloud 3D Deep Learning Compositional Understanding Low-shot Learning
3	Towards Scalable Deep 3D Perception and Generation Qian, Guocheng 11 October 2023 Scaling up 3D deep learning systems has emerged as a paramount issue, comprising two primary facets. (1) Model scalability: designing a 3D network that is scale-friendly, i.e., one that achieves better performance as parameters increase and can run efficiently. Unlike 2D convolutional networks, 3D networks have to accommodate the irregularities of 3D data, such as respecting permutation invariance in point clouds. (2) Data scalability: high-quality 3D data is conspicuously scarce. 3D data acquisition and annotation are both complex and costly, hampering the development of scalable 3D deep learning. This dissertation delves into 3D deep learning, including both perception and generation, and addresses these scalability challenges. To address model scalability in 3D perception, I introduce ASSANet, an approach for efficient 3D point cloud representation learning that allows the model to scale up at a low computational cost while achieving substantial accuracy gains. I further introduce the PointNeXt framework, which focuses on data augmentation and architectural scalability and outperforms state-of-the-art 3D point cloud perception networks. To address data scalability, I present Pix4Point, which explores the use of abundant 2D images to enhance 3D understanding. For scalable 3D generation, I propose Magic123, which leverages a joint 2D and 3D diffusion prior for zero-shot image-to-3D content generation without requiring 3D supervision. These collective efforts provide pivotal solutions to model and data scalability in 3D deep learning. 3D Deep Learning 3D Understanding 3D Generation Point Cloud
4	A comparative evaluation of 3d and spatio-temporal deep learning techniques for crime classification and prediction Matereke, Tawanda Lloyd January 2021 Magister Scientiae - MSc. This research is a comparative evaluation of 3D and spatio-temporal deep learning methods for crime classification and prediction using the Chicago crime dataset, which has 7.29 million records collected from 2001 to 2020. In this study, crime classification experiments are carried out using two 3D deep learning algorithms: the 3D Convolutional Neural Network and the 3D Residual Network. The crime classification models are evaluated using accuracy, F1 score, Area Under the Receiver Operating Characteristic Curve (AUROC), and Area Under the Precision-Recall Curve (AUCPR). The effect of spatial grid resolution on the performance of the classification models is also evaluated during training, validation and testing. Crime classification Chicago crime dataset 3D deep learning algorithms 3D residual network Crime
5	Self-supervised Representation Learning via Image Out-painting for Medical Image Analysis January 2020 In recent years, Convolutional Neural Networks (CNNs) have been widely used not only in the computer vision community but also in the medical imaging community. Specifically, the use of CNNs pre-trained on large-scale datasets (e.g., ImageNet) via transfer learning for a variety of medical imaging applications has become the de facto standard within both communities. However, to fit the current paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, models pre-trained on natural images never see any biomedical images and have no knowledge of the anatomical structures present in medical images. To overcome these limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images without utilizing systematic annotations. The idea is to randomly mask an image and train the model to predict the missing region. It is demonstrated that by predicting missing anatomical structures when seeing only parts of the image, the model learns a generic representation that yields better performance on various medical imaging applications via transfer learning. Extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, the image out-painting proxy task offers performance competitive with state-of-the-art models pre-trained on ImageNet and with other self-supervised baselines such as in-painting. Owing to its outstanding performance, out-painting is utilized as one of the self-supervised proxy tasks to provide generic 3D pre-trained models for medical image analysis.
Dissertation/Thesis / Masters Thesis Computer Science 2020 Computer science Bioinformatics 3D Deep learning Computer Vision Medical Image Analysis Representation Learning Self-supervised learning Transfer Learning
