The PointNet architecture is a foundational deep learning model for 3D point clouds, used for classification and segmentation tasks. We hypothesize that PointNet has not reached its full potential and that its single max pooling layer is a major limiting factor. First, this thesis introduces new, more complex learnable aggregation functions to replace that layer. Second, it proposes a novel normalization technique, Context Normalization, to improve feature extraction. Context Normalization is similar to Batch Normalization, but it normalizes each point cloud within a mini-batch independently and always uses dynamic statistics. The experiments show that replacing max pooling with Principal Neighborhood Aggregation (PNA) increased classification accuracy from 73.3% to 78.7% on an SO(3)-augmented version of the ModelNet40 dataset. Combining PNA with Context Normalization further increased accuracy to 84.6%.
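To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the thesis implementation. It assumes features are stored as an array of shape (batch, points, channels) and that Context Normalization computes its statistics over the point axis, separately for each cloud and channel; the abstract does not state the exact axes, so this choice is an assumption. The names `context_norm` and `multi_aggregate` are hypothetical, and `multi_aggregate` is a simplified stand-in that only concatenates the mean, max, min, and standard deviation aggregators, without the degree scalers or learnable parts that a full PNA layer would include.

```python
import numpy as np


def context_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize each point cloud in a mini-batch independently.

    x: array of shape (batch, points, channels).
    Statistics are recomputed from the input on every call (no running
    averages), so behavior is the same at training and inference time.
    The choice of axis (over points, per channel) is an assumption.
    """
    mean = x.mean(axis=1, keepdims=True)   # (batch, 1, channels)
    var = x.var(axis=1, keepdims=True)     # (batch, 1, channels)
    x_hat = (x - mean) / np.sqrt(var + eps)
    if gamma is not None:                  # optional scale (hypothetical parameter)
        x_hat = x_hat * gamma
    if beta is not None:                   # optional shift (hypothetical parameter)
        x_hat = x_hat + beta
    return x_hat


def multi_aggregate(x):
    """Pool over the point axis with several permutation-invariant aggregators.

    Simplified stand-in for a PNA-style aggregation: concatenates
    mean, max, min, and standard deviation. Returns (batch, 4 * channels).
    """
    return np.concatenate(
        [x.mean(axis=1), x.max(axis=1), x.min(axis=1), x.std(axis=1)],
        axis=-1,
    )


rng = np.random.default_rng(0)
clouds = rng.normal(size=(2, 128, 8))      # 2 clouds, 128 points, 8 features
clouds[1] = clouds[1] * 10.0 + 5.0         # give the second cloud a different scale

normed = context_norm(clouds)              # each cloud normalized on its own
pooled = multi_aggregate(normed)           # one global feature vector per cloud
```

Because the statistics depend only on the cloud being normalized, the output for one cloud does not change when other clouds are added to or removed from the mini-batch, which is the property that distinguishes this sketch from Batch Normalization.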
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-349362 |
Date | January 2024 |
Creators | Isaksson Jonek, Markus |
Publisher | KTH, Skolan för teknikvetenskap (SCI) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | TRITA-SCI-GRU ; 2024:173 |