Global ETD Search

Enhancing PointNet: New Aggregation Functions and Contextual Normalization

PointNet is a foundational deep learning architecture for 3D point clouds, used for classification and segmentation tasks. We hypothesize that PointNet has not reached its full potential and is greatly restrained by its single Max pooling layer. First, this thesis introduces new and more complex learnable aggregation functions. Second, it proposes a novel normalization technique, Context Normalization, for further feature extraction. Context Normalization is similar to Batch Normalization, but it normalizes each point cloud within a mini-batch independently and always uses dynamic statistics. The experiments show that replacing Max pooling with Principal Neighborhood Aggregation (PNA) increased classification accuracy from 73.3% to 78.7% on an SO(3)-augmented version of the ModelNet40 dataset. Combining PNA with Context Normalization further increased accuracy to 84.6%.
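
The two ideas in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the thesis implementation, and it rests on several assumptions: that Context Normalization computes a per-channel mean and variance over the points of a single cloud, that no learnable scale or shift is applied, and that the PNA-style pooling concatenates only the mean, max, min, and standard deviation aggregators, with the degree scalers of the original PNA formulation omitted. The function names `context_normalize`, `context_normalize_batch`, and `multi_aggregate` are hypothetical.

```python
import math


def context_normalize(cloud, eps=1e-5):
    """Normalize one point cloud with its own per-channel statistics.

    cloud: list of N points, each a list of C feature values.
    Assumption: statistics are taken per channel over the N points;
    the thesis may define them differently.
    """
    n = len(cloud)
    channels = len(cloud[0])
    means = [sum(p[c] for p in cloud) / n for c in range(channels)]
    variances = [
        sum((p[c] - means[c]) ** 2 for p in cloud) / n for c in range(channels)
    ]
    return [
        [(p[c] - means[c]) / math.sqrt(variances[c] + eps) for c in range(channels)]
        for p in cloud
    ]


def context_normalize_batch(batch, eps=1e-5):
    """Apply context_normalize to every cloud independently.

    No statistics are shared across the mini-batch and no running
    averages are stored, so training and inference behave the same.
    """
    return [context_normalize(cloud, eps) for cloud in batch]


def multi_aggregate(cloud):
    """Pool N x C point features into one vector of length 4 * C.

    Concatenates per-channel mean, max, min, and standard deviation.
    Assumption: a simplified stand-in for PNA without degree scalers.
    """
    n = len(cloud)
    channels = len(cloud[0])
    columns = [[p[c] for p in cloud] for c in range(channels)]
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(columns, means)
    ]
    maxima = [max(col) for col in columns]
    minima = [min(col) for col in columns]
    return means + maxima + minima + stds


# Two toy clouds with very different feature scales (made-up values).
batch = [
    [[1.0, 10.0], [3.0, 30.0]],
    [[100.0, 0.0], [300.0, 4.0]],
]
normalized = context_normalize_batch(batch)
pooled = [multi_aggregate(cloud) for cloud in normalized]
```

Because each cloud is normalized with its own statistics, the two toy clouds end up on the same scale even though their raw features differ by two orders of magnitude. Max pooling alone would keep only one value per channel, whereas the concatenated aggregators pass four summary statistics per channel to the downstream layers.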

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-349362

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-349362
Date	January 2024
Creators	Isaksson Jonek, Markus
Publisher	KTH, Skolan för teknikvetenskap (SCI)
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TRITA-SCI-GRU ; 2024:173
