Point-RTD: Replaced Token Denoising for Pretraining Transformer Models on Point Clouds

arXiv.org Artificial Intelligence

Abstract: Pretraining strategies play a critical role in advancing the performance of transformer-based models on 3D point cloud tasks. In this paper, we introduce Point-RTD (Replaced Token Denoising), a novel pretraining strategy designed to improve token robustness through a corruption-reconstruction framework. Unlike traditional mask-based reconstruction tasks, which hide data segments for later prediction, Point-RTD corrupts point cloud tokens and leverages a discriminator-generator architecture for denoising. This shift enables more effective learning of structural priors and significantly enhances model performance and efficiency. On the ShapeNet dataset, Point-RTD reduces reconstruction error by over 93% compared to PointMAE and achieves more than 14× lower Chamfer Distance on the test set.

Point clouds have become an essential data representation in various fields such as remote sensing, autonomous driving, and robotics [1].
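The abstract reports results in terms of Chamfer Distance, so a small sketch of that metric may help readers unfamiliar with it. The version below is a minimal NumPy illustration, assuming the common convention of squared Euclidean nearest-neighbour distances averaged over each set and summed in both directions; the paper may use a different variant (for example, unsquared distances or a different normalization). The function name `chamfer_distance` and the synthetic clouds are hypothetical and are not taken from the paper.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Assumed convention: squared Euclidean distances, averaged per set,
    summed over both directions. Other conventions exist.
    """
    # Pairwise squared distances between every point in a and every point in b, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Mean distance from each point in a to its nearest neighbour in b, and vice versa.
    a_to_b = d2.min(axis=1).mean()
    b_to_a = d2.min(axis=0).mean()
    return float(a_to_b + b_to_a)


# Synthetic example: a random cloud and a slightly perturbed copy standing in for a reconstruction.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(256, 3))
noisy = cloud + 0.05 * rng.normal(size=cloud.shape)

cd_same = chamfer_distance(cloud, cloud)    # identical clouds
cd_noisy = chamfer_distance(cloud, noisy)   # perturbed reconstruction
print(cd_same, cd_noisy)
```

A lower value means the two point sets lie closer together, which is why the metric is a natural measure of reconstruction quality. The pairwise matrix here costs O(N·M) memory, so larger clouds would typically call for a KD-tree or a GPU implementation instead.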