Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-training

Jun-17-2026, 00:09:55 GMT–Neural Information Processing Systems

Self-supervised pre-training is essential for 3D point cloud representation learning, as annotating their irregular, topology-free structures is costly and labor-intensive. Masked autoencoders (MAEs) offer a promising framework but rely on explicit positional embeddings, such as patch center coordinates, which leak geometric information and limit data-driven structural learning. In this work, we propose Point-MaDi, a novel Point cloud Masked autoencoding Diffusion framework for pre-training that integrates a dual-diffusion pretext task into an MAE architecture to address this issue. Specifically, we introduce a center diffusion mechanism in the encoder, noising and predicting the coordinates of both visible and masked patch centers without ground-truth positional embeddings. These predicted centers are processed using a transformer with self-attention and cross-attention to capture intra-and inter-patch relationships. In the decoder, we design a conditional patch diffusion process, guided by the encoder's latent features and predicted centers to reconstruct masked patches directly from noise.

artificial intelligence, machine learning, point cloud, (13 more...)

Neural Information Processing Systems

Jun-17-2026, 00:09:55 GMT

Conferences PDF

Add feedback

Country:
- Asia > China (0.28)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.92)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found