Goto

Collaborating Authors

 point cloud model


Multimodal Robust Prompt Distillation for 3D Point Cloud Models

Gu, Xiang, Lu, Liming, Zheng, Xu, Du, Anan, Zhou, Yongbin, Pang, Shuchao

arXiv.org Artificial Intelligence

Adversarial attacks pose a significant threat to learning-based 3D point cloud models, critically undermining their reliability in security-sensitive applications. Existing defense methods often suffer from (1) high computational overhead and (2) poor generalization ability across diverse attack types. To bridge these gaps, we propose a novel yet efficient teacher-student framework, namely Multimodal Robust Prompt Distillation (MRPD) for distilling robust 3D point cloud model. It learns lightweight prompts by aligning student point cloud model's features with robust embeddings from three distinct teachers: a vision model processing depth projections, a high-performance 3D model, and a text encoder. To ensure a reliable knowledge transfer, this distillation is guided by a confidence-gated mechanism which dynamically balances the contribution of all input modalities. Notably, since the distillation is all during the training stage, there is no additional computational cost at inference. Extensive experiments demonstrate that MRPD substantially outperforms state-of-the-art defense methods against a wide range of white-box and black-box attacks, while even achieving better performance on clean data. Our work presents a new, practical paradigm for building robust 3D vision systems by efficiently harnessing multimodal knowledge.


Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

Wang, Ziyi, Yu, Xumin, Rao, Yongming, Zhou, Jie, Lu, Jiwen

arXiv.org Artificial Intelligence

With the overwhelming trend of mask image modeling led by MAE, generative pre-training has shown a remarkable potential to boost the performance of fundamental models in 2D vision. However, in 3D vision, the over-reliance on Transformer-based backbones and the unordered nature of point clouds have restricted the further development of generative pre-training. In this paper, we propose a novel 3D-to-2D generative pre-training method that is adaptable to any point cloud model. We propose to generate view images from different instructed poses via the cross-attention mechanism as the pre-training scheme. Generating view images has more precise supervision than its point cloud counterpart, thus assisting 3D backbones to have a finer comprehension of the geometrical structure and stereoscopic relations of the point cloud. Experimental results have proved the superiority of our proposed 3D-to-2D generative pre-training over previous pre-training methods. Our method is also effective in boosting the performance of architecture-oriented approaches, achieving state-of-the-art performance when fine-tuning on ScanObjectNN classification and ShapeNetPart segmentation tasks. Code is available at https://github.com/wangzy22/TAP.


Comparison of Point Cloud and Image-based Models for Calorimeter Fast Simulation

Acosta, Fernando Torales, Mikuni, Vinicius, Nachman, Benjamin, Arratia, Miguel, Karki, Bishnu, Milton, Ryan, Karande, Piyush, Angerami, Aaron

arXiv.org Artificial Intelligence

Score based generative models are a new class of generative models that have been shown to accurately generate high dimensional calorimeter datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex calorimeter showers. Point clouds, however, are likely a more natural representation of calorimeter showers, particularly in calorimeters with high granularity. Point clouds preserve all of the information of the original simulation, more naturally deal with sparse datasets, and can be implemented with more compact models and data files. In this work, two state-of-the-art score based models are trained on the same set of calorimeter simulation and directly compared.


TPC: Transformation-Specific Smoothing for Point Cloud Models

Chu, Wenda, Li, Linyi, Li, Bo

arXiv.org Artificial Intelligence

Point cloud models with neural network architectures have achieved great success and have been widely used in safety-critical applications, such as Lidar-based recognition systems in autonomous vehicles. However, such models are shown vulnerable to adversarial attacks which aim to apply stealthy semantic transformations such as rotation and tapering to mislead model predictions. In this paper, we propose a transformation-specific smoothing framework TPC, which provides tight and scalable robustness guarantees for point cloud models against semantic transformation attacks. We first categorize common 3D transformations into three categories: additive (e.g., shearing), composable (e.g., rotation), and indirectly composable (e.g., tapering), and we present generic robustness certification strategies for all categories respectively. We then specify unique certification protocols for a range of specific semantic transformations and their compositions. Extensive experiments on several common 3D transformations show that TPC significantly outperforms the state of the art. For example, our framework boosts the certified accuracy against twisting transformation along z-axis (within 20$^\circ$) from 20.3$\%$ to 83.8$\%$. Codes and models are available at https://github.com/chuwd19/Point-Cloud-Smoothing.


Point-E: A System for Generating 3D Point Clouds from Complex Prompts

Nichol, Alex, Jun, Heewoo, Dhariwal, Prafulla, Mishkin, Pamela, Chen, Mark

arXiv.org Artificial Intelligence

While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as well as evaluation code and models, at https://github.com/openai/point-e.


3DVerifier: Efficient Robustness Verification for 3D Point Cloud Models

Mu, Ronghui, Ruan, Wenjie, Marcolino, Leandro S., Ni, Qiang

arXiv.org Artificial Intelligence

3D point cloud models are widely applied in safety-critical scenes, which delivers an urgent need to obtain more solid proofs to verify the robustness of models. Existing verification method for point cloud model is time-expensive and computationally unattainable on large networks. Additionally, they cannot handle the complete PointNet model with joint alignment network (JANet) that contains multiplication layers, which effectively boosts the performance of 3D models. This motivates us to design a more efficient and general framework to verify various architectures of point cloud models. The key challenges in verifying the large-scale complete PointNet models are addressed as dealing with the cross-non-linearity operations in the multiplication layers and the high computational complexity of high-dimensional point cloud inputs and added layers. Thus, we propose an efficient verification framework, 3DVerifier, to tackle both challenges by adopting a linear relaxation function to bound the multiplication layer and combining forward and backward propagation to compute the certified bounds of the outputs of the point cloud models. Our comprehensive experiments demonstrate that 3DVerifier outperforms existing verification algorithms for 3D models in terms of both efficiency and accuracy. Notably, our approach achieves an orders-of-magnitude improvement in verification efficiency for the large network, and the obtained certified bounds are also significantly tighter than the state-of-the-art verifiers. We release our tool 3DVerifier via https://github.com/TrustAI/3DVerifier for use by the community.


Semantic Detection of Potential Wind-borne Debris in Construction Jobsites: Digital Twining for Hurricane Preparedness and Jobsite Safety

Kamari, Mirsalar, Ham, Youngjib

arXiv.org Artificial Intelligence

In the United States, hurricanes are the most devastating natural disasters causing billions of dollars worth of damage every year. More importantly, construction jobsites are classified among the most vulnerable environments to severe wind events. During hurricanes, unsecured and incomplete elements of construction sites, such as scaffoldings, plywoods, and metal rods, will become the potential wind-borne debris, causing cascading damages to the construction projects and the neighboring communities. Thus, it is no wonder that construction firms implement jobsite emergency plans to enforce preparedness responses before extreme weather events. However, relying on checklist-based emergency action plans to carry out a thorough hurricane preparedness is challenging in large-scale and complex site environments. For enabling systematic responses for hurricane preparedness, we have proposed a vision-based technique to identify and analyze the potential wind-borne debris in construction jobsites. Building on this, this paper demonstrates the fidelity of a new machine vision-based method to support construction site hurricane preparedness and further discuss its implications. The outcomes indicate that the convenience of visual data collection and the advantages of the machine vision-based frameworks enable rapid scene understanding and thus, provide critical heads up for practitioners to recognize and localize the potential wind-borne derbies in construction jobsites and effectively implement hurricane preparedness.


Robustness Certification for Point Cloud Models

Lorenz, Tobias, Ruoss, Anian, Balunović, Mislav, Singh, Gagandeep, Vechev, Martin

arXiv.org Artificial Intelligence

The use of deep 3D point cloud models in safety-critical applications, such as autonomous driving, dictates the need to certify the robustness of these models to semantic transformations. This is technically challenging as it requires a scalable verifier tailored to point cloud models that handles a wide range of semantic 3D transformations. In this work, we address this challenge and introduce 3DCertify, the first verifier able to certify robustness of point cloud models. 3DCertify is based on two key insights: (i) a generic relaxation based on first-order Taylor approximations, applicable to any differentiable transformation, and (ii) a precise relaxation for global feature pooling, which is more complex than pointwise activations (e.g., ReLU or sigmoid) but commonly employed in point cloud models. We demonstrate the effectiveness of 3DCertify by performing an extensive evaluation on a wide range of 3D transformations (e.g., rotation, twisting) for both classification and part segmentation tasks. For example, we can certify robustness against rotations by $\pm60^\circ$ for 95.7% of point clouds, and our max pool relaxation increases certification by up to 15.6%.