Goto

Collaborating Authors

 rotational invariance




Achieving Rotational Invariance with Bessel-Convolutional Neural Networks

Neural Information Processing Systems

For many applications in image analysis, learning models that are invariant to translations and rotations is paramount. This is the case, for example, in medical imaging where the objects of interest can appear at arbitrary positions, with arbitrary orientations. As of today, Convolutional Neural Networks (CNN) are one of the most powerful tools for image analysis. They achieve, thanks to convolutions, an invariance with respect to translations. In this work, we present a new type of convolutional layer that takes advantage of Bessel functions, well known in physics, to build Bessel-CNNs (B-CNNs) that are invariant to all the continuous set of possible rotation angles by design.


Change-of-Basis Pruning via Rotational Invariance

Ning, Alex, Rangaraju, Vainateya

arXiv.org Artificial Intelligence

Structured pruning removes entire neurons or channels, but its effectiveness depends on how importance is distributed across the representation space. Change-of-basis (CoB) pruning addresses this challenge by applying orthogonal linear transformations that concentrate importance within certain dimensions. However, many standard deep learning architectures are not inherently invariant to such transformations. To enable compatibility, we introduce two-subspace radial activations (TSRAs): an activation family that is invariant to orthogonal linear transformations applied independently within its two activation subspaces. This invariance allows CoB transformations to be merged into surrounding weights without incurring extra parameters. We position this work as a proof-of-concept that a rotationally invariant design may offer a principled approach towards change-of-basis pruning. We do not provide an analysis of multiple TSRA candidates nor do we explore weight initialization for any TSRAs. These limitations, combined with other necessary modifications we make to permit rotational invariance, result in a slight accuracy drop of $4.52\%$ compared to a ReLU-based control. However, using activation-magnitude importance, VGG-16 implementing our CoB+TSRA framework shows encouraging results on CIFAR-10. Under fixed-ratio structured pruning, CoB improves accuracy over a TSRA baseline at all pruning ratios and extends reliable pruning frontier from roughly $30\%$ to $70\%$ of parameters without post-prune fine tuning. Under threshold-based pruning strategies, CoB prunes $90-96\%$ of parameters while maintaining $1-6\%$ accuracy drop after fine-tuning. Together, these results indicate that rotationally invariant architectures may offer a promising path towards CoB pruning.


Achieving Rotational Invariance with Bessel-Convolutional Neural Networks

Neural Information Processing Systems

As of today, Convolutional Neural Networks (CNN) are one of the most powerful tools for image analysis. They achieve, thanks to convolutions, an invariance with respect to translations.



Rotational Sampling: A Plug-and-Play Encoder for Rotation-Invariant 3D Molecular GNNs

Jin, Dian

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) have achieved remarkable success in molecular property prediction. However, traditional graph representations struggle to effectively encode the inherent 3D spatial structures of molecules, as molecular orientations in 3D space introduce significant variability, severely limiting model generalization and robustness. Existing approaches primarily focus on rotation-invariant and rotation-equivariant methods. Invariant methods often rely heavily on prior knowledge and lack sufficient generalizability, while equivariant methods suffer from high computational costs. To address these limitations, this paper proposes a novel plug-and-play 3D encoding module leveraging rotational sampling. By computing the expectation over the SO(3) rotational group, the method naturally achieves approximate rotational invariance. Furthermore, by introducing a carefully designed post-alignment strategy, strict invariance can be achieved without compromising performance. Experimental evaluations on the QM9 and C10 Datasets demonstrate superior predictive accuracy, robustness, and generalization performance compared to existing methods. Moreover, the proposed approach maintains low computational complexity and enhanced interpretability, providing a promising direction for efficient and effective handling of 3D molecular information in drug discovery and material design.


ROSAQ: Rotation-based Saliency-Aware Weight Quantization for Efficiently Compressing Large Language Models

Yoon, Junho, Lee, Geom, Jeon, Donghyeon, Kang, Inho, Na, Seung-Hoon

arXiv.org Artificial Intelligence

Quantization has been widely studied as an effective technique for reducing the memory requirement of large language models (LLMs), potentially improving the latency time as well. Utilizing the characteristic of rotational invariance of transformer, we propose the rotation-based saliency-aware weight quantization (ROSAQ), which identifies salient channels in the projection feature space, not in the original feature space, where the projected "principal" dimensions are naturally considered as "salient" features. The proposed ROSAQ consists of 1) PCA-based projection, which first performs principal component analysis (PCA) on a calibration set and transforms via the PCA projection, 2) Salient channel dentification, which selects dimensions corresponding to the K-largest eigenvalues as salient channels, and 3) Saliency-aware quantization with mixed-precision, which uses FP16 for salient dimensions and INT3/4 for other dimensions. Experiment results show that ROSAQ shows improvements over the baseline saliency-aware quantization on the original feature space and other existing quantization methods. With kernel fusion, ROSAQ presents about 2.3x speed up over FP16 implementation in generating 256 tokens with a batch size of 64.


Achieving Rotational Invariance with Bessel-Convolutional Neural Networks

Neural Information Processing Systems

For many applications in image analysis, learning models that are invariant to translations and rotations is paramount. This is the case, for example, in medical imaging where the objects of interest can appear at arbitrary positions, with arbitrary orientations. As of today, Convolutional Neural Networks (CNN) are one of the most powerful tools for image analysis. They achieve, thanks to convolutions, an invariance with respect to translations. In this work, we present a new type of convolutional layer that takes advantage of Bessel functions, well known in physics, to build Bessel-CNNs (B-CNNs) that are invariant to all the continuous set of possible rotation angles by design.


ReFeree: Radar-Based Lightweight and Robust Localization using Feature and Free space

Kim, Hogyun, Choi, Byunghee, Choi, Euncheol, Cho, Younggun

arXiv.org Artificial Intelligence

Place recognition plays an important role in achieving robust long-term autonomy. Real-world robots face a wide range of weather conditions (e.g. overcast, heavy rain, and snowing) and most sensors (i.e. camera, LiDAR) essentially functioning within or near-visible electromagnetic waves are sensitive to adverse weather conditions, making reliable localization difficult. In contrast, radar is gaining traction due to long electromagnetic waves, which are less affected by environmental changes and weather independence. In this work, we propose a radar-based lightweight and robust place recognition. We achieve rotational invariance and lightweight by selecting a one-dimensional ring-shaped description and robustness by mitigating the impact of false detection utilizing opposite noise characteristics between free space and feature. In addition, the initial heading can be estimated, which can assist in building a SLAM pipeline that combines odometry and registration, which takes into account onboard computing. The proposed method was tested for rigorous validation across various scenarios (i.e. single session, multi-session, and different weather conditions). In particular, we validate our descriptor achieving reliable place recognition performance through the results of extreme environments that lacked structural information such as an OORD dataset.