Revisiting Hilbert Schmidt Information Bottleneck for Adversarial Robustness

Neural Information Processing Systems

We investigate the HSIC (Hilbert-Schmidt independence criterion) bottleneck as a regularizer for learning an adversarially robust deep neural network classifier. In addition to the usual cross-entropy loss, we add regularization terms for every intermediate layer to ensure that the latent representations retain useful information for output prediction while reducing redundant information. We show that the HSIC bottleneck enhances robustness to adversarial attacks both theoretically and experimentally. In particular, we prove that the HSIC bottleneck regularizer reduces the sensitivity of the classifier to adversarial examples. Our experiments on multiple benchmark datasets and architectures demonstrate that incorporating an HSIC bottleneck regularizer attains competitive natural accuracy and improves adversarial robustness, both with and without adversarial examples during training. Our code and adversarially robust models are publicly available.
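To make the regularizer concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, tr(K H L H) / (n - 1)^2, together with a per-layer penalty of the kind the abstract describes. It is an illustration under stated assumptions, not the authors' released code: the Gaussian kernel, the bandwidth `sigma`, the helper names, and the weights `lambda_x` and `lambda_y` are all hypothetical, and the sign convention (penalize dependence on the input, reward dependence on the label) is inferred from the abstract's wording.

```python
import math

def gaussian_gram(samples, sigma=1.0):
    """Gram matrix with K[i][j] = exp(-||s_i - s_j||^2 / (2 * sigma^2))."""
    n = len(samples)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]

def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma))
    L = gaussian_gram(ys, sigma)
    # tr(H K H L) equals the elementwise sum of (H K H) * L because L is symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

def hsic_bottleneck_penalty(x, y, layers, lambda_x, lambda_y):
    """Assumed form of the regularizer: for each latent batch z, penalize
    dependence on the input x and reward dependence on the label y."""
    return sum(lambda_x * hsic(x, z) - lambda_y * hsic(y, z) for z in layers)
```

In training, a penalty of this shape would be added to the cross-entropy loss, with `layers` holding the batch of activations from each intermediate layer and `y` given as one-hot label vectors.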


Self-Routing Capsule Networks

Neural Information Processing Systems

In this work, we propose a novel and surprisingly simple routing strategy called self-routing, where each capsule is routed independently by its subordinate routing network. Agreement between capsules is therefore no longer required; instead, both the poses and the activations of upper-level capsules are obtained in a way similar to Mixture-of-Experts. Our experiments on CIFAR10, SVHN, and SmallNORB show that self-routing performs more robustly against white-box adversarial attacks and affine transformations while requiring less computation.
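The routing idea can be sketched in a few lines. This is a hedged illustration of one way to realize "each capsule is routed independently by its subordinate routing network", not the paper's implementation: it assumes each lower capsule owns a single linear routing layer (`route_weights`) whose softmax output gives its routing coefficients, and that upper-level poses and activations are activation-weighted averages in the Mixture-of-Experts style. All names, shapes, and the exact normalization are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def self_route(poses, acts, route_weights, pose_weights):
    """Route lower capsules to upper capsules without any agreement step.

    poses[i]           pose vector of lower capsule i
    acts[i]            activation of lower capsule i (assumed positive)
    route_weights[i]   (n_upper x d_in) routing layer owned by capsule i
    pose_weights[i][j] (d_out x d_in) map from capsule i to its vote for upper capsule j
    """
    n_lower = len(poses)
    n_upper = len(route_weights[0])
    # Each capsule computes its own routing coefficients from its own pose only.
    coeffs = [softmax(matvec(route_weights[i], poses[i])) for i in range(n_lower)]
    total_act = sum(acts)
    upper_acts, upper_poses = [], []
    for j in range(n_upper):
        # Gate weight of capsule i for upper capsule j: coefficient times activation.
        weights = [coeffs[i][j] * acts[i] for i in range(n_lower)]
        weight_sum = sum(weights)
        upper_acts.append(weight_sum / total_act)
        votes = [matvec(pose_weights[i][j], poses[i]) for i in range(n_lower)]
        d_out = len(votes[0])
        # Upper pose is the gate-weighted average of the votes.
        upper_poses.append(
            [sum(w * v[k] for w, v in zip(weights, votes)) / weight_sum for k in range(d_out)]
        )
    return upper_acts, upper_poses
```

Because the coefficients depend only on each capsule's own pose, the whole step is a single forward pass with no iterative agreement loop, which is consistent with the abstract's claim of lower computation.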



Debiasing Graph Neural Networks via Learning Disentangled Causal Substructure

Neural Information Processing Systems

With the disentangled representations, we synthesize the counterfactual unbiased training samples to further decorrelate causal and bias variables.




Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

Neural Information Processing Systems

In this paper, we present DSA, the first automated framework for discovering sparsity allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have become increasingly powerful, but their large parameter counts make them computationally expensive. Existing pruning methods for compressing LLMs primarily focus on evaluating redundancies and removing element-wise weights. However, these methods fail to allocate adaptive layer-wise sparsities, leading to performance degradation on challenging tasks.



Tree-to-tree Neural Networks for Program Translation

Neural Information Processing Systems

Program translation is an important tool to migrate legacy code in one language into an ecosystem built in a different language. In this work, we are the first to employ deep neural networks toward tackling this problem.