AITopics | iht

Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.
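As a rough illustration of what "projected gradient descent" onto sparse distributions means here, the following is a minimal sketch, not the greedy approximate projection proposed in the paper. It assumes a least-squares objective, a finite-dimensional probability vector rather than a general measure, and a simple stand-in projection that keeps the k largest entries and then projects them onto the simplex. All function names and parameters (`project_simplex`, `project_sparse_simplex`, `iht_sparse_distribution`, `step`, `n_iters`) are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)              # shift that makes the result sum to 1
    return np.maximum(v - theta, 0.0)

def project_sparse_simplex(v, k):
    """Stand-in projection (assumption): keep the k largest entries of v,
    then project those entries onto the simplex; all others are set to 0."""
    support = np.argsort(v)[-k:]
    x = np.zeros_like(v, dtype=float)
    x[support] = project_simplex(v[support])
    return x

def iht_sparse_distribution(A, b, k, step, n_iters=200):
    """IHT sketch for min 0.5 * ||A x - b||^2 over k-sparse distributions x."""
    n = A.shape[1]
    x = project_sparse_simplex(np.full(n, 1.0 / n), k)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)              # gradient of the least-squares loss
        x = project_sparse_simplex(x - step * grad, k)
    return x
```

Each iterate is a valid distribution with at most k nonzero entries, which is the pair of constraints (sparsity and the simplex) that the abstract refers to. The step size has to be chosen by the user; the sketch makes no convergence claim.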

iterative hard thresholding, learning sparse distribution, name change, (2 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

5eaa54503005d9125ad6aa3044e912d8-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 04:06:26 GMT

computation, iteration, parameter vector, (16 more...)

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing Systems, Oct-2-2025, 04:21:37 GMT

In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures.

artificial intelligence, machine learning, projection, (18 more...)

Country: North America > United States (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

would like to address all concerns raised

Neural Information Processing Systems, Oct-2-2025, 04:21:23 GMT

We would like to thank all of the reviewers for their valuable time and their constructive comments. We will incorporate the proposed minor corrections in the final version of the paper. On whether the support set changes during iterations, we observe that in experiments (subsection 4.1) IHT changes support. We thank the reviewer for the supportive and constructive review. Regarding the comment in lines 198-202, we apologize for any confusion. Regarding variance in experiments, we have observed that high variance is not enough for the algorithm to get "lucky".

artificial intelligence, machine learning, projection, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

A Recovery Guarantee for Sparse Neural Networks

Fridovich-Keil, Sara; Pilanci, Mert

arXiv.org Machine Learning, Sep-25-2025

We prove the first guarantees of sparse recovery for ReLU neural networks, where the sparse network weights constitute the signal to be recovered. Specifically, we study structural properties of the sparse network weights for two-layer, scalar-output networks under which a simple iterative hard thresholding algorithm recovers these weights exactly, using memory that grows linearly in the number of nonzero weights. We validate this theoretical result with simple experiments on recovery of sparse planted MLPs, MNIST classification, and implicit neural representations. Experimentally, we find performance that is competitive with, and often exceeds, a high-performing but memory-inefficient baseline based on iterative magnitude pruning.

assumption 2, mlp, probability, (17 more...)

2509.20323

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing Systems, May-27-2025, 08:21:23 GMT

iht, iterative hard thresholding, learning sparse distribution

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Deep greedy unfolding: Sorting out argsorting in greedy sparse recovery algorithms

Mohammad-Taheri, Sina; Colbrook, Matthew J.; Brugiapaglia, Simone

arXiv.org Artificial Intelligence, May-22-2025

Gradient-based learning requires (deep) neural networks to be differentiable at all steps. This includes model-based architectures constructed by unrolling iterations of an iterative algorithm onto layers of a neural network, known as algorithm unrolling. However, greedy sparse recovery algorithms depend on the non-differentiable argsort operator, which hinders their integration into neural networks. In this paper, we address this challenge in Orthogonal Matching Pursuit (OMP) and Iterative Hard Thresholding (IHT), two popular representative algorithms in this class. We propose permutation-based variants of these algorithms and approximate permutation matrices using "soft" permutation matrices derived from softsort, a continuous relaxation of argsort. We demonstrate, both theoretically and numerically, that Soft-OMP and Soft-IHT, as differentiable counterparts of OMP and IHT that are fully compatible with neural network training, effectively approximate these algorithms with a controllable degree of accuracy. This leads to the development of OMP- and IHT-Net, fully trainable network architectures based on Soft-OMP and Soft-IHT, respectively. Finally, by choosing weights as "structure-aware" trainable parameters, we connect our approach to structured sparse recovery and demonstrate its ability to extract latent sparsity patterns from data.
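To make the idea of a "soft" permutation matrix concrete, here is a minimal NumPy sketch of the softsort relaxation, assuming its usual form: a row-wise softmax over negative absolute differences between the sorted scores and the original scores, scaled by a temperature. It is not the Soft-IHT or Soft-OMP construction of the paper; the function names `softsort` and `soft_top_k_mask` and the parameter `tau` are hypothetical choices for illustration.

```python
import numpy as np

def softsort(s, tau=0.1):
    """Row-stochastic relaxation of the permutation that sorts s in
    descending order. As tau -> 0 each row approaches a one-hot vector."""
    s = np.asarray(s, dtype=float)
    s_sorted = np.sort(s)[::-1]
    # Entry (i, j) compares the i-th largest score with the j-th input score.
    logits = -np.abs(s_sorted[:, None] - s[None, :]) / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def soft_top_k_mask(s, k, tau=0.1):
    """Soft indicator of the k largest-magnitude entries of s, obtained by
    summing the first k rows of the soft permutation of |s| (assumption:
    this mimics the hard-thresholding support selection in IHT)."""
    P = softsort(np.abs(s), tau)
    return P[:k].sum(axis=0)
```

With a small temperature the mask is close to the hard top-k indicator, and with a larger one it spreads weight across neighboring entries, which is the kind of controllable approximation the abstract mentions. The sort call itself is only used to obtain sorted values; in an autodiff framework the gradient would flow through those values rather than through argsort indices.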

algorithm, artificial intelligence, machine learning, (18 more...)

2505.15661

Country: