AITopics | iterative hard thresholding

Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing SystemsDec-25-2025, 01:21:23 GMT

Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.

iterative hard thresholding, learning sparse distribution, name change, (2 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Add feedback

Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing SystemsMay-27-2025, 08:21:23 GMT

Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.

iht, iterative hard thresholding, learning sparse distribution

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Reviews: Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing SystemsJan-21-2025, 20:20:32 GMT

Post-rebuttal: I have downgraded my overall score to 7. I am troubled by the lack of motivation (and that in the rebuttal, the authors defer more discussion of model compression to future work). Also, I'd have liked to see in the rebuttal more details about the "more comprehensive discussion" regarding alternate algorithms. Apparently, this is the first work that studies this problem for general loss functions. The goal of the work is to adapt the well-known IHT algorithm, a form of projected gradient descent, to this problem. Again, as far as I can see, this approach is original to this work.

algorithm, iterative hard thresholding, learning sparse distribution, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Add feedback

Learning Sparse Distributions using Iterative Hard Thresholding

Neural Information Processing SystemsOct-9-2024, 14:00:49 GMT

Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.

iht, iterative hard thresholding, learning sparse distribution

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Learning Sparse Distributions using Iterative Hard Thresholding

Zhang, Jacky Y., Khanna, Rajiv, Kyrillidis, Anastasios, Koyejo, Oluwasanmi O.

Neural Information Processing SystemsMar-18-2020, 23:17:02 GMT

Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.

iht, iterative hard thresholding, learning sparse distribution

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

Add feedback

Iterative Hard Thresholding for Low CP-rank Tensor Models

Grotheer, Rachel, Li, Shuang, Ma, Anna, Needell, Deanna, Qin, Jing

arXiv.org Machine LearningAug-22-2019

Recovery of low-rank matrices from a small number of linear measurements is now well-known to be possible under various model assumptions on the measurements. Such results demonstrate robustness and are backed with provable theoretical guarantees. However, extensions to tensor recovery have only recently began to be studied and developed, despite an abundance of practical tensor applications. Recently, a tensor variant of the Iterative Hard Thresholding method was proposed and theoretical results were obtained that guarantee exact recovery of tensors with low Tucker rank. In this paper, we utilize the same tensor version of the Restricted Isometry Property (RIP) to extend these results for tensors with low CANDECOMP/PARAFAC (CP) rank. In doing so, we leverage recent results on efficient approximations of CP decompositions that remove the need for challenging assumptions in prior works. We complement our theoretical findings with empirical results that showcase the potential of the approach.

artificial intelligence, machine learning, tensor, (17 more...)

arXiv.org Machine Learning

1908.08479

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization

Liu, Bo, Yuan, Xiao-Tong, Wang, Lezi, Liu, Qingshan, Metaxas, Dimitris N.

arXiv.org Machine LearningJun-20-2017

Iterative Hard Thresholding (IHT) is a class of projected gradient descent methods for optimizing sparsity-constrained minimization models, with the best known efficiency and scalability in practice. As far as we know, the existing IHT-style methods are designed for sparse minimization in primal form. It remains open to explore duality theory and algorithms in such a non-convex and NP-hard problem setting. In this paper, we bridge this gap by establishing a duality theory for sparsity-constrained minimization with $\ell_2$-regularized loss function and proposing an IHT-style algorithm for dual maximization. Our sparse duality theory provides a set of sufficient and necessary conditions under which the original NP-hard/non-convex problem can be equivalently solved in a dual formulation. The proposed dual IHT algorithm is a super-gradient method for maximizing the non-smooth dual objective. An interesting finding is that the sparse recovery performance of dual IHT is invariant to the Restricted Isometry Property (RIP), which is required by virtually all the existing primal IHT algorithms without sparsity relaxation. Moreover, a stochastic variant of dual IHT is proposed for large-scale stochastic optimization. Numerical results demonstrate the superiority of dual IHT algorithms to the state-of-the-art primal IHT-style algorithms in model estimation accuracy and computational efficiency.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1703.00119

Country: Asia > China (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback