AITopics | structured sparsity

Collaborating Authors

structured sparsity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Anima Anandkumar, Daniel J. Hsu, Majid Janzamin, Sham M. Kakade

Neural Information Processing SystemsOct-3-2025, 08:26:08 GMT

Neural Information Processing Systems http://nips.cc/

overcomplete topic model identifiable, structured sparsity, tensor tucker decomposition, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

Add feedback

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Neural Information Processing SystemsSep-30-2025, 11:23:45 GMT

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish {\em generic} identifiability under a constraint, referred to as {\em topic persistence}. Our sufficient conditions for identifiability involve a novel set of higher order'' expansion conditions on the {\em topic-word matrix} or the {\em population structure} of the model. This set of higher-order expansion conditions allow for overcomplete models, and require the existence of a perfect matching from latent topics to higher order observed words. We establish that random structured topic models are identifiable w.h.p. in the overcomplete regime. Our identifiability results allow for general (non-degenerate) distributions for modeling the topic proportions, and thus, we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of {\em Tucker} decompositions, but is more general than the {\em Candecomp/Parafac} (CP) decomposition.

name change, overcomplete topic model identifiable, tensor tucker decomposition, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

S {2} FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity

Neural Information Processing SystemsMay-27-2025, 04:41:48 GMT

Current PEFT methods for LLMs can achieve high quality, efficient training, or scalable serving, but not all three simultaneously. To address this limitation, we investigate sparse fine-tuning and observe a remarkable improvement in generalization ability. Utilizing this key insight, we propose a family of Structured Sparse Fine-Tuning (S { 2} FT) methods for LLMs, which concurrently achieve state-of-the-art fine-tuning performance, training efficiency, and inference scalability. S { 2} FT accomplishes this by "selecting sparsely and computing densely". Based on the coupled structures in LLMs, \model selects a few attention heads and channels in the MHA and FFN modules for each Transformer block, respectively.

artificial intelligence, large language model, natural language, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation

Ozaydin, Baran, Zhang, Tong, Bhattacharjee, Deblina, Süsstrunk, Sabine, Salzmann, Mathieu

arXiv.org Artificial IntelligenceApr-5-2024

Unsupervised Semantic Segmentation (USS) involves segmenting images without relying on predefined labels, aiming to alleviate the burden of extensive human labeling. Existing methods utilize features generated by self-supervised models and specific priors for clustering. However, their clustering objectives are not involved in the optimization of the features during training. Additionally, due to the lack of clear class definitions in USS, the resulting segments may not align well with the clustering objective. In this paper, we introduce a novel approach called Optimally Matched Hierarchy (OMH) to simultaneously address the above issues. The core of our method lies in imposing structured sparsity on the feature space, which allows the features to encode information with different levels of granularity. The structure of this sparsity stems from our hierarchy (OMH). To achieve this, we learn a soft but sparse hierarchy among parallel clusters through Optimal Transport. Our OMH yields better unsupervised segmentation performance compared to existing USS methods. Our extensive experiments demonstrate the benefits of OMH when utilizing our differentiable paradigm. We will make our code publicly available.

hierarchy, optimally matched hierarchy, sparsity, (11 more...)

arXiv.org Artificial Intelligence

2403.06546

Country:

Europe > Germany > Brandenburg > Potsdam (0.05)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Add feedback

A Family of Penalty Functions for Structured Sparsity

Neural Information Processing SystemsFeb-16-2024, 10:20:06 GMT

We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. We present a family of convex penalty functions, which encode this prior knowledge by means of a set of constraints on the absolute values of the regression coefficients. This family subsumes the \ell_1 norm and is flexible enough to include different models of sparsity patterns, which are of practical and theoretical importance. We establish some important properties of these functions and discuss some examples where they can be computed explicitly. Moreover, we present a convergent optimization algorithm for solving regularized least squares with these penalty functions.

penalty function, sparsity pattern, structured sparsity

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

Network Flow Algorithms for Structured Sparsity

Neural Information Processing SystemsFeb-16-2024, 09:52:09 GMT

We consider a class of learning problems that involve a structured sparsity-inducing norm defined as the sum of \ell_\infty -norms over groups of variables. Whereas a lot of effort has been put in developing fast optimization methods when the groups are disjoint or embedded in a specific hierarchical structure, we address here the case of general overlapping groups. To this end, we show that the corresponding optimization problem is related to network flow optimization. More precisely, the proximal problem associated with the norm we consider is dual to a quadratic min-cost flow problem. We propose an efficient procedure which computes its solution exactly in polynomial time.

network flow algorithm, structured sparsity

Neural Information Processing Systems

Technology:

Information Technology > Communications > Networks (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
Information Technology > Artificial Intelligence > Machine Learning (0.45)

Add feedback

Dynamic Sparse Training with Structured Sparsity

Lasby, Mike, Golubeva, Anna, Evci, Utku, Nica, Mihai, Ioannou, Yani

arXiv.org Artificial IntelligenceOct-3-2023

Dynamic Sparse Training (DST) methods achieve state-of-the-art results in sparse neural network training, matching the generalization of dense models while enabling sparse training and inference. Although the resulting models are highly sparse and theoretically less computationally expensive, achieving speedups with unstructured sparsity on real-world hardware is challenging. In this work, we propose a sparse-to-sparse DST method, Structured RigL (SRigL), to learn a variant of fine-grained structured N:M sparsity by imposing a constant fan-in constraint. Using our empirical analysis of existing DST methods at high sparsity, we additionally employ a neuron ablation method which enables SRigL to achieve state-of-the-art sparse-to-sparse structured DST performance on a variety of Neural Network (NN) architectures. We demonstrate reduced real-world timings on CPU for online inference -- 3.6x/2x faster at 90% sparsity than equivalent dense/unstructured sparse layers, respectively. Our source code is available at https://github.com/calgaryml/condensed-sparsity

dynamic sparse training, structured sparsity

arXiv.org Artificial Intelligence

2305.02299

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)

Add feedback

Factorized Latent Spaces with Structured Sparsity

Neural Information Processing SystemsApr-6-2023, 13:33:55 GMT

Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multi-view learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow:having latent dimensions shared between any subset of the views instead of between all the views only.

factorized latent space, structured sparsity

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Modelling Heterogeneity Using Bayesian Structured Sparsity

Goplerud, Max

arXiv.org Machine LearningMar-29-2021

How to estimate heterogeneity, e.g. the effect of some variable differing across observations, is a key question in political science. Methods for doing so make simplifying assumptions about the underlying nature of the heterogeneity to draw reliable inferences. This paper allows a common way of simplifying complex phenomenon (placing observations with similar effects into discrete groups) to be integrated into regression analysis. The framework allows researchers to (i) use their prior knowledge to guide which groups are permissible and (ii) appropriately quantify uncertainty. The paper does this by extending work on "structured sparsity" from a traditional penalized likelihood approach to a Bayesian one by deriving new theoretical results and inferential techniques. It shows that this method outperforms state-of-the-art methods for estimating heterogeneous effects when the underlying heterogeneity is grouped and more effectively identifies groups of observations with different effects in observational data.

heterogeneity, posterior, sparsity, (15 more...)

arXiv.org Machine Learning

2103.15919

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Government > Regional Government (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Family of Penalty Functions for Structured Sparsity

Morales, Jean, Micchelli, Charles A., Pontil, Massimiliano

Neural Information Processing SystemsFeb-15-2020, 02:28:18 GMT

We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. We present a family of convex penalty functions, which encode this prior knowledge by means of a set of constraints on the absolute values of the regression coefficients. This family subsumes the $\ell_1$ norm and is flexible enough to include different models of sparsity patterns, which are of practical and theoretical importance. We establish some important properties of these functions and discuss some examples where they can be computed explicitly. Moreover, we present a convergent optimization algorithm for solving regularized least squares with these penalty functions.

penalty function, sparsity pattern, structured sparsity

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback