Continuous Sparsification
Winning the Lottery with Continuous Sparsification
The search for efficient, sparse deep neural network models is most prominently performed by pruning: training a dense, overparameterized network and removing parameters, usually by following a manually crafted heuristic. Additionally, the recent Lottery Ticket Hypothesis conjectures that, for a typically-sized neural network, it is possible to find small sub-networks which, when trained from scratch on a comparable budget, match the performance of the original dense counterpart. We revisit fundamental aspects of pruning algorithms, point out missing ingredients in previous approaches, and develop a method, Continuous Sparsification, which searches for sparse networks based on a novel approximation of an intractable $\ell_0$ regularization. We compare against dominant heuristic-based methods on pruning as well as on ticket search -- finding sparse subnetworks that can be successfully re-trained from an early iterate. Empirical results show that we surpass the state of the art for both objectives, across models and datasets, including VGG trained on CIFAR-10 and ResNet-50 trained on ImageNet. In addition to setting a new standard for pruning, Continuous Sparsification also offers fast parallel ticket search, opening doors to new applications of the Lottery Ticket Hypothesis.
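To make the idea concrete, below is a minimal PyTorch sketch of the kind of sigmoid-based relaxation the abstract describes: each weight is gated by sigmoid(beta * s), where s is a learnable score and beta is a temperature annealed upward during training, and the sum of gate values serves as a differentiable surrogate for the intractable $\ell_0$ penalty. Class and attribute names here are ours, not the paper's.

```python
import torch
import torch.nn as nn

class SoftMaskedLinear(nn.Module):
    """Illustrative sketch (not the authors' code): a linear layer whose
    weights are gated by a continuous, trainable mask."""

    def __init__(self, in_features, out_features, mask_init=0.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Learnable mask scores s; the effective gate is sigmoid(beta * s).
        self.mask_scores = nn.Parameter(
            torch.full((out_features, in_features), mask_init))
        self.beta = 1.0  # temperature, annealed upward during training

    def forward(self, x):
        gate = torch.sigmoid(self.beta * self.mask_scores)  # values in (0, 1)
        return nn.functional.linear(x, self.weight * gate, self.bias)

    def l0_surrogate(self):
        # Sum of gate values: a differentiable stand-in for the L0 norm,
        # which counts nonzero weights and is itself intractable to optimize.
        return torch.sigmoid(self.beta * self.mask_scores).sum()
```

In training, one would add a term like lam * layer.l0_surrogate() to the task loss and multiply layer.beta by a growth factor each epoch; as beta grows, the gates saturate toward 0 or 1, yielding a discrete sparse subnetwork.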
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Mondorf, Philipp, Wold, Sondre, Plank, Barbara
A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions via subnetworks that can be composed to perform more complex tasks. Recent developments in mechanistic interpretability have made progress in identifying subnetworks, often referred to as circuits, which represent the minimal computational subgraph responsible for a model's behavior on specific tasks. However, most studies focus on identifying circuits for individual tasks without investigating how functionally similar circuits relate to each other. To address this gap, we examine the modularity of neural networks by analyzing circuits for highly compositional subtasks within a transformer-based language model. Specifically, given a probabilistic context-free grammar, we identify and compare circuits responsible for ten modular string-edit operations. Our results indicate that functionally similar circuits exhibit both notable node overlap and cross-task faithfulness. Moreover, we demonstrate that the circuits identified can be reused and combined through subnetwork set operations to represent more complex functional capabilities of the model.

Neural networks can be effectively modeled as causal graphs that illustrate how inputs are mapped to the output space (Mueller et al., 2024). For instance, the feed-forward and attention modules within the Transformer architecture (Vaswani et al., 2017) can be interpreted as a series of causal nodes that guide the transformation from input to output via the residual stream (Ferrando et al., 2024). This abstraction is commonly used in mechanistic interpretability to identify computational subgraphs, or circuits, responsible for the network's behavior on specific tasks (Wang et al., 2023). Circuits are typically identified through causal mediation analysis, which quantifies the causal influence of model components on the network's predictions (Mueller et al., 2024). However, a notable limitation of existing studies is their focus on identifying circuits for isolated, individual tasks. Few studies compare circuits responsible for different functional behaviors of the model, and those that do primarily focus on tasks with limited cross-functional similarity (Hanna et al., 2024b).
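As a toy illustration of the subnetwork set operations mentioned above, a circuit can be represented simply as the set of model components it retains; node overlap is then an intersection-over-union, and composition is a set union. The node names below are hypothetical, not from the paper.

```python
# Toy illustration (hypothetical node names): circuits as sets of components.
circuit_a = {"L0.H3", "L1.H0", "L2.MLP", "L3.H5"}  # e.g., one string-edit circuit
circuit_b = {"L0.H3", "L1.H7", "L2.MLP", "L4.H1"}  # a functionally similar one

def node_overlap(c1, c2):
    """Intersection-over-union of two circuits' node sets."""
    return len(c1 & c2) / len(c1 | c2)

combined = circuit_a | circuit_b  # union: candidate circuit for a composite task
shared = circuit_a & circuit_b    # intersection: components both functions rely on

print(f"node overlap (IoU) = {node_overlap(circuit_a, circuit_b):.2f}")
```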
Growing Efficient Deep Networks by Structured Continuous Sparsification
Yuan, Xin, Savarese, Pedro, Maire, Michael
We develop an approach to training deep networks while dynamically adjusting their architecture, driven by a principled combination of accuracy and sparsity objectives. Unlike conventional pruning approaches, our method adopts a gradual continuous relaxation of discrete network structure optimization and then samples sparse subnetworks, enabling efficient deep networks to be trained by both growing and pruning structure. Extensive experiments across CIFAR-10, ImageNet, PASCAL VOC, and Penn Treebank, with convolutional models for image classification and semantic segmentation, and recurrent models for language modeling, show that our training scheme yields efficient networks that are smaller and more accurate than those produced by competing pruning methods.
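A hedged sketch of what structured, channel-level continuous sparsification might look like follows: one continuous gate per output channel of a convolution, so entire filters can be switched on (grown) or off (pruned) during training. This is our simplification for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Illustrative simplification (not the paper's implementation): a
    convolution with one continuous gate per output channel, so whole
    filters can be grown or pruned during training."""

    def __init__(self, in_ch, out_ch, kernel_size=3, gate_init=0.0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        self.gate_scores = nn.Parameter(torch.full((out_ch,), gate_init))
        self.beta = 1.0  # temperature; raising it hardens the gates toward 0/1

    def forward(self, x):
        gates = torch.sigmoid(self.beta * self.gate_scores)
        return self.conv(x) * gates.view(1, -1, 1, 1)  # scale each channel

    def sparsity_penalty(self):
        # Differentiable proxy for the number of active channels.
        return torch.sigmoid(self.beta * self.gate_scores).sum()
```

Because the gates act on whole channels rather than individual weights, pruned filters can be dropped outright after training, shrinking the architecture itself rather than just zeroing entries.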
Winning the Lottery with Continuous Sparsification
Savarese, Pedro, Silva, Hugo, Maire, Michael
The Lottery Ticket Hypothesis from Frankle & Carbin (2019) conjectures that, for typically-sized neural networks, it is possible to find small sub-networks which train faster and perform better than their original counterparts. The proposed algorithm to search for "winning tickets", Iterative Magnitude Pruning, consistently finds sub-networks with $90$-$95\%$ fewer parameters which train faster and better than the overparameterized models they were extracted from, creating potential applications to problems such as transfer learning. In this paper, we propose Continuous Sparsification, a new algorithm to search for winning tickets which continuously removes parameters from a network during training, and learns the sub-network's structure with gradient-based methods instead of relying on pruning strategies. We show empirically that our method is capable of finding tickets that outperform the ones learned by Iterative Magnitude Pruning, while at the same time providing a faster search, whether measured in number of training epochs or in wall-clock time.
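For contrast with the baseline, here is a minimal sketch of one round of Iterative Magnitude Pruning as the abstract summarizes it: train the masked network, prune the smallest-magnitude surviving weights, and rewind the remainder to their initial values. The function signature and the train_fn hook are ours, for illustration only.

```python
import torch

def imp_round(model, train_fn, init_state, masks, prune_frac=0.2):
    """One round of Iterative Magnitude Pruning, sketched from the abstract's
    description (this signature and train_fn are ours): train the masked
    network, prune the smallest-magnitude surviving weights, and rewind
    the remainder to their initial values."""
    train_fn(model, masks)  # train with the current binary masks applied
    # Collect magnitudes of surviving weights across all parameters.
    surviving = torch.cat([
        p.detach().abs().flatten()[m.flatten().bool()]
        for p, m in zip(model.parameters(), masks)
    ])
    threshold = torch.quantile(surviving, prune_frac)
    # Zero out the smallest prune_frac fraction of the surviving weights.
    new_masks = [m * (p.detach().abs() > threshold).float()
                 for p, m in zip(model.parameters(), masks)]
    model.load_state_dict(init_state)  # rewind weights to initialization
    return new_masks
```

Repeating this round until the target sparsity is reached is what makes the search iterative; Continuous Sparsification instead learns the mask by gradient descent in a single training run, which is the source of its speed advantage.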