AITopics | pruning pattern

From 2:4 to 8:16 sparsity patterns in LLMs for Outliers and Weights with Variance Correction

Maximov, Egor, Kuzkina, Yulia, Kanametov, Azamat, Prutko, Alexander, Goncharov, Aleksei, Zhelnin, Maxim, Shvetsov, Egor

arXiv.org Artificial Intelligence, Jul-8-2025

As large language models (LLMs) grow in size, efficient compression techniques like quantization and sparsification are critical. While quantization maintains performance with reduced precision, structured sparsity methods, such as N:M sparsification, often fall short due to limited flexibility and sensitivity to outlier weights. We explore 8:16 semi-structured sparsity, demonstrating its ability to surpass the Performance Threshold, where a compressed model matches the accuracy of its uncompressed or smaller counterpart under equivalent memory constraints. Compared to 2:4 sparsity, 8:16 offers greater flexibility with minimal storage overhead (0.875 vs. 0.75 bits/element). We also apply sparse structured patterns for salient weights, showing that structured sparsity for outliers is competitive with unstructured approaches, leading to equivalent or better results. Finally, we demonstrate that simple techniques such as variance correction and SmoothQuant-like weight equalization improve sparse model performance.
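The flexibility and storage figures quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the overhead is counted as the number of bits needed to index one of the C(M, N) admissible masks per group, rounded up and divided by the group size M, which happens to reproduce the 0.75 and 0.875 bits/element figures. The function names are hypothetical, and the magnitude-based selection in `nm_prune` is a common baseline, not necessarily the saliency criterion the paper uses.

```python
from math import ceil, comb, log2


def nm_patterns(n, m):
    """Number of distinct masks that keep n of the m weights in one group."""
    return comb(m, n)


def mask_bits_per_element(n, m):
    """Bits per weight needed to index one mask per group (rounded up per group)."""
    return ceil(log2(comb(m, n))) / m


def nm_prune(weights, n, m):
    """Zero all but the n largest-magnitude weights in each consecutive group of m.

    Assumption: plain magnitude pruning, used here only to illustrate the N:M layout.
    """
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest weights by absolute value.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


# Flexibility: masks available over the same 16 weights at 50% sparsity.
masks_2_4 = nm_patterns(2, 4) ** 4   # four independent 2:4 groups
masks_8_16 = nm_patterns(8, 16)      # one 8:16 group

# Storage overhead of the mask, in bits per weight.
bits_2_4 = mask_bits_per_element(2, 4)
bits_8_16 = mask_bits_per_element(8, 16)

example = nm_prune([1.0, -5.0, 2.0, 0.5], 2, 4)
```

Under this counting, 16 weights admit 1,296 masks with 2:4 and 12,870 masks with 8:16, which is one way to read the "greater flexibility" claim at a cost of 0.125 extra bits per element.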

large language model, machine learning, natural language, (17 more...)

2507.03052

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

EvoP: Robust LLM Inference via Evolutionary Pruning

Wu, Shangyu, Du, Hongchao, Xiong, Ying, Chen, Shuai, Kuo, Tei-wei, Guan, Nan, Xue, Chun Jason

arXiv.org Artificial Intelligence, Feb-19-2025

Large Language Models (LLMs) have achieved remarkable success in natural language processing tasks, but their massive size and computational demands hinder their deployment in resource-constrained environments. Existing structured pruning methods address this issue by removing redundant structures (e.g., elements, channels, layers) from the model. However, these methods rely on heuristic pruning strategies, which leads to suboptimal performance, and they ignore the characteristics of the data when pruning the model. To overcome these limitations, we propose EvoP, an evolutionary pruning framework for robust LLM inference. EvoP first presents a cluster-based calibration dataset sampling (CCDS) strategy for creating a more diverse calibration dataset. EvoP then introduces an evolutionary pruning pattern searching (EPPS) method to find the optimal pruning pattern. Compared to existing structured pruning techniques, EvoP achieves the best performance while maintaining the best efficiency. Experiments across different LLMs and different downstream tasks validate the effectiveness of the proposed EvoP, making it a practical and scalable solution for deploying LLMs in real-world applications.
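To make the idea of searching for a pruning pattern concrete, here is a minimal sketch of a generic evolutionary search over which layers to remove. It is not the EPPS method from the paper, whose details are not given in the abstract. The function `evolve_pruning_pattern`, its parameters, and the toy `importance` scores are all hypothetical; in practice the fitness would be something like the loss of the pruned model on a calibration set.

```python
import random


def evolve_pruning_pattern(num_layers, num_pruned, fitness,
                           generations=30, population=20,
                           mutation_rate=0.3, seed=0):
    """Search for a set of layer indices to prune that minimizes `fitness`.

    A pattern is a frozenset of `num_pruned` layer indices. Lower fitness is better.
    """
    rng = random.Random(seed)

    def random_pattern():
        return frozenset(rng.sample(range(num_layers), num_pruned))

    def crossover(a, b):
        # Child draws its pruned layers from the union of both parents.
        return frozenset(rng.sample(sorted(a | b), num_pruned))

    def mutate(pattern):
        # With some probability, swap one pruned layer for one kept layer.
        pruned = set(pattern)
        kept = [i for i in range(num_layers) if i not in pruned]
        if kept and pruned and rng.random() < mutation_rate:
            pruned.remove(rng.choice(sorted(pruned)))
            pruned.add(rng.choice(kept))
        return frozenset(pruned)

    pop = [random_pattern() for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:population // 2]  # elitism: the best half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(population - len(parents))
        ]
        pop = parents + children
    return min(pop, key=fitness)


# Toy stand-in for a calibration loss: made-up per-layer importance scores.
importance = [5.0, 1.0, 4.0, 1.0, 3.0, 0.5, 2.0, 0.2]


def toy_loss(pattern):
    """Pretend the loss increase equals the summed importance of pruned layers."""
    return sum(importance[i] for i in pattern)


best = evolve_pruning_pattern(num_layers=8, num_pruned=3, fitness=toy_loss)
```

Because the best half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next. A real search would replace `toy_loss` with an evaluation of the pruned model, which is where the choice of calibration data matters.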

calibration dataset, pruning pattern, sparsity, (14 more...)

2502.1491

Country:

Europe > Austria > Vienna (0.15)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
(9 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Multiobjective Evolutionary Pruning of Deep Neural Networks with Transfer Learning for improving their Performance and Robustness

Poyatos, Javier, Molina, Daniel, Martínez, Aitor, Del Ser, Javier, Herrera, Francisco

arXiv.org Artificial Intelligence, Feb-20-2023

Evolutionary Computation algorithms have been used to solve optimization problems related to architecture, hyper-parameters, or training configuration, forging the field known today as Neural Architecture Search. These algorithms have been combined with other techniques such as the pruning of Neural Networks, which reduces the complexity of the network, and Transfer Learning, which allows importing knowledge from another problem related to the one at hand. Using several criteria to evaluate the quality of the evolutionary proposals is also common, with the performance and complexity of the network being the most used criteria. This work proposes MO-EvoPruneDeepTL, a multi-objective evolutionary pruning algorithm. MO-EvoPruneDeepTL uses Transfer Learning to adapt the last layers of Deep Neural Networks by replacing them with sparse layers evolved by a genetic algorithm, which guides the evolution based on the performance, complexity, and robustness of the network, robustness being a strong quality indicator for the evolved models. We carry out different experiments with several datasets to assess the benefits of our proposal. Results show that our proposal achieves promising results in all the objectives, and direct relations among them are presented. The experiments also show that the most influential neurons help us explain which parts of the input images are the most relevant for the prediction of the pruned neural network. Lastly, by virtue of the diversity within the Pareto front of pruning patterns produced by the proposal, it is shown that an ensemble of differently pruned models improves the overall performance and robustness of the trained networks.
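The Pareto front mentioned above is the set of candidates that no other candidate beats on every objective at once. The sketch below shows that selection step in isolation, assuming three objectives that are all maximized (accuracy, sparsity as a proxy for low complexity, and robustness). The candidate names and scores are invented for illustration and do not come from the paper.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one.

    All objectives are assumed to be maximized.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(scores):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, s in scores.items()
        if not any(dominates(t, s)
                   for other, t in scores.items() if other != name)
    ]


# Hypothetical pruned models scored as (accuracy, sparsity, robustness).
candidates = {
    "A": (0.90, 0.20, 0.70),
    "B": (0.88, 0.50, 0.72),
    "C": (0.85, 0.40, 0.65),  # worse than B on all three objectives
    "D": (0.80, 0.80, 0.60),
}

front = pareto_front(candidates)
```

Here `C` drops out because `B` beats it on every objective, while `A`, `B`, and `D` each trade one objective against another. Such a diverse set of non-dominated models is the kind of pool from which an ensemble could be built.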

artificial intelligence, deep learning, machine learning, (18 more...)

2302.10253

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System

Islam, Sahidul, Zhou, Shanglin, Ran, Ran, Jin, Yufang, Wen, Wujie, Ding, Caiwen, Xie, Mimi

arXiv.org Artificial Intelligence, Sep-26-2022

IoT devices are increasingly being implemented with neural network models to enable smart applications. Energy harvesting (EH) technology that harvests energy from ambient environment is a promising alternative to batteries for powering those devices due to the low maintenance cost and wide availability of the energy sources. However, the power provided by the energy harvester is low and has an intrinsic drawback of instability since it varies with the ambient environment. This paper proposes EVE, an automated machine learning (autoML) co-exploration framework to search for desired multi-models with shared weights for energy harvesting ...

However, when DNN models come to on-board, there is a grand challenge to accommodate the giant models to tiny IoT devices with limited memory and computing resources [3, 11-13, 20, 22]. Particularly, first, embedded IoT devices have limited computational units and low CPU frequency (e.g., 1-16 MHz). Since DNNs are computationally expensive, DNN algorithm takes long on-board execution time. Second, embedded IoT devices are equipped with small memory (e.g., hundreds of KBs) which can not even afford tiny DNN models (e.g., Tens of MBs). Third, these battery-powered devices naturally have a limited standby time.

artificial intelligence, machine learning, sparsity, (15 more...)

2207.09258

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Energy > Energy Storage (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
