AITopics | moefication

moefication

Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion

Filip Szatkowski (IDEAS NCBR, Warsaw University of Technology), Bartosz Wójcik

Neural Information Processing Systems, Feb-12-2026, 17:02:40 GMT

Finally, we develop an efficient implementation that translates these computational savings into actual wall-clock speedup.

large language model, machine learning, natural language, (18 more...)

Country:

Europe > Poland > Masovia Province > Warsaw (0.40)
Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications (0.71)

Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion

Filip Szatkowski (IDEAS NCBR, Warsaw University of Technology), Bartosz Wójcik

Neural Information Processing Systems, Oct-10-2025, 01:40:35 GMT

Finally, we develop an efficient implementation that translates these computational savings into actual wall-clock speedup.

activation sparsity, experiment, moefication, (13 more...)

Country:

Europe > Poland > Masovia Province > Warsaw (0.40)
Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications (0.71)

Modularity in Transformers: Investigating Neuron Separability & Specialization

Pochinkov, Nicholas, Jones, Thomas, Rahman, Mohammed Rashidur

arXiv.org Artificial Intelligence, Aug-30-2024

Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze the overlap and specialization of neurons across different tasks and data subsets. Our findings reveal evidence of task-specific neuron clusters, with varying degrees of overlap between related tasks. We observe that neuron importance patterns persist to some extent even in randomly initialized models, suggesting an inherent structure that training refines. Additionally, we find that neuron clusters identified through MoEfication correspond more strongly to task-specific neurons in earlier and later layers of the models. This work contributes to a more nuanced understanding of transformer internals and offers insights into potential avenues for improving model interpretability and efficiency.
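The notion of overlap between task-specific neuron sets can be illustrated with a small sketch. It assumes each task yields one importance score per neuron, keeps the top-k neurons per task, and compares the two sets with the Jaccard index. The scores, the function names `top_k_neurons` and `jaccard`, and the choice of Jaccard as the overlap measure are illustrative assumptions, not the paper's stated procedure.

```python
def top_k_neurons(importance, k):
    """Return the indices of the k neurons with the highest importance score."""
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical per-neuron importance scores for two tasks (illustrative values only).
importance_task_a = [0.9, 0.1, 0.8, 0.2, 0.7, 0.0]
importance_task_b = [0.8, 0.0, 0.1, 0.9, 0.7, 0.2]

neurons_a = top_k_neurons(importance_task_a, k=3)
neurons_b = top_k_neurons(importance_task_b, k=3)
overlap = jaccard(neurons_a, neurons_b)
print(sorted(neurons_a), sorted(neurons_b), overlap)
```

A value near 1 would indicate that the two tasks rely on largely the same neurons, while a value near 0 would indicate separate neuron sets.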

arxiv preprint arxiv, neuron, overlap, (15 more...)

arXiv:2408.17324

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education > Curriculum > Subject-Specific Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Learn To be Efficient: Build Structured Sparsity in Large Language Models

Zheng, Haizhong, Bai, Xiaoyan, Chen, Beidi, Lai, Fan, Prakash, Atul

arXiv.org Artificial Intelligence, Feb-8-2024

Large Language Models (LLMs) have achieved remarkable success with their billion-level parameters, yet they incur high inference overheads. The emergence of activation sparsity in LLMs provides a natural approach to reduce this cost by involving only part of the parameters in inference. Existing methods focus only on utilizing this naturally formed activation sparsity, overlooking the potential for further amplifying it. In this paper, we hypothesize that LLMs can learn to be efficient by achieving more structured activation sparsity. To achieve this, we introduce a novel algorithm, Learn-To-be-Efficient (LTE), designed to train efficiency-aware LLMs to activate fewer neurons and achieve a better trade-off between sparsity and performance. Furthermore, unlike SOTA MoEfication methods, which mainly focus on ReLU-based models, LTE can also be applied to LLMs like GPT and LLaMA with soft activation functions. We evaluate LTE on four models and eleven datasets. The experiments show that LTE achieves a better trade-off between sparsity and task performance. For instance, LTE with LLaMA provides a 1.83x-2.59x FLOPs speed-up on language generation tasks, outperforming the state-of-the-art methods.
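As a rough illustration of what activation sparsity means here, the sketch below computes the hidden activations of a tiny ReLU feed-forward layer for one input and reports the fraction of inactive neurons. The weights, the input, and the helper names are made up for illustration. The `eps` threshold is an assumption added for soft activations such as GELU or SiLU, which rarely produce exact zeros. It is not taken from the LTE paper.

```python
def relu(x):
    """Rectified linear unit for a single scalar."""
    return x if x > 0.0 else 0.0


def hidden_activations(x, w_in, b_in):
    """Hidden layer of a feed-forward block: relu(w_in @ x + b_in), one value per neuron."""
    return [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_in, b_in)
    ]


def activation_sparsity(acts, eps=0.0):
    """Fraction of neurons whose activation magnitude is at most eps."""
    if not acts:
        raise ValueError("acts must not be empty")
    inactive = sum(1 for a in acts if abs(a) <= eps)
    return inactive / len(acts)


# Toy layer with 2 inputs and 4 hidden neurons (illustrative values only).
x = [1.0, -2.0]
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
b_in = [0.0, 0.0, 0.0, 0.0]

acts = hidden_activations(x, w_in, b_in)
sparsity = activation_sparsity(acts)
print(acts, sparsity)
```

Neurons that are inactive for a given input contribute nothing to the layer output, so the corresponding rows and columns of the weight matrices can in principle be skipped for that input.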

dataset, neuron, sparsity, (13 more...)

arXiv:2402.06126

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Michigan (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

CPET: Effective Parameter-Efficient Tuning for Compressed Large Language Models

Zhao, Weilin, Huang, Yuxiang, Han, Xu, Liu, Zhiyuan, Zhang, Zhengyan, Sun, Maosong

arXiv.org Artificial Intelligence, Nov-15-2023

Parameter-efficient tuning (PET) has been widely explored in recent years because it tunes far fewer parameters (PET modules) than full-parameter fine-tuning (FT) while still stimulating sufficient knowledge from large language models (LLMs) for downstream tasks. Moreover, when PET is employed to serve multiple tasks, different task-specific PET modules can be built on a frozen LLM, avoiding redundant LLM deployments. Although PET significantly reduces the cost of tuning and deploying LLMs, its inference still suffers from the computational bottleneck of LLMs. To address this issue, we propose an effective PET framework based on compressed LLMs, named "CPET". In CPET, we evaluate the impact of mainstream LLM compression techniques on PET performance and then introduce knowledge inheritance and recovery strategies to restore the knowledge lost through these compression techniques. Our experimental results demonstrate that, owing to the restoring strategies of CPET, combining task-specific PET modules with a compressed LLM achieves performance comparable to combining PET modules with the original, uncompressed LLM, and outperforms directly applying vanilla PET methods to the compressed LLM.

large language model, machine learning, natural language, (18 more...)

arXiv:2307.07705

Country:

Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Exploiting Transformer Activation Sparsity with Dynamic Inference

Piórczyński, Mikołaj, Szatkowski, Filip, Bałazy, Klaudia, Wójcik, Bartosz

arXiv.org Artificial Intelligence, Oct-6-2023

Previous studies have revealed significant activation sparsity in Transformer models, indicating the presence of redundant computations. In this paper, we propose Dynamic Sparsified Transformer Inference (DSTI), a method that radically reduces the inference cost of Transformer models by enforcing activation sparsity and subsequently transforming a dense model into its sparse Mixture of Experts (MoE) version. We demonstrate that it is possible to train small gating networks that successfully predict the relative contribution of each expert during inference. Furthermore, we introduce a mechanism that dynamically determines the number of executed experts individually for each token. DSTI can be applied to any Transformer-based architecture and has negligible impact on accuracy. For the BERT-base classification model, we reduce inference cost by almost 60%.
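One way to picture a per-token, variable number of executed experts is sketched below. It assumes a gating network has already produced a non-negative predicted contribution score for each expert, and it keeps every expert whose score is at least a fraction `tau` of the token's highest score. The threshold rule, the name `select_experts`, the value of `tau`, and the scores are illustrative assumptions and should not be read as the exact mechanism described in the paper.

```python
def select_experts(scores, tau=0.5):
    """Return indices of experts whose predicted contribution is >= tau * max(scores).

    Assumes non-negative scores and 0 < tau <= 1, so the top expert is always kept.
    """
    if not scores:
        return []
    threshold = tau * max(scores)
    return [i for i, s in enumerate(scores) if s >= threshold]


# Hypothetical gating outputs for two tokens over four experts.
token_scores = {
    "peaked": [0.05, 0.90, 0.10, 0.02],  # one expert dominates
    "flat": [0.40, 0.45, 0.50, 0.42],    # contributions are spread out
}

selected = {name: select_experts(s, tau=0.5) for name, s in token_scores.items()}
for name, experts in selected.items():
    print(name, experts)
```

Under this rule a token with one dominant expert executes a single expert, while a token with evenly spread scores executes all of them, so the compute spent varies from token to token.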

activation sparsity, machine learning, natural language, (15 more...)

arXiv:2310.04361

Country:

Europe > Poland > Masovia Province > Warsaw (0.05)
Europe > Finland > North Karelia > Joensuu (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
