AITopics | compress


compress

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DynaBERT: Dynamic BERT with Adaptive Width and Depth

Neural Information Processing Systems, Dec-24-2025, 04:03:41 GMT

Pre-trained language models like BERT, though powerful in many natural language processing tasks, are expensive in both computation and memory. To alleviate this problem, one approach is to compress them for specific tasks before deployment. However, recent work on BERT compression usually compresses the large BERT model to a fixed smaller size and cannot fully satisfy the requirements of different edge devices with varying hardware performance. In this paper, we propose a novel dynamic BERT model (abbreviated as DynaBERT), which can flexibly adjust its size and latency by selecting adaptive width and depth. The training process of DynaBERT includes first training a width-adaptive BERT and then allowing both adaptive width and depth, by distilling knowledge from the full-sized model to small sub-networks. Network rewiring is also used to keep the more important attention heads and neurons shared by more sub-networks. Comprehensive experiments under various efficiency constraints demonstrate that our proposed dynamic BERT (or RoBERTa) at its largest size has performance comparable to BERT-base (or RoBERTa-base), while at smaller widths and depths it consistently outperforms existing BERT compression methods.
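The idea of keeping the more important attention heads shared by more sub-networks can be illustrated with a minimal sketch. It is not the DynaBERT implementation: the importance scores are made up, and `rewire_by_importance` and `heads_for_width` are hypothetical helpers. The sketch assumes each head already has an importance score, sorts heads by that score, and lets a sub-network with width multiplier m keep the first ceil(m * N) heads, so that narrower sub-networks are nested inside wider ones.

```python
import math


def rewire_by_importance(importance):
    """Return head indices sorted from most to least important."""
    return sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)


def heads_for_width(order, width_mult):
    """Keep the leading fraction of the sorted heads for a width multiplier."""
    keep = max(1, math.ceil(width_mult * len(order)))
    return order[:keep]


# Hypothetical importance scores for 12 attention heads in one layer.
scores = [0.10, 0.90, 0.30, 0.05, 0.70, 0.20, 0.80, 0.15, 0.60, 0.25, 0.40, 0.50]
order = rewire_by_importance(scores)

# Every narrower sub-network is a prefix of the wider ones, so the most
# important heads appear in all of them.
for width_mult in (0.25, 0.5, 0.75, 1.0):
    print(width_mult, heads_for_width(order, width_mult))
```

Because each selection is a prefix of the same sorted list, the head with the highest score is part of every sub-network, which is the sharing property the abstract describes.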

dynabert, dynamic bert, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Compressing Chemistry Reveals Functional Groups

Sharma, Ruben, King, Ross D.

arXiv.org Artificial Intelligence, Nov-11-2025

We introduce the first formal large-scale assessment of the utility of traditional chemical functional groups as used in chemical explanations. Our assessment employs a fundamental principle from computational learning theory: a good explanation of data should also compress the data. We introduce an unsupervised learning algorithm based on the Minimum Message Length (MML) principle that searches for substructures that compress around three million biologically relevant molecules. We demonstrate that the discovered substructures contain most human-curated functional groups as well as novel larger patterns with more specific functions. We also run our algorithm on 24 specific bioactivity prediction datasets to discover dataset-specific functional groups. Fingerprints constructed from dataset-specific functional groups are shown to significantly outperform other fingerprint representations, including the MACCS and Morgan fingerprints, when training ridge regression models on bioactivity regression tasks.

artificial intelligence, machine learning, substructure, (19 more...)

arXiv.org Artificial Intelligence

2511.05728

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials > Chemicals (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)


BaldWhisper: Faster Whisper with Head Shearing and Layer Merging

Sy, Yaya, Cerisara, Christophe, Illina, Irina

arXiv.org Artificial Intelligence, Oct-13-2025

Pruning large pre-trained transformers for low-resource languages is challenging, as it often requires massive retraining data to recover performance. For instance, Distill-Whisper prunes Whisper by 40% and retrains on 21,000 hours of speech, far beyond what is available for most languages. Can Whisper be made lighter and faster for edge devices in data-scarce settings? Focusing on Bambara with only 32h of speech-to-text data, we propose a new pruning recipe. Instead of vocabulary pruning, which is unsuitable due to frequent code-switching by Bambara speakers, we compress the embeddings with low-rank decomposition and feature distillation. Rather than removing layers, we merge them to limit performance loss. The final model preserves 90% of the original performance while being 48% smaller and 2.15x faster on a MacBook Air M1.
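To see why a low-rank decomposition shrinks an embedding table, the parameter arithmetic can be sketched as follows. The sizes are illustrative assumptions, not Whisper's or the paper's actual dimensions, and the function names are hypothetical. A table of shape (vocab x dim) is approximated by the product of a (vocab x rank) matrix and a (rank x dim) matrix.

```python
def full_embedding_params(vocab_size, dim):
    """Parameter count of a dense embedding table E with shape (vocab, dim)."""
    return vocab_size * dim


def low_rank_embedding_params(vocab_size, dim, rank):
    """Parameter count when E is approximated by A @ B,
    with A of shape (vocab, rank) and B of shape (rank, dim)."""
    return vocab_size * rank + rank * dim


def relative_reduction(vocab_size, dim, rank):
    """Fraction of embedding parameters removed by the factorization."""
    full = full_embedding_params(vocab_size, dim)
    low = low_rank_embedding_params(vocab_size, dim, rank)
    return 1 - low / full


# Illustrative sizes only (assumed for this example).
vocab_size, dim, rank = 50_000, 512, 128
print(full_embedding_params(vocab_size, dim))
print(low_rank_embedding_params(vocab_size, dim, rank))
print(round(relative_reduction(vocab_size, dim, rank), 3))
```

The factorization only saves parameters when rank is smaller than vocab * dim / (vocab + dim); how much accuracy survives depends on training, which the abstract addresses with feature distillation.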

decoder, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.08599

Country: North America > United States > Texas (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.49)


8fc983a91396319d8c394084e2d749d7-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-3-2025, 05:21:37 GMT

We are grateful to the reviewers for their feedback. We address their concerns here. Reviewer #1: Thank you very much for your thoughtful and detailed review. We will include all your suggestions. We now extend eq. 3 in paper to have Reviewer #2: Thank you very much for your constructive feedback.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)


PySHRED: A Python package for SHallow REcurrent Decoding for sparse sensing, model reduction and scientific discovery

Ye, David, Williams, Jan, Gao, Mars, Riva, Stefano, Tomasetto, Matteo, Zoro, David, Kutz, J. Nathan

arXiv.org Artificial Intelligence, Jul-29-2025

PySHRED is a Python package that implements the SHallow REcurrent Decoder (SHRED) architecture (Figure 1) and provides a high-level interface for sensing, model reduction and physics discovery tasks. Originally proposed as a sensing strategy which is agnostic to sensor placement [1], SHRED provides a lightweight, data-driven framework for reconstructing and forecasting high-dimensional spatiotemporal states from sparse sensor measurements. SHRED achieves this by (i) encoding time-lagged sensor sequences into a low-dimensional latent space using a sequence model, and (ii) decoding these latent representations back into the full spatial field via a decoder model. Since its introduction as a sparse sensing algorithm, several specialized variants have been developed to extend SHRED's capabilities: SHRED-ROM for parametric reduced-order modeling; SINDy-SHRED for discovering sparse latent dynamics and stable long-horizon forecasting; and Multi-field SHRED for modeling dynamically coupled fields. PySHRED unifies these variants into a single open-source, extensible, and thoroughly documented Python package, which is also capable of training on compressed representations of the data, allowing for efficient laptop-level training of models. It is accompanied by a rich example gallery of Jupyter Notebook and Google Colab tutorials.
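The input to step (i), time-lagged sensor sequences, can be sketched in plain Python. This is not the PySHRED API: `make_lagged_sequences` is a hypothetical helper, the readings are invented, and the sketch simply drops the first few time steps that lack a full history, which the package may handle differently.

```python
def make_lagged_sequences(measurements, lags):
    """Build sliding windows of `lags` consecutive time steps.

    measurements: list of per-time-step readings, one list of floats per step.
    Returns one window per time step that has a full history available.
    """
    if lags < 1:
        raise ValueError("lags must be at least 1")
    windows = []
    for t in range(lags - 1, len(measurements)):
        windows.append(measurements[t - lags + 1 : t + 1])
    return windows


# Invented readings: 5 time steps from 2 sensors.
readings = [[0.1, 1.0], [0.2, 1.1], [0.3, 1.2], [0.4, 1.3], [0.5, 1.4]]
windows = make_lagged_sequences(readings, lags=3)

for window in windows:
    # Each window would be fed to the sequence model; the decoder target is
    # the full spatial field at the window's last time step.
    print(window)
```

With 5 time steps and 3 lags, the helper yields 3 windows, each ending at a different time step.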

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.20954

Country: North America > United States > Washington > King County > Seattle (0.16)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Communications > Networks (0.66)


RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models

Tran, Hieu, Yao, Zonghai, Wang, Junda, Zhang, Yifan, Yang, Zhichao, Yu, Hong

arXiv.org Artificial Intelligence, Dec-9-2024

This work introduces RARE (Retrieval-Augmented Reasoning Enhancement), a versatile extension to the mutual reasoning framework (rStar), aimed at enhancing reasoning accuracy and factual integrity across large language models (LLMs) for complex, knowledge-intensive tasks such as commonsense and medical reasoning. RARE incorporates two innovative actions within the Monte Carlo Tree Search (MCTS) framework: A6, which generates search queries based on the initial problem statement, performs information retrieval using those queries, and augments reasoning with the retrieved data to formulate the final answer; and A7, which leverages information retrieval specifically for generated sub-questions and re-answers these sub-questions with the relevant contextual information. Additionally, a Retrieval-Augmented Factuality Scorer is proposed to replace the original discriminator, prioritizing reasoning paths that meet high standards of factuality. Experimental results with LLaMA 3.1 show that RARE enables open-source LLMs to achieve competitive performance with top closed-source models like GPT-4 and GPT-4o. This research establishes RARE as a scalable solution for improving LLMs in domains where logical coherence and factual integrity are critical.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.0283

Country:

North America > United States > Massachusetts > Middlesex County > Lowell (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Worcester County > Worcester (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.95)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


DynaBERT: Dynamic BERT with Adaptive Width and Depth

Neural Information Processing Systems, Oct-10-2024, 12:09:28 GMT

bert model, dynabert, dynamic bert, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Unified Low-rank Compression Framework for Click-through Rate Prediction

Yu, Hao, Fu, Minghao, Ding, Jiandong, Zhou, Yusheng, Wu, Jianxin

arXiv.org Artificial Intelligence, Jun-11-2024

Deep Click-Through Rate (CTR) prediction models play an important role in modern industrial recommendation scenarios. However, high memory overhead and computational costs limit their deployment in resource-constrained environments. Low-rank approximation is an effective method for computer vision and natural language processing models, but its application to compressing CTR prediction models has been less explored. Because memory and computing resources are limited, compression of CTR prediction models often confronts three fundamental challenges: (1) how to reduce model size to fit edge devices, (2) how to speed up CTR prediction model inference, and (3) how to retain the capabilities of the original models after compression. Previous low-rank compression research mostly uses tensor decomposition, which can achieve a high parameter compression ratio but introduces AUC degradation and additional computing overhead. To address these challenges, we propose a unified low-rank decomposition framework for compressing CTR prediction models. We find that even with classic SVD matrix decomposition, our framework can achieve better performance than the original model. To further improve the effectiveness of our framework, we locally compress the output features instead of compressing the model weights. Our unified low-rank compression framework can be applied to embedding tables and MLP layers in various CTR prediction models. Extensive experiments on two academic datasets and one real industrial benchmark demonstrate that, with 3-5x model size reduction, our compressed models can achieve both faster inference and higher AUC than the uncompressed original models. Our code is at https://github.com/yuhao318/Atomic_Feature_Mimicking.

dataset, dimension, fine-tune 0, (12 more...)

arXiv.org Artificial Intelligence

2405.18146

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
Asia > China > Jiangsu Province > Nanjing (0.04)
(15 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference

Yang, Dongjie, Han, XiaoDong, Gao, Yan, Hu, Yao, Zhang, Shilin, Zhao, Hai

arXiv.org Artificial Intelligence, Jun-5-2024

Large Language Models (LLMs) have shown remarkable comprehension abilities but face challenges in GPU memory usage during inference, hindering their scalability for real-time applications like chatbots. To accelerate inference, we store computed keys and values (KV cache) in GPU memory. Existing methods study KV cache compression, reducing memory by pruning the pre-computed KV cache. However, they neglect the dependency between layers and the huge memory consumption of pre-computation. Investigating these deficiencies, we find that the number of crucial keys and values that influence future generations decreases layer by layer, and that we can extract them using the consistency in attention weights. Based on these findings, we propose PyramidInfer, a method that compresses the KV cache by retaining crucial context layer by layer. PyramidInfer saves significant memory by computing fewer keys and values without sacrificing performance. Experimental results show that PyramidInfer achieves 2.2x higher throughput than Accelerate, with over 54% GPU memory reduction in the KV cache.
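The layer-by-layer shrinking of the retained context can be sketched as below. This is a simplified stand-in, not the PyramidInfer algorithm: the paper selects crucial keys and values by the consistency of attention weights, whereas this sketch just keeps the top-weighted positions per layer. The weights, the keep ratios, and the function names are all assumptions made for illustration.

```python
def top_positions(weights, keep):
    """Indices of the `keep` largest weights, returned in ascending order."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:keep])


def pyramid_select(layer_weights, keep_ratios):
    """Select retained context positions layer by layer.

    layer_weights[l][p] is an assumed averaged attention weight received by
    position p in layer l. Each layer chooses only among the positions the
    previous layer retained, so the retained set shrinks with depth.
    """
    retained = list(range(len(layer_weights[0])))
    per_layer = []
    for weights, ratio in zip(layer_weights, keep_ratios):
        keep = max(1, int(len(retained) * ratio))
        local = top_positions([weights[p] for p in retained], keep)
        retained = [retained[i] for i in local]
        per_layer.append(retained)
    return per_layer


# Invented weights for 8 context positions across 3 layers.
layer_weights = [
    [0.30, 0.05, 0.10, 0.02, 0.20, 0.08, 0.15, 0.12],
    [0.25, 0.01, 0.05, 0.01, 0.30, 0.04, 0.10, 0.24],
    [0.40, 0.00, 0.00, 0.00, 0.35, 0.00, 0.00, 0.25],
]
selection = pyramid_select(layer_weights, keep_ratios=[0.75, 0.5, 0.5])
for layer, positions in enumerate(selection):
    print(layer, positions)
```

Deeper layers store keys and values for fewer positions, which is where the memory saving in this toy version comes from.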

cache, kv cache, pyramidinfer, (15 more...)

arXiv.org Artificial Intelligence

2405.12532

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Hybrid Continuum-Eversion Robot: Precise Navigation and Decontamination in Nuclear Environments using Vine Robot

Al-Dubooni, Mohammed, Wong, Cuebong, Althoefer, Kaspar

arXiv.org Artificial Intelligence, Apr-19-2024

Soft growing vine robots show great potential for navigation and decontamination tasks in the nuclear industry. This paper introduces a novel hybrid continuum-eversion robot designed to address certain challenges in relation to navigating and operating within pipe networks and enclosed remote vessels. The hybrid robot combines the flexibility of a soft eversion robot with the precision of a continuum robot at its tip, allowing for controlled steering and movement in hard to access and/or complex environments. The design enables the delivery of sensors, liquids, and aerosols to remote areas, supporting remote decontamination activities. This paper outlines the design and construction of the robot and the methods by which it achieves selective steering. We also include a comprehensive review of current related work in eversion robotics, as well as other steering devices and actuators currently under research, which underpin this novel active steering approach. This is followed by an experimental evaluation that demonstrates the robot's real-world capabilities in delivering liquids and aerosols to remote locations. The experiments reveal successful outcomes, with over 95% success in precision spraying tests. The paper concludes by discussing future work alongside limitations in the current design, ultimately showcasing its potential as a solution for remote decontamination operations in the nuclear industry.

diameter, eversion robot, robot, (14 more...)

arXiv.org Artificial Intelligence

2404.13135

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Power Industry > Utilities > Nuclear (0.69)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)
