Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework

Oct-16-2025–arXiv.org Artificial Intelligence

The Efficient Adaptive Transformer (EAT) framework unifies three adaptive efficiency techniques - progressive token pruning, sparse attention, and dynamic early exiting - into a single, reproducible architecture for input-adaptive inference. EAT provides an open-source benchmarking pipeline that automates data processing, timing, and ablation across GLUE tasks (SST-2, QQP, MNLI). Although this empirical study finds that combining these mechanisms can increase latency in shallow six-layer models, it demonstrates that EAT achieves slightly higher accuracy than the optimized DistilBERT baseline on SST-2, illustrating the potential of dynamic computation for latency-sensitive NLP. The main contribution is the open, end-to-end reproducible framework - complete with scripts, CSV logging, and analysis utilities - intended to serve as a community tool for further research on adaptive transformers.

machine learning, natural language, pruning, (17 more...)

arXiv.org Artificial Intelligence

Oct-16-2025

arXiv.org PDF

Add feedback

Genre:
- Research Report (1.00)

Industry:
- Information Technology (0.88)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (0.68)
  - Machine Learning > Neural Networks (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found