Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework
arXiv.org Artificial Intelligence
The Efficient Adaptive Transformer (EAT) framework unifies three adaptive efficiency techniques (progressive token pruning, sparse attention, and dynamic early exiting) into a single, reproducible architecture for input-adaptive inference. EAT provides an open-source benchmarking pipeline that automates data processing, timing, and ablation across the GLUE tasks SST-2, QQP, and MNLI. Although the empirical study finds that combining these mechanisms can increase latency in shallow six-layer models, EAT achieves slightly higher accuracy than an optimized DistilBERT baseline on SST-2, illustrating the potential of dynamic computation for latency-sensitive NLP. The main contribution is an open, end-to-end reproducible framework, complete with scripts, CSV logging, and analysis utilities, intended to serve as a community tool for further research on adaptive transformers.
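To illustrate one of the three mechanisms named above, the sketch below shows confidence-based dynamic early exiting: an intermediate classifier head is attached after each encoder layer, and inference stops as soon as the head's prediction entropy drops below a threshold. This is a minimal, generic illustration in NumPy, not EAT's actual implementation; the function names, the entropy criterion, and the mean-pooled classification are assumptions for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def entropy(probs):
    # Shannon entropy of a probability vector; low entropy = confident.
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

def early_exit_forward(hidden, layers, heads, threshold=0.3):
    """Run encoder layers sequentially, exiting early once confident.

    hidden:    (seq_len, dim) token representations
    layers:    list of callables, each mapping hidden -> hidden
    heads:     list of callables, each mapping a pooled vector -> class logits
    threshold: exit when prediction entropy falls below this value
    Returns (class probabilities, number of layers actually executed).
    """
    probs = None
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        hidden = layer(hidden)
        pooled = hidden.mean(axis=0)          # simple mean pooling
        probs = softmax(head(pooled))
        if entropy(probs) < threshold:        # confident enough: stop here
            return probs, depth
    return probs, len(layers)                 # fell through all layers
```

Easy inputs exit after one or two layers while ambiguous inputs use the full stack, which is the source of the input-adaptive latency savings; the abstract's finding that gains can vanish in shallow six-layer models reflects the per-layer overhead of evaluating these exit heads.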
Oct-16-2025