Transformer Approximations from ReLUs
Jerry Yao-Chieh Hu, Mingcheng Lu, Yi-Chen Lee, Han Liu
We present a systematic recipe for translating ReLU approximation results to softmax Transformers. Given a constructive ReLU approximator for a target, we construct an explicit softmax Transformer with the same accuracy. The recipe applies to many common approximation targets and yields quantitative resource bounds beyond universal approximation statements. This matters because broad Universal Approximation Properties (UAPs) still dominate Transformer approximation theory. For softmax Transformers, many universality results provide explicit constructions and quantitative resource bounds (e.g., parameters, depth, width) [Yun et al., 2020, Kajitsuka and Sato, 2023, Takakura and Suzuki, 2023, Jiang and Li, 2024, Hu et al., 2025,
Apr-29-2026
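As a toy illustration of why a softmax can stand in for a ReLU at all, the sketch below uses a two-token attention readout with logits (beta * x, 0) and values (x, 0). Its output is x * sigmoid(beta * x), which approaches ReLU(x) as beta grows. This is a standard identity chosen for illustration, not the construction from the paper; the names `attention_relu`, `max_error`, and the scale `beta` are assumptions introduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    return max(x, 0.0)


def attention_relu(x, beta):
    """Two-token softmax readout (illustrative, not the paper's recipe).

    Logits are (beta * x, 0) and values are (x, 0), so the output is
    x * sigmoid(beta * x), a smooth surrogate for ReLU(x).
    """
    weights = softmax([beta * x, 0.0])
    values = [x, 0.0]
    return sum(w * v for w, v in zip(weights, values))


def max_error(beta, lo=-3.0, hi=3.0, steps=6001):
    """Largest gap between the readout and ReLU on a uniform grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(abs(attention_relu(x, beta) - relu(x)) for x in grid)


if __name__ == "__main__":
    for beta in (1.0, 10.0, 100.0):
        print(f"beta={beta:g}  max |attention - relu| = {max_error(beta):.6f}")
```

In this toy setting the gap equals |x| / (1 + exp(beta * |x|)), so it shrinks in proportion to 1/beta (at most about 0.28 / beta). That is the kind of accuracy-versus-resource trade-off that quantitative bounds make explicit, though the paper's own bounds concern parameters, depth, and width rather than this scale.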