A ImageNet Pre-training

Table 1: Training settings on ImageNet classification

Neural Information Processing Systems
Both RTFormer-Slim and RTFormer-Base outperform the corresponding DDRNet variants. The self-attention used for comparison follows (12). For Linformer attention, we report the result directly, without modifying its hyper-parameters. Multi-head external attention achieves good inference speed, benefiting from its linear complexity and its design of sharing external parameters across multiple heads. "#Params" denotes the number of parameters.
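To make the linear-complexity claim concrete, the following is a minimal NumPy sketch of multi-head external attention, under simplifying assumptions: the external key/value memories `mk` and `mv` are small learned matrices shared across all heads, double normalization replaces the usual scaled softmax, and batch dimension and learned projections are omitted. All names here are illustrative, not the paper's implementation. Cost per head is O(N * S) for N tokens and S memory slots, i.e. linear in sequence length.

```python
import numpy as np

def multi_head_external_attention(x, mk, mv, num_heads):
    """Sketch of multi-head external attention (illustrative, simplified).

    x:  (N, C) token features
    mk: (S, C // num_heads) external key memory, shared across heads
    mv: (S, C // num_heads) external value memory, shared across heads
    """
    N, C = x.shape
    d = C // num_heads
    heads = x.reshape(N, num_heads, d)
    out = np.empty_like(heads)
    for h in range(num_heads):
        attn = heads[:, h, :] @ mk.T                      # (N, S)
        # softmax over the S memory slots
        attn = np.exp(attn - attn.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)
        # double normalization: l1-normalize over tokens
        attn /= attn.sum(axis=0, keepdims=True) + 1e-9
        out[:, h, :] = attn @ mv                          # (N, d)
    return out.reshape(N, C)
```

Because `mk` and `mv` are shared by every head, the parameter count stays small while each head still attends over the same external memory, which is the property the caption credits for the good speed/parameter trade-off.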