Auto Learning Attention: Supplementary Material
Neural Information Processing Systems
The initial learning rate is 0.1, and the weight decay is set to 0.0005. The batch size is 256. The results are summarised in Table 3 of the paper. We replace it with ResNet50 to evaluate the performance of different attention modules. The conv5_x, average pooling, fc, and softmax layers are removed from the original classification model.
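As a rough illustration of the settings above, the following sketch records the stated hyperparameters and shows which stages remain once conv5_x, average pooling, fc, and softmax are dropped from a ResNet50 classifier. The names `TrainConfig`, `RESNET50_LAYERS`, and `truncate_backbone` are hypothetical, and the channel widths and strides are those of the standard ResNet50 layout, which is an assumption here rather than something the excerpt states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the text (container name is hypothetical)."""
    initial_lr: float = 0.1
    weight_decay: float = 0.0005
    batch_size: int = 256


# (layer name, output channels, cumulative stride) for a standard ResNet50.
# Assumption: the usual ImageNet layout; the excerpt does not list these values.
RESNET50_LAYERS = [
    ("conv1", 64, 2),
    ("maxpool", 64, 4),
    ("conv2_x", 256, 4),
    ("conv3_x", 512, 8),
    ("conv4_x", 1024, 16),
    ("conv5_x", 2048, 32),
    ("avgpool", 2048, None),
    ("fc", 1000, None),
    ("softmax", 1000, None),
]

# Layers the text says are removed from the original classification model.
REMOVED = {"conv5_x", "avgpool", "fc", "softmax"}


def truncate_backbone(layers, removed):
    """Return the layers that survive after dropping the named ones."""
    return [layer for layer in layers if layer[0] not in removed]


config = TrainConfig()
backbone = truncate_backbone(RESNET50_LAYERS, REMOVED)
last_name, out_channels, out_stride = backbone[-1]
print(config)
print(last_name, out_channels, out_stride)  # conv4_x 1024 16
```

Under these assumptions the truncated model ends at conv4_x, so whatever is attached after it would see a 1024-channel feature map at 1/16 of the input resolution.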
Oct-2-2025, 02:12:23 GMT