Reviews: A^2-Nets: Double Attention Networks
–Neural Information Processing Systems
This paper proposes the "double attention block" to aggregate and propagate informative global features from the entire spatial or spatio-temporal extent of the input. Specifically, the model first generates a set of attention distributions over the input positions and uses each one to pool the input into a global feature vector. Then, for each input position, it generates a second attention distribution over this set of global feature vectors and uses it to combine them into a position-specific feature vector. The proposed block can be easily plugged into existing architectures. Experiments on image recognition (ImageNet-1k) and video classification (Kinetics, UCF-101) show that the proposed model outperforms the baselines and is more efficient.
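The two attention steps summarized above can be sketched in a few lines. This is a minimal illustration based only on the description in this review, not the authors' implementation: the function name `double_attention`, the projection matrices `w_gather` and `w_distribute`, and the choice to pool the raw input features directly (rather than a learned embedding of them) are all assumptions made for the example. It uses plain Python lists so that it runs without dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def double_attention(x, w_gather, w_distribute):
    """Sketch of a double attention step (hypothetical helper).

    x            : n input positions, each a list of c features
    w_gather     : m rows of c weights (assumed projection for step 1)
    w_distribute : m rows of c weights (assumed projection for step 2)
    Returns n position-specific feature vectors of length c.
    """
    n, m, c = len(x), len(w_gather), len(x[0])

    # Step 1 (gather): for each of the m global descriptors, build one
    # attention distribution over all n positions, then pool the input
    # features with those weights.
    global_feats = []
    for k in range(m):
        attn = softmax([dot(w_gather[k], x[i]) for i in range(n)])
        global_feats.append(
            [sum(attn[i] * x[i][d] for i in range(n)) for d in range(c)]
        )

    # Step 2 (distribute): for each position, build an attention
    # distribution over the m global descriptors and mix them into a
    # position-specific vector.
    out = []
    for i in range(n):
        attn = softmax([dot(w_distribute[k], x[i]) for k in range(m)])
        out.append(
            [sum(attn[k] * global_feats[k][d] for k in range(m)) for d in range(c)]
        )
    return out
```

In this sketch each output vector is a convex combination of the global descriptors, which are themselves convex combinations of the input features. With all-zero projection weights both attention steps become uniform, so every position simply receives the mean input feature vector.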
Oct-8-2024, 07:28:27 GMT