Empirical Capacity Model for Self-Attention Neural Networks
