Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning
Appendix

Neural Information Processing Systems 

We display the observed computational costs in Tables 6 and 7 for the Protein and Higgs datasets.