Revisiting Kernel Attention with Correlated Gaussian Process Representation
