Revisiting Kernel Attention with Correlated Gaussian Process Representation