Metric Transforms and Low Rank Representations of Kernels for Fast Attention