Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression