Likelihood-guided Regularization in Attention Based Models