Learning heavy-tailed distributions with Wasserstein-proximal-regularized $\alpha$-divergences