Hard ASH: Sparsity and the right optimizer make a continual learner