Is forgetting less a good inductive bias for forward transfer?