On Dataset Transferability in Active Learning for Transformers