Towards Learning Universal Hyperparameter Optimizers with Transformers