Steering CLIP's vision transformer with sparse autoencoders