Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design