Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

Anonymous Author(s)
Affiliation
Address
email

Scaling laws have been recently employed to derive compute-optimal model size
