Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

Neural Information Processing Systems 

However, the simple power-law relation becomes more complicated when compute is considered.
