Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, Lucas Beyer

Neural Information Processing Systems 

However, the simple power-law relation becomes more complicated when compute is considered.
