Stable and low-precision training for large-scale vision-language models

Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer

Neural Information Processing Systems 

Our main focus is int8, as GPU support for float8 is rare, though we also analyze float8 training through simulation.
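To make concrete what an int8 matrix multiply involves, the following is a minimal NumPy sketch of generic row-wise absmax quantization with int32 accumulation. It is an illustration under stated assumptions, not the method described in this paper: the function names `quantize_int8_rowwise` and `int8_matmul` are hypothetical, and the arithmetic runs on the CPU rather than on int8 GPU kernels.

```python
import numpy as np

def quantize_int8_rowwise(x):
    """Absmax-quantize each row of a float matrix to int8.

    Returns the int8 codes and the per-row scales needed to dequantize.
    (Hypothetical helper for illustration only.)
    """
    absmax = np.max(np.abs(x), axis=1, keepdims=True)
    # Use a scale of 1 for all-zero rows to avoid dividing by zero.
    scale = np.where(absmax == 0, 1.0, absmax / 127.0)
    q = np.clip(np.rint(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    """Approximate a @ b.T from int8 codes, accumulating in int32."""
    qa, sa = quantize_int8_rowwise(a)  # sa has shape (m, 1)
    qb, sb = quantize_int8_rowwise(b)  # sb has shape (n, 1)
    acc = qa.astype(np.int32) @ qb.astype(np.int32).T
    # Undo both quantization scales to return to floating point.
    return acc * sa * sb.T

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 16)).astype(np.float32)
b = rng.standard_normal((3, 16)).astype(np.float32)
approx = int8_matmul(a, b)
exact = a @ b.T
print("max abs error:", np.max(np.abs(approx - exact)))
```

Each row keeps its own scale, so a single large value only reduces the resolution of its own row rather than that of the whole matrix.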
