Unlocking Tokens as Data Points for Generalization Bounds on Large Language Models

Neural Information Processing Systems 

With Monarch matrices, Kronecker factorizations, and post-training quantization, we achieve non-vacuous generalization bounds for LLMs as large as LLaMA2-70B.
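As a toy illustration of why structured factorizations like those mentioned above compress a model (a minimal sketch, not the paper's actual code), the snippet below approximates a dense weight matrix as a Kronecker product `A ⊗ B`, cutting the parameter count from `(m*p)*(n*q)` down to `m*n + p*q`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Factor shapes: the full matrix W will be (m*p) x (n*q).
m, n, p, q = 8, 8, 16, 16
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))

# Reconstruct the full 128 x 128 matrix from the two small factors.
W = np.kron(A, B)

dense_params = W.size                 # parameters if W were stored densely
factored_params = A.size + B.size     # parameters stored in the factorization

compression = dense_params / factored_params
print(f"dense={dense_params}, factored={factored_params}, ratio={compression:.1f}x")
```

Here a 16,384-parameter dense matrix is represented with only 320 parameters, a 51.2x reduction; it is this kind of shrinkage in effective model size that makes non-vacuous bounds attainable at LLM scale.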
