How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
