Scaling Laws for Linear Complexity Language Models
