Flash normalization: fast RMSNorm for LLMs
