Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws