Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Neural Information Processing Systems
We explain the discrepancy by reproducing the Kaplan et al. scaling law on two datasets (OpenWebText2 and RefinedWeb)
Feb-17-2026, 16:08:32 GMT
- Country:
- Europe
- Germany (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East
- Jordan (0.04)
- Israel > Tel Aviv District
- Tel Aviv (0.04)
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Technology: