BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Neural Information Processing Systems
To bridge this gap, we introduce the BABILong benchmark, designed to test language models' ability to reason across facts distributed in extremely
Oct-10-2025, 15:30:59 GMT
- Country:
- North America
- United States (0.04)
- Dominican Republic (0.04)
- Puerto Rico > San Juan
- San Juan (0.04)
- Europe
- United Kingdom > England
- Greater London > London (0.04)
- Russia > Central Federal District
- Moscow Oblast > Moscow (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Asia
- Russia (0.14)
- Singapore (0.04)
- Indonesia > Bali (0.04)
- China (0.04)
- Japan > Kyūshū & Okinawa
- Kyūshū > Nagasaki Prefecture > Nagasaki (0.04)
- Africa > Rwanda
- Genre:
- Research Report > New Finding (0.92)
- Industry:
- Law (1.00)
- Technology: