TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Jun-19-2026, 06:27:10 GMT–Neural Information Processing Systems

Temporal reasoning is pivotal for Large Language Models (LLMs) to comprehend the real world. However, existing works neglect the real-world challenges for temporal reasoning: (1) intensive temporal information, (2) fast-changing event dynamics, and (3) complex temporal dependencies in social interactions. To bridge this gap, we propose TIME, a multi-level benchmark designed for temporal reasoning in real-world scenarios. TIME consists of 38,522 QA pairs, covering 3 levels with 11 fine-grained sub-tasks. The benchmark encompasses 3 sub-datasets reflecting different real-world challenges: TIME-WIKI, TIME-NEWS, and TIME-DIAL. We conduct extensive experiments on reasoning models and non-reasoning models, present an in-depth analysis of temporal reasoning performance across diverse real-world scenarios and tasks, and summarize the impact of test-time scaling on temporal reasoning capabilities. Additionally, we release TIME-LITE, a human-annotated subset to foster future research and standardized evaluation in temporal reasoning.

large language model, machine learning, temporal reasoning

Country:
- North America > United States (1.00)
- Europe (1.00)
- Asia (1.00)

Genre:
- Research Report > Experimental Study (1.00)
- Overview (0.67)

Industry:
- Education (0.93)
- Media (0.92)
- Leisure & Entertainment > Sports (0.68)
- Information Technology > Security & Privacy (0.67)
- Government > Regional Government
  - North America Government > United States Government (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Temporal Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.71)
