Measuring temporal effects of agent knowledge by date-controlled tool use
Xian, R. Patrick, Cui, Qiming, Bauer, Stefan, Abbasi-Asl, Reza
arXiv.org Artificial Intelligence
Temporal progression is an integral part of knowledge accumulation and update. Web search is frequently adopted as grounding for agent knowledge, yet its inappropriate configuration affects the quality of agent responses. Here, we construct a tool-based out-of-sample testing framework to measure the knowledge variability of large language model (LLM) agents from distinct date-controlled tools (DCTs). We demonstrate the temporal effects of an LLM agent as a writing assistant, which can use web search to help complete scientific publication abstracts. We show that the temporal effects of the search engine translate into tool-dependent agent performance but can be alleviated with base model choice and explicit reasoning instructions such as chain-of-thought prompting. Our results indicate that agent evaluation should take a dynamical view and account for the temporal influence of tools and the updates of external resources.
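To make the idea of a date-controlled tool concrete, here is a minimal Python sketch that restricts a toy in-memory corpus to results published on or before a cutoff date. The names `SearchResult`, `date_controlled_search`, and `CORPUS` are hypothetical, and the corpus entries are made up for illustration; the abstract does not specify how the paper's tools are configured, so this is an assumption about one possible setup, not the authors' implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchResult:
    title: str
    published: date  # publication date attached to the result


def date_controlled_search(results, query, cutoff):
    """Return results whose title contains `query` and whose date is <= `cutoff`."""
    q = query.lower()
    return [r for r in results if q in r.title.lower() and r.published <= cutoff]


# Toy corpus standing in for a search index (hypothetical entries).
CORPUS = [
    SearchResult("Agent evaluation basics", date(2021, 5, 1)),
    SearchResult("Agent evaluation with tools", date(2023, 9, 15)),
    SearchResult("Unrelated topic", date(2022, 1, 1)),
]

# The same query issued through two tools that differ only in their cutoff date.
early = date_controlled_search(CORPUS, "agent evaluation", date(2022, 12, 31))
late = date_controlled_search(CORPUS, "agent evaluation", date(2024, 12, 31))
print(len(early), len(late))
```

Running the same query through tools with different cutoffs returns different result sets, which is the kind of tool-dependent variability the abstract refers to.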
Mar-6-2025
- Country:
- Asia
- Europe
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Slovenia > Drava
- Municipality of Benedikt > Benedikt (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- North America
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- California > San Francisco County
- San Francisco (0.04)
- Colorado > Boulder County
- Boulder (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- New York > New York County
- New York City (0.04)
- Genre:
- Research Report > New Finding (0.66)
- Industry:
- Health & Medicine (1.00)
- Technology: