StreamBench: Towards Benchmarking Continuous Improvement of Language Agents
