Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents