MARPLE: A Benchmark for Long-Horizon Inference
Emily Jin
