WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
