AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories