Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
–Neural Information Processing Systems
To address the novel challenge of evaluating time-varying and complex answers, we propose a novel Agent-as-a-Judge framework. Our method constructs task-specific judge agents based on a tree-structured [...] answer correctness and source attribution. [...] of ten frontier agentic search systems [...]
Jun-23-2026, 04:20:24 GMT
- Country:
- North America > United States (1.00)
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Government (1.00)
- Information Technology > Security & Privacy (0.45)
- Banking & Finance > Economy (0.45)
- Technology:
- Information Technology
- Information Management > Search (1.00)
- Communications (1.00)
- Artificial Intelligence
- Representation & Reasoning > Agents (1.00)
- Cognitive Science (0.92)
- Vision (0.67)
- Natural Language
- Large Language Model (1.00)
- Chatbot (1.00)
- Machine Learning > Neural Networks
- Deep Learning (0.94)