Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Feb-14-2026, 16:52:46 GMT–Neural Information Processing Systems

While humans accomplish 72.4% of the tasks, the best

large language model, machine learning, programming language

Country:
- Asia
  - China > Hong Kong (0.04)
  - Japan > Honshū
    - Chūbu > Toyama Prefecture > Toyama (0.04)

Genre:
- Instructional Material > Course Syllabus & Notes (0.46)

Industry:
- Law (1.00)
- Information Technology > Software (0.69)
- Education
  - Educational Setting > Online (0.93)
  - Educational Technology (0.67)

Technology:
- Information Technology
  - Hardware (0.93)
  - Human Computer Interaction > Interfaces (0.93)
  - Software > Programming Languages (0.68)
  - Information Management (0.68)
  - Sensing and Signal Processing > Image Processing (0.67)
  - Communications
    - Social Media (1.00)
    - Mobile (1.00)
    - Web (0.68)
  - Artificial Intelligence
    - Representation & Reasoning > Agents (1.00)
    - Vision (0.92)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.93)
    - Machine Learning
      - Neural Networks > Deep Learning (0.93)
      - Learning Graphical Models > Undirected Networks > Markov Models (0.45)
