MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Zhu, Zichen; Tang, Hao; Li, Yansi; Lan, Kunyao; Jiang, Yixuan; Zhou, Hao; Wang, Yixiao; Zhang, Situo; Sun, Liangtai; Chen, Lu; Yu, Kai
arXiv.org Artificial Intelligence
Current mobile assistants are limited by dependence on system APIs or struggle with complex user instructions and diverse interfaces due to restricted comprehension and decision-making abilities. To address these challenges, we propose MobA, a novel Mobile phone Agent powered by multimodal large language models that enhances comprehension and planning capabilities through a sophisticated two-level agent architecture. The high-level Global Agent (GA) is responsible for understanding user commands, tracking history memories, and planning tasks. The low-level Local Agent (LA) predicts detailed actions in the form of function calls, guided by sub-tasks and memory from the GA. Integrating a Reflection Module allows for efficient task completion and enables the system to handle previously unseen complex tasks. MobA demonstrates significant improvements in task execution efficiency and completion rate in real-life evaluations, underscoring the potential of MLLM-empowered mobile assistants.
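The division of labor described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`GlobalAgent`, `LocalAgent`, `Action`, `needs_retry`, `run_task`, `toy_planner`, `toy_policy`) is hypothetical, the planner and policy are plain callables standing in for multimodal LLM calls, and the reflection step is reduced to a simple retry decision.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """A function call predicted by the low-level agent (hypothetical shape)."""
    name: str
    args: dict


@dataclass
class GlobalAgent:
    """High level: interprets the command, keeps history, plans sub-tasks."""
    planner: Callable[[str, list], list]
    history: list = field(default_factory=list)

    def plan(self, command: str) -> list:
        subtasks = self.planner(command, self.history)
        self.history.append(f"plan: {subtasks}")
        return subtasks


@dataclass
class LocalAgent:
    """Low level: turns one sub-task plus shared history into one action."""
    policy: Callable[[str, list], Action]

    def act(self, subtask: str, history: list) -> Action:
        return self.policy(subtask, history)


def needs_retry(succeeded: bool, attempt: int, max_retries: int) -> bool:
    """Stand-in for reflection (assumption): retry a failed step while budget remains."""
    return (not succeeded) and attempt < max_retries


def run_task(command, ga, la, execute, max_retries=1) -> bool:
    """Plan with the GA, act with the LA, and record every outcome in GA history."""
    for subtask in ga.plan(command):
        attempt = 0
        while True:
            action = la.act(subtask, ga.history)
            succeeded = execute(action)
            status = "ok" if succeeded else "failed"
            ga.history.append(f"{subtask}: {action.name} -> {status}")
            if succeeded:
                break
            if not needs_retry(succeeded, attempt, max_retries):
                return False  # give up on the whole task
            attempt += 1
    return True


def toy_planner(command: str, history: list) -> list:
    """Toy planner (assumption): split the command on the word 'then'."""
    return [part.strip() for part in command.split(" then ") if part.strip()]


def toy_policy(subtask: str, history: list) -> Action:
    """Toy policy (assumption): every sub-task becomes a single 'tap' call."""
    return Action(name="tap", args={"target": subtask})
```

Calling `run_task("open settings then enable wifi", GlobalAgent(toy_planner), LocalAgent(toy_policy), execute)` with an `execute` callback that performs the action and returns a success flag walks both sub-tasks in order. In a real system the planner, the policy, and the retry decision would each be model-driven.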
Oct-17-2024
- Country:
  - North America > United States
    - District of Columbia > Washington (0.04)
    - New York > New York County
      - New York City (0.04)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - Colorado > Denver County
      - Denver (0.04)
    - California > San Francisco County
      - San Francisco (0.14)
  - Europe
    - Germany > Hamburg (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
  - Asia
    - Thailand > Bangkok
      - Bangkok (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
    - China
      - Shanghai > Shanghai (0.04)
      - Jiangsu Province > Nanjing (0.04)
      - Beijing > Beijing (0.04)
      - Sichuan Province > Chengdu (0.04)
      - Guangdong Province > Shenzhen (0.04)
      - Hong Kong (0.04)
      - Guangxi Province > Nanning (0.04)
- Genre:
  - Workflow (0.93)
- Industry:
  - Media (0.93)
  - Health & Medicine (0.67)
  - Consumer Products & Services > Travel (0.67)
  - Transportation
- Technology: