META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI

Sun, Liangtai, Chen, Xingyu, Chen, Lu, Dai, Tianle, Zhu, Zichen, Yu, Kai

Nov-24-2022–arXiv.org Artificial Intelligence

Task-oriented dialogue (TOD) systems are widely used by intelligent assistants on mobile phones to accomplish tasks such as calendar scheduling or hotel reservation. Current TOD systems usually focus on multi-turn text or speech interaction and then call back-end APIs designed for TODs to perform the task. However, this API-based architecture greatly limits the information-searching capability of intelligent assistants and may even lead to task failure if TOD-specific APIs are not available or the task is too complicated to be executed by the provided APIs. In this paper, we propose a new TOD architecture: the GUI-based task-oriented dialogue system (GUI-TOD). A GUI-TOD system can directly perform GUI operations on real apps and execute tasks without invoking TOD-specific back-end APIs. Furthermore, we release META-GUI, a dataset for training a Multi-modal convErsaTional Agent on mobile GUI. We also propose a multi-modal action prediction and response model, which shows promising results on META-GUI. The dataset, code, and leaderboard are publicly available.
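The GUI-TOD idea described above, where the agent acts on the app's interface instead of calling a task-specific API, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the action names (`click`, `input`, `respond`, `end`), the `FakeScreen` stand-in, and the scripted policy are hypothetical and are not taken from the META-GUI dataset or the authors' model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GuiAction:
    # Hypothetical action vocabulary: "click", "input", "respond", "end".
    kind: str
    target: str = ""
    text: str = ""


@dataclass
class FakeScreen:
    """Toy stand-in for a real app screen; it only records performed actions."""
    log: List[GuiAction] = field(default_factory=list)

    def perform(self, action: GuiAction) -> None:
        self.log.append(action)


Policy = Callable[[str, List[GuiAction]], GuiAction]


def run_turn(utterance: str, policy: Policy, screen: FakeScreen,
             max_steps: int = 10) -> str:
    """Apply GUI actions chosen by the policy until it responds or ends."""
    history: List[GuiAction] = []
    for _ in range(max_steps):
        action = policy(utterance, history)
        if action.kind in ("respond", "end"):
            return action.text
        screen.perform(action)
        history.append(action)
    return ""  # step budget exhausted without a response


def scripted_policy(utterance: str, history: List[GuiAction]) -> GuiAction:
    """Placeholder for a learned model: replays a fixed action script."""
    script = [
        GuiAction("click", target="search_box"),
        GuiAction("input", target="search_box", text="hotel near airport"),
        GuiAction("respond", text="I found some hotels near the airport."),
    ]
    return script[len(history)]


screen = FakeScreen()
reply = run_turn("Find a hotel near the airport", scripted_policy, screen)
```

In a real system the scripted policy would be replaced by a model that conditions on the dialogue and the current screen, and `FakeScreen.perform` would drive an actual device.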

artificial intelligence, dialogue, natural language

Country:
- Pacific Ocean > North Pacific Ocean
  - San Francisco Bay (0.04)
- North America > United States
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
  - California > San Francisco County
    - San Francisco (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
  - China > Shanghai
    - Shanghai (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology
  - Graphics (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Personal Assistant Systems (1.00)
    - Natural Language (1.00)
