Deploying Foundation Model Powered Agent Services: A Survey
Xu, Wenchao, Chen, Jinyu, Zheng, Peirong, Yi, Xiaoquan, Tian, Tianyi, Zhu, Wenhui, Wan, Quan, Wang, Haozhao, Fan, Yunfeng, Su, Qinliang, Shen, Xuemin
arXiv.org Artificial Intelligence
Foundation model (FM) powered agent services are regarded as a promising way to build intelligent, personalized applications on the path toward Artificial General Intelligence (AGI). Deploying these agent services with high reliability and scalability requires jointly optimizing computational and communication resources, so that resources are allocated effectively and services are delivered seamlessly. In pursuit of this vision, this paper proposes a unified framework that organizes a comprehensive survey on deploying FM-based agent services across heterogeneous devices, with an emphasis on integrating model and resource optimization to establish a robust infrastructure for these services. In particular, the paper begins by exploring low-level optimization strategies during inference and studies approaches that enhance system scalability, such as parallelism techniques and resource scaling methods. The paper then discusses several prominent FMs and investigates research on inference acceleration, including model compression and token reduction. It also examines the critical components for constructing agent services and highlights notable intelligent applications. Finally, the paper presents potential research directions for developing real-time agent services with high Quality of Service (QoS).
Dec-17-2024
- Country:
  - Asia > China (0.67)
  - North America (0.45)
- Genre:
  - Overview (1.00)
  - Research Report > Promising Solution (1.00)
- Industry:
  - Education (0.93)
  - Energy (1.00)
  - Health & Medicine (1.00)
  - Information Technology
- Technology:
  - Information Technology
    - Architecture > Real Time Systems (1.00)
    - Artificial Intelligence
      - Cognitive Science > Problem Solving (1.00)
      - Machine Learning > Neural Networks
        - Deep Learning (1.00)
      - Natural Language
        - Chatbot (1.00)
        - Large Language Model (1.00)
      - Representation & Reasoning
        - Agents (1.00)
        - Optimization (1.00)
        - Planning & Scheduling (1.00)
        - Search (1.00)
      - Vision (1.00)
    - Communications > Networks (0.92)
    - Data Science > Data Mining (0.92)
    - Information Technology