Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows

Yuting Yang, Andrea Merlina, Weijia Song, Tiancheng Yuan, Ken Birman, Roman Vitenberg

arXiv.org Artificial Intelligence 

We consider ML query processing in distributed systems where GPU-enabled workers coordinate to execute complex queries: a computing style often seen in applications that interact with users in support of image processing and natural language processing. In such systems, coscheduling of GPU memory management and task placement represents a promising opportunity. We propose Compass, a novel framework that unifies these functions to reduce job latency while using resources efficiently, placing tasks where data dependencies will be satisfied, collocating tasks from the same job (when this will not overload the host or its GPU), and efficiently managing GPU memory. Comparison with other state of the art schedulers shows a significant reduction in completion times.

Yet intelligent edge applications differ from cloud microservices in important ways, so we cannot just use the same techniques employed in web frameworks. Whereas the outer tiers of today's cloud are dominated by lightweight, stateless, containerized applications that can be upscaled or downscaled at low cost, ML depends on large objects (hyperparameters, model parameters, and supporting databases) and often entails hardware-accelerated computation using devices preconfigured with the proper firmware. When shifting a task to a device that has not previously run it, computation cannot begin until all the prerequisites are in place. We can and do launch new ML instances when additional capacity is needed, but scheduling strategies must evolve to avoid thrashing.
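The placement policy sketched in the abstract (prefer workers that already hold a task's data dependencies, collocate tasks from the same job, and never overload a GPU) can be illustrated with a small scoring heuristic. This is a minimal sketch under assumed data structures; the class and function names (`Worker`, `place_task`) and the weights are hypothetical, not Compass's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    """Hypothetical GPU-enabled worker; fields are illustrative only."""
    name: str
    cached_objects: set = field(default_factory=set)  # model params, databases already resident
    gpu_free_mb: int = 0                              # free GPU memory
    running_jobs: set = field(default_factory=set)    # jobs with tasks already placed here

def place_task(task_deps, task_mem_mb, job_id, workers):
    """Pick a worker: reward satisfied data dependencies and same-job
    collocation, but skip any worker whose GPU memory would be overloaded."""
    best, best_score = None, float("-inf")
    for w in workers:
        if w.gpu_free_mb < task_mem_mb:
            continue  # placing here would overload the GPU and risk thrashing
        locality = len(task_deps & w.cached_objects)  # prerequisites already in place
        collocate = 1 if job_id in w.running_jobs else 0
        score = 2 * locality + collocate              # illustrative weights
        if score > best_score:
            best, best_score = w, score
    return best
```

For example, a worker that already caches a task's model parameters outscores an idle worker that merely runs a sibling task of the same job, since avoiding a large object transfer usually saves more latency than collocation.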
