Allo: A Programming Model for Composable Accelerator Design
Chen, Hongzheng, Zhang, Niansong, Xiang, Shaojie, Zeng, Zhichen, Dai, Mengjia, Zhang, Zhiru
arXiv.org Artificial Intelligence
Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures in a productive manner. Existing high-level synthesis (HLS) tools often require intrusive source-level changes to attain satisfactory quality of results. Several new accelerator design languages (ADLs) aim to enhance or replace HLS, but their advantages are most evident in relatively simple, single-kernel applications. Existing ADLs prove less effective for realistic hierarchical designs with multiple kernels, even if the design hierarchy is flattened. In this paper, we introduce Allo, a composable programming model for efficient spatial accelerator design. Allo decouples hardware customizations, including compute, memory, communication, and data type, from algorithm specification and encapsulates them as a set of customization primitives. Allo preserves the hierarchical structure of an input program by combining customizations from different functions in a bottom-up, type-safe manner. This approach facilitates holistic optimizations that span function boundaries. We conduct comprehensive experiments on commonly used HLS benchmarks and several realistic deep learning models. Our evaluation shows that Allo outperforms state-of-the-art HLS tools and ADLs on all PolyBench test cases. For the GPT2 model, the inference latency of the Allo-generated accelerator is 1.7x lower than that of the NVIDIA A100 GPU, with 5.4x higher energy efficiency, demonstrating that Allo can handle large-scale designs.
Apr-7-2024
- Country:
- North America
- United States
- District of Columbia > Washington (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- New York > New York County
- New York City (0.06)
- New Mexico > Bernalillo County
- Albuquerque (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Florida > Orange County
- Orlando (0.04)
- Colorado > Denver County
- Denver (0.04)
- California
- San Francisco County > San Francisco (0.28)
- Santa Clara County > Santa Clara (0.04)
- Orange County > Irvine (0.04)
- Los Angeles County > Long Beach (0.04)
- San Diego County
- Monterey County
- Canada
- Quebec > Montreal (0.04)
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- United Kingdom
- Scotland > City of Edinburgh
- Edinburgh (0.04)
- Northern Ireland
- County Down > Belfast (0.04)
- County Antrim > Belfast (0.04)
- England > Greater London
- London (0.04)
- Switzerland > Vaud
- Lausanne (0.04)
- Asia
- China (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- South Korea > Seoul
- Seoul (0.04)
- Middle East > Saudi Arabia
- Riyadh Province > Riyadh (0.04)
- Genre:
- Research Report (0.63)
- Industry:
- Information Technology (0.48)
- Technology: