Simba
Package-level integration using multi-chip modules (MCMs) is a promising approach for building large-scale systems. Compared to a large monolithic die, an MCM combines many smaller chiplets into a larger system, substantially reducing fabrication and design costs. Current MCMs typically contain only a handful of coarse-grained large chiplets due to the high area, performance, and energy overheads associated with inter-chiplet communication. This work investigates and quantifies the costs and benefits of using MCMs with fine-grained chiplets for deep learning inference, an application domain with large compute and on-chip storage requirements. To evaluate the approach, we architected, implemented, fabricated, and tested Simba, a 36-chiplet prototype MCM system for deep learning inference. Each chiplet achieves 4 TOPS peak performance, and the 36-chiplet MCM package achieves up to 128 TOPS and up to 6.1 TOPS/W. The MCM is configurable to support a flexible mapping of DNN layers to the distributed compute and storage units. To mitigate inter-chiplet communication overheads, we introduce three tiling optimizations that improve data locality. These optimizations achieve up to 16% speedup compared to the baseline layer mapping. Our evaluation shows that Simba can process 1988 images/s running ResNet-50 with a batch size of one, delivering an inference latency of 0.50 ms.

Deep learning (DL) has become critical for addressing complex real-world problems.
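The throughput and latency figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes that, at a batch size of one, images are processed strictly one at a time, so per-image latency is the reciprocal of throughput. That assumption and the helper name `latency_ms_from_throughput` are illustrative and do not come from the source.

```python
def latency_ms_from_throughput(images_per_second: float) -> float:
    """Return per-image latency in milliseconds.

    Assumption (not stated in the source): one image is in flight at a
    time, so latency is simply 1 / throughput.
    """
    if images_per_second <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / images_per_second


# Reported ResNet-50 throughput at batch size one.
latency = latency_ms_from_throughput(1988.0)
print(f"{latency:.2f} ms")  # prints "0.50 ms"
```

Under this assumption, 1988 images/s corresponds to roughly 0.503 ms per image, which agrees with the reported 0.50 ms.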
May-25-2021, 15:40:12 GMT
- Country:
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - North America
    - United States
      - District of Columbia > Washington (0.04)
      - Texas > Travis County
        - Austin (0.05)
      - Nevada > Clark County
        - Las Vegas (0.04)
      - Ohio > Franklin County
        - Columbus (0.04)
      - Utah > Salt Lake County
        - Salt Lake City (0.04)
      - North Carolina
        - Durham County > Durham (0.05)
        - Wake County > Raleigh (0.04)
      - Massachusetts > Middlesex County
      - Wisconsin > Dane County
        - Madison (0.04)
      - California
        - San Francisco County > San Francisco (0.14)
        - Los Angeles County > Los Angeles (0.14)
        - Alameda County > Berkeley (0.14)
        - Santa Clara County
          - Santa Clara (0.05)
          - Stanford (0.04)
          - San Jose (0.04)
          - Palo Alto (0.04)
      - New York > New York County
        - New York City (0.05)
    - Canada > Ontario
      - Toronto (0.05)
  - Asia
    - South Korea > Seoul
      - Seoul (0.04)
    - Japan > Honshū
      - Kansai > Kyoto Prefecture > Kyoto (0.04)
- Genre:
  - Research Report (0.34)
- Industry:
  - Information Technology (0.51)
- Technology: