MatFormer: Nested Transformer for Elastic Inference
Devvrit, Aditya Kusupati, Tim Dettmers
Neural Information Processing Systems
Foundation models are applied in a broad spectrum of settings with different inference constraints, from massive multi-accelerator clusters to resource-constrained standalone mobile devices. However, the substantial costs associated with training these models often limit the number of unique model sizes that can be offered. Consequently, practitioners are compelled to select a model that may not be optimally aligned with their specific latency and cost requirements.
Jun-2-2025, 09:32:18 GMT
- Country:
- North America > United States
- California (0.14)
- Texas (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Government (0.46)
- Technology:
- Information Technology
- Artificial Intelligence
- Cognitive Science (0.68)
- Machine Learning > Neural Networks
- Deep Learning (1.00)
- Natural Language
- Chatbot (0.93)
- Large Language Model (1.00)
- Representation & Reasoning (1.00)
- Vision (0.68)
- Communications (0.87)