Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation
Ding, Yaoyao; Hou, Bohan; Zhang, Xiao; Lin, Allan; Chen, Tianqi; Hao, Cody Yu; Wang, Yida; Pekhimenko, Gennady
arXiv.org Artificial Intelligence
Serving Large Language Models (LLMs) is critical for AI-powered applications, yet it demands substantial computational resources, particularly memory bandwidth and computational throughput. Low-precision computation has emerged as a key technique for improving efficiency while reducing resource consumption. Existing approaches for generating low-precision kernels are limited to weight bit widths that are powers of two, and they perform suboptimally because they rely on high-level GPU programming abstractions. These abstractions restrict critical optimizations, such as fine-grained register management and optimized memory access patterns, that are essential for efficient low-precision computation. In this paper, we introduce Tilus, a domain-specific language for General-Purpose GPU (GPGPU) computing that supports low-precision data types with arbitrary bit widths from 1 to 8 while maintaining GPU programmability. Tilus features a thread-block-level programming model, a hierarchical memory space, a novel algebraic layout system, and extensive support for diverse low-precision data types. Tilus programs are compiled into highly efficient GPU programs through automatic vectorization and instruction selection. Extensive experiments demonstrate that Tilus efficiently supports a full spectrum of low-precision data types and outperforms state-of-the-art low-precision kernels. Compared to the existing compilers Triton and Ladder and the hand-optimized kernels QuantLLM and Marlin, Tilus achieves speedups of $1.75\times$, $2.61\times$, $1.29\times$, and $1.03\times$, respectively. We open-source Tilus at https://github.com/NVIDIA/tilus.
Sep-3-2025
- Country:
- Europe (1.00)
- Asia (1.00)
- North America
- United States > New York
- New York County > New York City (0.15)
- Canada > Ontario
- Toronto (0.15)
- Genre:
- Research Report (0.82)
- Industry:
- Information Technology (0.50)
- Technology: