radix tree
- Europe > Austria > Vienna (0.14)
- North America > United States > Virginia (0.04)
- North America > United States > Texas (0.04)
- (3 more...)
- Europe > Austria > Vienna (0.14)
- North America > United States > Virginia (0.04)
- North America > United States > Texas (0.04)
- (4 more...)
Efficiently Programming Large Language Models using SGLang
Zheng, Lianmin, Yin, Liangsheng, Xie, Zhiqiang, Huang, Jeff, Sun, Chuyue, Yu, Cody Hao, Cao, Shiyi, Kozyrakis, Christos, Stoica, Ion, Gonzalez, Joseph E., Barrett, Clark, Sheng, Ying
Large language models (LLMs) are increasingly used for complex tasks requiring multiple chained generation calls, advanced prompting techniques, control flow, and interaction with external environments. However, efficient systems for programming and executing these applications are lacking. To bridge this gap, we introduce SGLang, a Structured Generation Language for LLMs. SGLang is designed for the efficient programming of LLMs and incorporates primitives for common LLM programming patterns. We have implemented SGLang as a domain-specific language embedded in Python, and we developed an interpreter, a compiler, and a high-performance runtime for SGLang. These components work together to enable optimizations such as parallelism, batching, caching, sharing, and other compilation techniques. Additionally, we propose RadixAttention, a novel technique that maintains a Least Recently Used (LRU) cache of the Key-Value (KV) cache for all requests in a radix tree, enabling automatic KV cache reuse across multiple generation calls at runtime. SGLang simplifies the writing of LLM programs and boosts execution efficiency. Our experiments demonstrate that SGLang can speed up common LLM tasks by up to 5x, while reducing code complexity and enhancing control.
- North America > United States > Virginia (0.04)
- North America > United States > Texas (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
A Local-Pattern Related Look-Up Table
Shih, Chung-Chin, Wei, Ting Han, Wu, Ti-Rong, Wu, I-Chen
This paper describes a Relevance-Zone pattern table (RZT) that can be used to replace a traditional transposition table. An RZT stores exact game values for patterns that are discovered during a Relevance-Zone-Based Search (RZS), which is the current state-of-the-art in solving L&D problems in Go. Positions that share the same pattern can reuse the same exact game value in the RZT. The pattern matching scheme for RZTs is implemented using a radix tree, taking into consideration patterns with different shapes. To improve the efficiency of table lookups, we designed a heuristic that prevents redundant lookups. The heuristic can safely skip previously queried patterns for a given position, reducing the overhead to 10% of the original cost. We also analyze the time complexity of the RZT both theoretically and empirically. Experiments show the overhead of traversing the radix tree in practice during lookup remain flat logarithmically in relation to the number of entries stored in the table. Experiments also show that the use of an RZT instead of a traditional transposition table significantly reduces the number of searched nodes on two data sets of 7x7 and 19x19 L&D Go problems.
- Asia > China (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts (0.04)
- (2 more...)
Efficient Operations On MDDs for Building Constraint Programming Models
Perez, Guillaume (University Nice Sophia Antipolis) | Régin, Jean-Charles (University Nice-Sophia Antipolis)
For instance, phrase generation problem involves domains having more than d 10, 000 values. Thus, We propose improved algorithms for defining the we cannot use an algorithm whose time or space complexity most common operations on Multi-Valued Decision is mainly based on Ω(nd), where n is the number of nodes Diagrams (MDDs): creation, reduction, complement, of the MDDs. Therefore, we need to improve the algorithms intersection, union, difference, symmetric performing the main operations on MDDs: creation, reduction difference, complement of union and complement and combinations. of intersection. Then, we show that with these algorithms The new creation algorithm we propose, exploits the origin and thanks to the recent development of an of the definition of the MDD. If the MDD represents an automaton efficient algorithm establishing arc consistency for (like with a regular constraint) or a repeated pattern MDD based constraints (MDD4R), we can simply (like with dynamic programming), then its creation may be solve some problems by modeling them as a set of sped-up.