Piecewise Linear Approximation in Learned Index Structures: Theoretical and Empirical Analysis
Qin, Jiayong, Zhu, Xianyu, Liu, Qiyu, Zhang, Guangyi, Cai, Zhigang, Liao, Jianwei, Hu, Sha, Peng, Jingshu, Shao, Yingxia, Chen, Lei
–arXiv.org Artificial Intelligence
A growing trend in the database and system communities is to augment conventional index structures, such as B+-trees, with machine learning (ML) models. Among these, error-bounded Piecewise Linear Approximation ($ε$-PLA) has emerged as a popular choice due to its simplicity and effectiveness. Despite its central role in many learned indexes, the design and analysis of $ε$-PLA fitting algorithms remain underexplored. In this paper, we revisit $ε$-PLA from both theoretical and empirical perspectives, with a focus on its application in learned index structures. We first establish a fundamentally improved lower bound of $Ω(κ\cdot ε^2)$ on the expected segment coverage for existing $ε$-PLA fitting algorithms, where $κ$ is a data-dependent constant. We then present a comprehensive benchmark of state-of-the-art $ε$-PLA algorithms when used in different learned data structures. Our results highlight key trade-offs among model accuracy, model size, and query performance, providing actionable guidelines for the principled design of future learned data structures.
arXiv.org Artificial Intelligence
Jun-26-2025
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom
- Genre:
- Research Report > New Finding (0.88)
- Technology: