MERLIN: Multi-stagE query performance prediction for dynamic paRallel oLap pIpeliNe
Zhang, Kaixin, Wang, Hongzhi, Gu, Kunkai, Li, Ziqi, Zhao, Chunyu, Li, Yingze, Yan, Yu
arXiv.org Artificial Intelligence
High-performance OLAP database technology has emerged with the growing demand for massive data analysis. To achieve higher performance, many DBMSs adopt sophisticated designs, including SIMD operators, parallel execution, and dynamic pipeline modification. However, such advanced OLAP query execution mechanisms still lack targeted Query Performance Prediction (QPP) methods, because most existing methods target conventional tree-shaped query plans and static serial executors. To address this problem, we propose MERLIN, a multi-stage query performance prediction method for high-performance OLAP DBMSs. MERLIN first establishes a resource cost model for each physical operator. Then, it constructs a DAG that consists of a data-flow tree backbone and resource competition relationships among concurrent operators. After a GAT with an extra attention mechanism calibrates the cost, the cost vector tree is extracted and summarized by a TCN, ultimately enabling effective query performance prediction. Experimental results demonstrate that MERLIN yields higher prediction precision than existing methods.
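The graph-construction stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Operator` class, its `cost` and `pipeline` fields, and the rule that operators sharing a pipeline stage compete for resources are all assumptions made for illustration. The sketch only shows how a data-flow tree can be extended with extra edges between concurrent operators.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Operator:
    """Hypothetical physical operator (not from the paper)."""
    name: str
    cost: tuple[float, float]   # assumed (cpu, memory) cost estimate
    pipeline: int               # assumed id of the pipeline stage running it
    children: list["Operator"] = field(default_factory=list)


def build_graph(root):
    """Return (nodes, data_flow_edges, competition_edges) for a plan tree."""
    nodes, flow = [], []
    stack = [root]
    while stack:
        op = stack.pop()
        nodes.append(op)
        for child in op.children:
            flow.append((child.name, op.name))  # data flows child -> parent
            stack.append(child)
    # Assumption: operators in the same pipeline stage run concurrently
    # and therefore compete for resources.
    competition = [
        (a.name, b.name)
        for a, b in combinations(nodes, 2)
        if a.pipeline == b.pipeline
    ]
    return nodes, flow, competition


# Two scans feed a join; the scans share pipeline stage 0.
scan_a = Operator("scan_a", (1.0, 0.2), pipeline=0)
scan_b = Operator("scan_b", (2.0, 0.3), pipeline=0)
join = Operator("join", (3.0, 1.0), pipeline=1, children=[scan_a, scan_b])
nodes, flow, competition = build_graph(join)
```

In this toy plan the two scans are linked by a competition edge, while the join is connected to them only through data-flow edges. A graph neural network such as the GAT mentioned in the abstract would then operate on both edge sets.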
Dec-1-2024
- Country:
  - Asia
    - China > Heilongjiang Province
      - Harbin (0.04)
    - Middle East > Jordan (0.04)
  - Europe
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - Sweden > Uppsala County
      - Uppsala (0.04)
  - North America > United States
    - Arizona > Maricopa County
      - Phoenix (0.04)
    - California
      - Los Angeles County > Long Beach (0.04)
      - Santa Clara County > Santa Clara (0.04)
    - District of Columbia > Washington (0.04)
    - Hawaii (0.04)
    - New York > New York County
      - New York City (0.04)
- Genre:
  - Research Report > New Finding (0.34)
- Technology: