Collaborating Authors

 Siddiqui, Tarique


Sibyl: Forecasting Time-Evolving Query Workloads

arXiv.org Artificial Intelligence

Database systems often rely on historical query traces to perform workload-based performance tuning. However, real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads. To address this challenge, we propose Sibyl, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries, with the entire query statements, in various prediction windows. Drawing insights from real workloads, we propose template-based featurization techniques and develop a stacked-LSTM with an encoder-decoder architecture for accurate forecasting of query workloads. We also develop techniques to improve forecasting accuracy over large prediction ...

For workload-based optimization, the input workload plays a crucial role and needs to be a good representation of the expected workload. Traditionally, historical query traces have been used as input workloads with the assumption that workloads are mostly static. However, as we discuss in Section 2, many real workloads exhibit highly recurring query structures with changing patterns in both their arrival intervals and data accesses. For instance, query templates are often shared across users, teams, and applications, but may be customized with different parameter values to access varying data at different points in time. Consider a log analysis query that reports errors for different devices and error types: "SELECT * FROM T WHERE deviceType = ? AND errorType = ? AND eventDate BETWEEN ? ...
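The notion of a query template in the excerpt above can be illustrated with a small sketch: literal values are replaced by `?` placeholders so that queries differing only in their parameters collapse to the same template. This is an illustration under simplifying assumptions (regex-based literal matching, no real SQL parsing), not Sibyl's actual featurization; the names `to_template` and `group_by_template` and the sample queries are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: literals are single-quoted strings or plain numbers.
# A real system would use a SQL parser instead of a regex.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def to_template(query):
    """Replace each literal with '?' and return (template, parameters)."""
    params = []

    def grab(match):
        params.append(match.group(0))  # keep the literal in positional order
        return "?"

    return _LITERAL.sub(grab, query), params


def group_by_template(queries):
    """Map each template to the list of parameter bindings seen for it."""
    groups = defaultdict(list)
    for query in queries:
        template, params = to_template(query)
        groups[template].append(params)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical workload: two queries share a template, one does not.
    workload = [
        "SELECT * FROM T WHERE deviceType = 'phone' AND errorType = 'crash'",
        "SELECT * FROM T WHERE deviceType = 'tablet' AND errorType = 'hang'",
        "SELECT * FROM T WHERE retries > 3",
    ]
    for template, bindings in group_by_template(workload).items():
        print(template, bindings)
```

Grouping by template separates the recurring query structure from the parameter values that change over time, which is the distinction the excerpt draws between shared templates and customized parameters.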


ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges

arXiv.org Artificial Intelligence

The scale and complexity of workloads in modern cloud services have brought into sharper focus a critical challenge in automated index tuning -- the need to recommend high-quality indexes while maintaining index tuning scalability. This challenge is further compounded by the requirement for automated index implementations to introduce minimal query performance regressions in production deployments, representing a significant barrier to achieving scalability and full automation. This paper directs attention to these challenges within automated index tuning and explores ways in which machine learning (ML) techniques provide new opportunities in their mitigation. In particular, we reflect on recent efforts in developing ML techniques for workload selection, candidate index filtering, speeding up index configuration search, reducing the amount of query optimizer calls, and lowering the chances of performance regressions. We highlight the key takeaways from these efforts and underline the gaps that need to be closed for their effective functioning within the traditional index tuning framework. Additionally, we present a preliminary cross-platform design aimed at democratizing index tuning across multiple SQL-like systems -- an imperative in today's continuously expanding data system landscape. We believe our findings will help provide context and impetus to the research and development efforts in automated index tuning.