Robust Learning-Augmented Dictionaries
Ali Zeynali, Shahin Kamali, Mohammad Hajiesmaili
We present the first learning-augmented data structure for implementing dictionaries with optimal consistency and robustness. Our data structure, named RobustSL, is a skip list augmented by predictions of access frequencies of elements in a data sequence. With accurate predictions, RobustSL has optimal consistency (it achieves static optimality). At the same time, it maintains a logarithmic running time for each operation, ensuring optimal robustness, even if predictions are generated adversarially. Therefore, RobustSL has all the advantages of the recent learning-augmented data structures of Lin, Luo, and Woodruff (ICML 2022) and Cao et al. (arXiv 2023), while providing robustness guarantees that are absent in the previous work. Numerical experiments show that RobustSL outperforms alternative data structures on both synthetic and real datasets.
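The central idea of a frequency-augmented skip list can be sketched as follows: keys with high predicted access frequency get taller towers (so searches reach them in fewer steps), while a hard cap of roughly log2(n) levels keeps every search logarithmic even when predictions are adversarial. This is an illustrative simplification under assumed height and cap rules, not RobustSL's actual construction:

```python
import math

class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # one forward pointer per level

class PredictedSkipList:
    """Skip list whose tower heights are driven by predicted access
    frequencies, capped at ~log2(n) levels for robustness.
    (Illustrative sketch only; RobustSL's real construction differs.)"""

    def __init__(self, items):
        # items: list of (key, predicted_frequency), frequencies summing to ~1
        n = len(items)
        self.max_level = max(1, math.ceil(math.log2(n)) + 1)
        self.head = Node(float("-inf"), self.max_level)
        for key, p in sorted(items):
            self._insert(key, self._height(p, n))

    def _height(self, p, n):
        # Frequent keys get taller towers and are found in fewer steps;
        # the cap guarantees O(log n) search even for bad predictions.
        levels_down = math.ceil(math.log2(1.0 / max(p, 1.0 / n)))
        return max(1, self.max_level - levels_down)

    def _insert(self, key, height):
        update = [self.head] * self.max_level
        node = self.head
        for lvl in range(self.max_level - 1, -1, -1):
            while node.next[lvl] and node.next[lvl].key < key:
                node = node.next[lvl]
            update[lvl] = node
        new = Node(key, height)
        for lvl in range(height):
            new.next[lvl] = update[lvl].next[lvl]
            update[lvl].next[lvl] = new

    def search(self, key):
        node, steps = self.head, 0
        for lvl in range(self.max_level - 1, -1, -1):
            while node.next[lvl] and node.next[lvl].key < key:
                node = node.next[lvl]
                steps += 1
        found = node.next[0]
        return (found is not None and found.key == key), steps
```

With a uniform prediction (p = 1/n for every key), all towers collapse to height 1 plus the head levels and the structure degrades gracefully; with a sharply skewed prediction, hot keys sit near the top lists and are found in O(1) expected hops.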
Learning-Augmented B-Trees
Xinyuan Cao, Jingbang Chen, Li Chen, Chris Lambert, Richard Peng, Daniel Sleator
The development of machine learning has sparked significant interest in its potential to enhance traditional data structures. First proposed by Kraska et al. [KBCDP18], the notion of a learned index has gained much attention since then [KBCDP18; DMYWDLZCGK+20; FV20]. Algorithms with predictions have also been developed for an increasingly wide range of problems, including shortest path [CSVZ22], network flow [PZ22; LMRX20], matching [CSVZ22; DILMV21; CI21], spanning tree [ELMS22], and triangle/cycle counting [CEILNRSWWZ22], with the goal of obtaining algorithms that achieve near-optimal performance when the predictions are good, but also recover prediction-less worst-case behavior when predictions have large errors [MV20]. Regarding the original learned index question, which uses learning to speed up search trees, building data structures tuned to the input sequence has been extensively studied in the field of data structures. Mehlhorn [Meh75a] showed that a nearly optimal static tree can be constructed in linear time when estimates of key frequencies are provided. Extensive work on this topic culminated in the study of dynamic optimality, where tree balancing algorithms (e.g.
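The balancing rule behind Mehlhorn's static construction is easy to sketch: at every step, choose the root so that the total estimated access weight on the two sides is as equal as possible, which places heavy keys near the root. The naive recursion below is quadratic for clarity (Mehlhorn's version achieves linear time with binary search over prefix weights); the function name and tuple representation are our own:

```python
def balanced_bst(keys, weights):
    """Build a near-optimal static BST by choosing each root so that the
    total weight (estimated access frequency) on either side is as
    balanced as possible -- the rule behind Mehlhorn's construction.
    keys must be sorted; returns nested tuples (key, left, right)."""
    def build(lo, hi):
        if lo > hi:
            return None
        total = sum(weights[lo:hi + 1])
        acc, best, best_gap = 0.0, lo, float("inf")
        for r in range(lo, hi + 1):
            left = acc                        # weight strictly left of r
            right = total - weights[r] - acc  # weight strictly right of r
            if abs(left - right) < best_gap:
                best_gap, best = abs(left - right), r
            acc += weights[r]
        return (keys[best], build(lo, best - 1), build(best + 1, hi))
    return build(0, len(keys) - 1)
```

A key with weight w ends up at depth O(log(W/w)), where W is the total weight, which is what makes the expected search cost nearly match the entropy lower bound.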
How to use Binary Search Trees, part 1 (Data Structures and Algorithms)
Abstract: This paper presents a parallel solution, based on the coarse-grained multicomputer (CGM) model and a four-splitting technique, to the optimal binary search tree problem. Knuth's well-known sequential algorithm solves this problem in O(n²) time and space, where n is the number of keys used to build the optimal binary search tree. To parallelize this algorithm on the CGM model, the irregular partitioning technique, which subdivides the dependency graph into subgraphs (blocks) of variable size, was previously proposed to balance two competing goals: minimizing the number of communication rounds and balancing the processor load. However, this technique induces high processor latency (which accounts for most of the global communication time), because varying the block sizes prevents processors from starting to evaluate a block as soon as the data it needs are available. The four-splitting technique proposed in this paper removes this shortcoming by evaluating each block as a sequence of computation and communication steps over four subblocks.
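The sequential baseline being parallelized is Knuth's dynamic program: cost[i][j] holds the minimum expected cost of a BST over keys i..j, and Knuth's monotonicity observation (root[i][j-1] ≤ root[i][j] ≤ root[i+1][j]) shrinks the root search range enough to bring the total time from O(n³) down to O(n²). A minimal sketch (function and variable names are ours):

```python
def optimal_bst_cost(freq):
    """Knuth's O(n^2) dynamic program for the optimal BST.
    freq[i] is the access frequency of key i; returns the minimum
    total weighted search cost (each key's frequency times its depth+1)."""
    n = len(freq)
    # prefix sums of frequencies for O(1) range-weight queries
    pre = [0] * (n + 1)
    for i, f in enumerate(freq):
        pre[i + 1] = pre[i] + f
    cost = [[0] * n for _ in range(n)]
    root = [[0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = freq[i]
        root[i][i] = i
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            w = pre[j + 1] - pre[i]  # every key in i..j adds one level
            best = float("inf")
            # Knuth's bound: root[i][j-1] <= root[i][j] <= root[i+1][j]
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                left = cost[i][r - 1] if r > i else 0
                right = cost[r + 1][j] if r < j else 0
                if left + right < best:
                    best, root[i][j] = left + right, r
            cost[i][j] = best + w
    return cost[0][n - 1]
```

The dependency pattern visible here (cell (i, j) needs the row and column cells of strictly shorter intervals) is exactly the dependency graph that the CGM block-partitioning schemes above must respect.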