AITopics | sstable

Collaborating Authors

sstable

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learned LSM-trees: Two Approaches Using Learned Bloom Filters

Fidalgo, Nicholas, Ye, Puyuan

arXiv.org Artificial IntelligenceAug-5-2025

Modern key-value stores rely heavily on Log-Structured Merge (LSM) trees for write optimization, but this design introduces significant read amplification. Auxiliary structures like Bloom filters help, but impose memory costs that scale with tree depth and dataset size. Recent advances in learned data structures suggest that machine learning models can augment or replace these components, trading handcrafted heuristics for data-adaptive behavior. In this work, we explore two approaches for integrating learned predictions into the LSM-tree lookup path. The first uses a classifier to selectively bypass Bloom filter probes for irrelevant levels, aiming to reduce average-case query latency. The second replaces traditional Bloom filters with compact learned models and small backup filters, targeting memory footprint reduction without compromising correctness. We implement both methods atop a Monkey-style LSM-tree with leveled compaction, per-level Bloom filters, and realistic workloads. Our experiments show that the classifier reduces GET latency by up to 2.28x by skipping over 30% of Bloom filter checks with high precision, though it incurs a modest false-negative rate. The learned Bloom filter design achieves zero false negatives and retains baseline latency while cutting memory usage per level by 70-80%. Together, these designs illustrate complementary trade-offs between latency, memory, and correctness, and highlight the potential of learned index components in write-optimized storage systems.

artificial intelligence, bloom filter, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.00882

Country: Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

DobLIX: A Dual-Objective Learned Index for Log-Structured Merge Trees

Heidari, Alireza, Ahmadi, Amirhossein, Zhang, Wei

arXiv.org Artificial IntelligenceFeb-7-2025

In this paper, we introduce DobLIX, a dual-objective learned index specifically designed for Log-Structured Merge(LSM) tree-based key-value stores. Although traditional learned indexes focus exclusively on optimizing index lookups, they often overlook the impact of data access from storage, resulting in performance bottlenecks. DobLIX addresses this by incorporating a second objective, data access optimization, into the learned index training process. This dual-objective approach ensures that both index lookup efficiency and data access costs are minimized, leading to significant improvements in read performance while maintaining write efficiency in real-world LSM-tree systems. Additionally, DobLIX features a reinforcement learning agent that dynamically tunes the system parameters, allowing it to adapt to varying workloads in real-time. Experimental results using real-world datasets demonstrate that DobLIX reduces indexing overhead and improves throughput by 1.19 to 2.21 times compared to state-of-the-art methods within RocksDB, a widely used LSM-tree-based storage engine.

machine learning, reinforcement learning, throughput, (20 more...)

arXiv.org Artificial Intelligence

2502.05369

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands (0.04)
Asia (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Algorithms Behind Modern Storage Systems

Communications of the ACMJul-23-2018, 20:35:47 GMT

Developing storage systems always presents the same challenges and factors to consider. Deciding what to optimize for has a substantial influence on the result. You can spend more time during write in order to lay out structures for more efficient reads, reserve extra space for in-place updates, facilitate faster writes, and buffer data in memory to ensure sequential write operations. It is impossible, however, to do this all at once. An ideal storage system would have the lowest read cost, lowest write cost, and no overhead.

artificial intelligence, node, sstable, (18 more...)

Communications of the ACM

Country: North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)

Add feedback