


Fast Partitioned Learned Bloom Filter

Neural Information Processing Systems




Cascaded Learned Bloom Filter for Optimal Model-Filter Size Balance and Fast Rejection

Sato, Atsuki, Matsui, Yusuke

arXiv.org Artificial Intelligence

Recent studies have demonstrated that learned Bloom filters, which combine machine learning with the classical Bloom filter, can achieve superior memory efficiency. However, existing learned Bloom filters face two critical unresolved challenges: the balance between the machine learning model size and the Bloom filter size is not optimal, and the reject time cannot be minimized effectively. We propose the Cascaded Learned Bloom Filter (CLBF) to address these issues. Our dynamic programming-based optimization automatically selects configurations that achieve an optimal balance between the model and filter sizes while minimizing reject time. Experiments on real-world datasets show that CLBF reduces memory usage by up to 24% and decreases reject time by up to 14 times compared to state-of-the-art learned Bloom filters.

Despite numerous attempts to further improve memory efficiency (Mitzenmacher, 2018; Dai & Shrivastava, 2020), existing LBFs face two critical unresolved issues: (1) the balance between the machine learning model size and the Bloom filter size remains suboptimal, and (2) the reject time cannot be effectively minimized.

(1) The Balance Between Machine Learning Model Size and Bloom Filter Size: Existing LBFs lack mechanisms to automatically balance the sizes of the machine learning model and the Bloom filters. An LBF consists of a machine learning model and one or more Bloom filters, aiming to minimize the total memory usage, i.e., the sum of the memory consumed by the machine learning model and the Bloom filters. Since a smaller machine learning model often--but not always--has lower accuracy, larger Bloom filters are needed to maintain the overall false positive rate of an LBF, whereas a larger model often--but not always--allows for smaller Bloom filters. Thus, it is a challenging task to strike ...
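The model-plus-backup-filter structure described above can be sketched in a few lines. This is a minimal illustration, not the paper's CLBF: the `score_fn` argument stands in for a trained model, and the class names and sizes are made up for the example. The key invariant it shows is that every key the model rejects must be inserted into the backup Bloom filter, so the combined structure has no false negatives while its total memory is the model size plus the filter size.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter (illustrative; one byte per bit, not optimized)."""

    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        # Derive k hash positions from a cryptographic hash of (salt, item).
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def query(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[p] for p in self._positions(item))


class LearnedBloomFilter:
    """Illustrative LBF: a score model plus a backup Bloom filter holding
    the keys the model fails to accept, so the whole structure has no
    false negatives."""

    def __init__(self, keys, score_fn, threshold, backup_bits=1024, num_hashes=3):
        self.score_fn = score_fn  # stand-in for a trained model: item -> [0, 1]
        self.threshold = threshold
        self.backup = BloomFilter(backup_bits, num_hashes)
        # Keys rejected by the model go into the backup filter; a less
        # accurate (smaller) model rejects more keys, forcing a larger filter.
        for k in keys:
            if score_fn(k) < threshold:
                self.backup.add(k)

    def query(self, item):
        # Accept immediately if the model's score clears the threshold;
        # otherwise fall back to the backup Bloom filter.
        if self.score_fn(item) >= self.threshold:
            return True
        return self.backup.query(item)
```

A toy usage: with `score_fn = lambda s: 1.0 if s.startswith("key") else 0.0`, keys named `odd0`, `odd1`, ... are rejected by the "model" and land in the backup filter, yet every inserted key still answers `True`. The size trade-off the paper targets is visible here: a weaker score function pushes more keys into `backup`, which must then grow (more bits) to hold its false positive rate.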