Fast and Accurate Triangle Counting in Graph Streams Using Predictions
Boldrin, Cristian, Vandin, Fabio
arXiv.org Artificial Intelligence
In this work, we present the first efficient and practical algorithm for estimating the number of triangles in a graph stream using predictions. Our algorithm combines waiting room sampling and reservoir sampling with a predictor for the heaviness of edges, that is, the number of triangles in which an edge is involved. As a result, our algorithm is fast, provides guarantees on the amount of memory used, and exploits the additional information provided by the predictor to produce highly accurate estimates. We also propose a simple and domain-independent predictor, based on the degree of nodes, that can be easily computed with one pass on a stream of edges when the stream is available beforehand. Our analytical results show that, when the predictor provides useful information on the heaviness of edges, it leads to estimates with reduced variance compared to the state-of-the-art, even when the predictions are far from perfect. Our experimental results show that, when analyzing a single graph stream, our algorithm is faster than the state-of-the-art for a given memory budget, while providing significantly more accurate estimates. Even more interestingly, when sequences of hundreds of graph streams are analyzed, our algorithm significantly outperforms the state-of-the-art using our simple degree-based predictor built by analyzing only the first graph of the sequence.
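The two ingredients named in the abstract, a degree-based heaviness predictor and a sampling-based estimator, can be illustrated with a minimal sketch. This is not the authors' algorithm. It omits the waiting room, and it keeps every edge whose predicted heaviness reaches a fixed `threshold`, so unlike the method described above it does not bound memory. Scoring an edge by the smaller of its two endpoint degrees is an assumption, since the abstract only says the predictor is based on node degrees. The names `degree_predictor`, `estimate_triangles`, `threshold`, and `k` are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def degree_predictor(edges):
    """One pass over a previously available stream: count node degrees."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Assumption: predicted heaviness = smaller endpoint degree.
    return lambda u, v: min(degree.get(u, 0), degree.get(v, 0))


def estimate_triangles(edges, predict, threshold, k, seed=0):
    """Estimate the triangle count of a stream of distinct undirected edges.

    Edges with predict(u, v) >= threshold are always stored ("heavy").
    All other edges ("light") share a uniform reservoir of k >= 2 slots.
    Assumes no self-loops and no repeated edges.
    """
    rng = random.Random(seed)
    adj = defaultdict(dict)  # adj[u][v] is True if stored edge {u, v} is heavy
    reservoir = []           # light edges currently stored
    t = 0                    # light edges seen before the current one
    estimate = 0.0

    def store(u, v, heavy):
        adj[u][v] = heavy
        adj[v][u] = heavy

    def drop(u, v):
        del adj[u][v]
        del adj[v][u]

    for u, v in edges:
        # Inverse probability that one (w1) or two (w2) given light edges
        # are both still in the reservoir after t light edges.
        w1 = max(1.0, t / k)
        w2 = max(1.0, t * (t - 1) / (k * (k - 1)))
        # Each common stored neighbour closes one triangle with (u, v).
        for w in adj[u].keys() & adj[v].keys():
            n_light = (not adj[u][w]) + (not adj[v][w])
            estimate += (1.0, w1, w2)[n_light]
        if predict(u, v) >= threshold:
            store(u, v, True)
        else:
            t += 1
            if len(reservoir) < k:
                reservoir.append((u, v))
                store(u, v, False)
            elif rng.random() < k / t:
                # Standard reservoir step: evict a uniformly random slot.
                i = rng.randrange(k)
                drop(*reservoir[i])
                reservoir[i] = (u, v)
                store(u, v, False)
    return estimate


if __name__ == "__main__":
    edges = list(combinations(range(5), 2))  # complete graph on 5 nodes
    predict = degree_predictor(edges)
    # Every edge scores 4, so all are heavy and the count is exact.
    print(estimate_triangles(edges, predict, threshold=4, k=2))
```

In this sketch a triangle is counted when its last edge arrives and is weighted by the inverse probability that its other two edges are both stored. Triangles closed by heavy edges therefore get weight 1 and add no sampling variance, which is the intuition behind using a heaviness predictor.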
Sep-23-2024
- Country:
  - Europe > Italy (0.04)
  - North America > United States
    - Oregon (0.05)
    - California > Santa Clara County
      - Palo Alto (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Information Technology (0.68)
- Technology:
  - Information Technology
    - Data Science > Data Mining (1.00)
    - Communications > Social Media (0.96)
    - Artificial Intelligence
      - Machine Learning (0.68)
      - Representation & Reasoning (0.67)