Probabilistic Reasoning with LLMs for k-anonymity Estimation
Jonathan Zheng, Sauvik Das, Alan Ritter, Wei Xu
arXiv.org Artificial Intelligence
Probabilistic reasoning is a key aspect of both human and artificial intelligence that allows for handling uncertainty and ambiguity in decision-making. In this paper, we introduce a novel numerical reasoning task under uncertainty, focusing on estimating the k-anonymity of user-generated documents containing privacy-sensitive information. We propose BRANCH, which uses LLMs to factorize a joint probability distribution to estimate the k-value (the size of the population matching the given information) by modeling individual pieces of textual information as random variables. The probability of each factor occurring within a population is estimated using standalone LLMs or retrieval-augmented generation systems, and these probabilities are combined into a final k-value. Our experiments show that this method successfully estimates the correct k-value 67% of the time, an 11% increase compared to GPT-4o chain-of-thought reasoning. Additionally, we leverage LLM uncertainty to develop prediction intervals for k-anonymity, which include the correct value in nearly 92% of cases.
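As a rough illustration of how per-factor probabilities might be combined into a k-value, the sketch below multiplies a base population by a chain of probabilities, each assumed to be already conditioned on the factors before it. This is not the BRANCH implementation: the function name `estimate_k`, the population size, and the probabilities are all hypothetical, and in the paper the probabilities come from LLMs or retrieval-augmented systems rather than being hard-coded.

```python
from math import prod


def estimate_k(population: float, factor_probs: list[float]) -> float:
    """Scale a base population by the product of per-factor probabilities.

    Each entry of factor_probs is assumed to be P(x_i | x_1, ..., x_{i-1}),
    so the product is the joint probability under the chain rule.
    """
    for p in factor_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return population * prod(factor_probs)


# Hypothetical inputs, chosen only for illustration (not from the paper).
population = 1_000_000            # assumed size of the base population
factor_probs = [0.5, 0.1, 0.02]   # assumed conditional probability of each factor

k = estimate_k(population, factor_probs)
print(round(k))
```

A smaller k means fewer people match the disclosed details, so the author of the document is easier to single out.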
Mar-12-2025
- Country:
- Europe (0.92)
- North America > United States
- Minnesota > Hennepin County > Minneapolis (0.14)
- Genre:
- Research Report > New Finding (0.45)
- Industry:
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Learning Graphical Models > Directed Networks
- Bayesian Learning (0.68)
- Neural Networks > Deep Learning (1.00)
- Performance Analysis > Accuracy (0.92)
- Natural Language
- Chatbot (1.00)
- Large Language Model (1.00)
- Representation & Reasoning > Uncertainty (1.00)