AITopics | discrete distribution estimation

Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability

van der Hoeven, Dirk, Olkhovskaia, Julia, van Erven, Tim

arXiv.org Machine Learning, Jul-24-2025

We consider the problem of estimating a discrete distribution $p$ with support of size $K$ and provide both upper and lower bounds with high probability in KL divergence. We prove that in the worst case, for any estimator $\widehat{p}$, with probability at least $δ$, $\text{KL}(p \| \widehat{p}) \geq C\max\{K, \ln(K)\ln(1/δ)\}/n$, where $n$ is the sample size and $C > 0$ is a constant. We introduce a computationally efficient estimator $p^{\text{OTB}}$, based on Online to Batch conversion and suffix averaging, and show that with probability at least $1 - δ$, $\text{KL}(p \| \widehat{p}) \leq C(K\log(\log(K)) + \ln(K)\ln(1/δ))/n$. Furthermore, we also show that with sufficiently many observations relative to $\log(1/δ)$, the maximum likelihood estimator $\bar{p}$ guarantees that with probability at least $1-δ$, $$ \tfrac{1}{6} χ^2(\bar{p} \| p) \leq \tfrac{1}{4} χ^2(p \| \bar{p}) \leq \text{KL}(p \| \bar{p}) \leq C(K + \log(1/δ))/n\,, $$ where $χ^2$ denotes the $χ^2$-divergence.
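The quantities named in this abstract can be made concrete with a small sketch. It computes the maximum likelihood (empirical frequency) estimate of a discrete distribution and evaluates the KL and $χ^2$ divergences against the true distribution. It does not implement the $p^{\text{OTB}}$ estimator, whose details are not given here. All function names, the example distribution `p`, the sample size `n`, and the seed are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def empirical_distribution(samples, K):
    """Maximum likelihood estimate: relative frequency of each symbol 0..K-1."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(i, 0) / n for i in range(K)]


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * ln(p_i / q_i); infinite if q_i = 0 < p_i."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def chi2_divergence(p, q):
    """chi^2(p || q) = sum_i (p_i - q_i)^2 / q_i; infinite if q_i = 0 < p_i."""
    total = 0.0
    for pi, qi in zip(p, q):
        if qi == 0.0:
            if pi != 0.0:
                return math.inf
            continue
        total += (pi - qi) ** 2 / qi
    return total


# Hypothetical example: K = 5 symbols, n = 20000 i.i.d. draws, fixed seed.
rng = random.Random(0)
K, n = 5, 20000
p = [0.4, 0.3, 0.15, 0.1, 0.05]
samples = rng.choices(range(K), weights=p, k=n)

p_bar = empirical_distribution(samples, K)
kl = kl_divergence(p, p_bar)
chi2_forward = chi2_divergence(p, p_bar)   # chi^2(p || p_bar)
chi2_reverse = chi2_divergence(p_bar, p)   # chi^2(p_bar || p)

print("KL(p || p_bar)      =", kl)
print("chi^2(p || p_bar)/4 =", chi2_forward / 4)
print("chi^2(p_bar || p)/6 =", chi2_reverse / 6)
```

Note that the empirical estimate gives infinite KL divergence whenever a symbol with positive probability is never observed, which is one reason the small-sample, high-confidence regime is delicate.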

artificial intelligence, machine learning, probability, (15 more...)

arXiv:2507.17316

Country:

North America > United States > Massachusetts (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Concentration Bounds for Discrete Distribution Estimation in KL Divergence

Canonne, Clément L., Sun, Ziteng, Suresh, Ananda Theertha

arXiv.org Artificial Intelligence, Jun-12-2023

Discrete distribution estimation, i.e., density estimation over discrete domains, is a fundamental problem in Statistics, with a rich history (see, e.g., [9, 10] for an overview and further references). In this work, we address a simple yet surprisingly ill-understood aspect of this question: what is the sample complexity of estimating an arbitrary discrete distribution in Kullback-Leibler (KL) divergence with vanishing probability of error? To describe the problem further, a few definitions are in order.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv:2302.06869

Country:

North America > United States > New York (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Discrete Distribution Estimation under Local Privacy

Kairouz, Peter, Bonawitz, Keith, Ramage, Daniel

arXiv.org Machine Learning, Jun-15-2016

The collection and analysis of user data drives improvements in the app and web ecosystems, but comes with risks to privacy. This paper examines discrete distribution estimation under local privacy, a setting wherein service providers can learn the distribution of a categorical statistic of interest without collecting the underlying data. We present new mechanisms, including hashed K-ary Randomized Response (KRR), that empirically meet or exceed the utility of existing mechanisms at all privacy levels. New theoretical results demonstrate the order-optimality of KRR and the existing RAPPOR mechanism at different privacy regimes.
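As a rough illustration of the local privacy setting, the sketch below implements plain k-ary randomized response together with its standard unbiased frequency estimator. It is not the hashed variant mentioned in the abstract, and it is not claimed to match the paper's exact formulation. The function names, the privacy parameter `eps`, the example distribution `true_p`, and the sample size are all assumptions made for illustration.

```python
import math
import random


def krr_randomize(x, K, eps, rng):
    """Report the true symbol x with probability e^eps / (e^eps + K - 1),
    otherwise report one of the other K - 1 symbols uniformly at random."""
    keep_prob = math.exp(eps) / (math.exp(eps) + K - 1)
    if rng.random() < keep_prob:
        return x
    other = rng.randrange(K - 1)          # index among the K - 1 other symbols
    return other if other < x else other + 1


def krr_estimate(reports, K, eps):
    """Invert the randomization: if q_j is the probability of reporting j,
    then q_j = b + p_j * (a - b), so p_j = (q_j - b) / (a - b)."""
    denom = math.exp(eps) + K - 1
    a = math.exp(eps) / denom             # probability of reporting the truth
    b = 1.0 / denom                       # probability of each specific other symbol
    counts = [0] * K
    for r in reports:
        counts[r] += 1
    n = len(reports)
    return [(c / n - b) / (a - b) for c in counts]


# Hypothetical example: K = 4 categories, eps = 2, 50000 users, fixed seed.
rng = random.Random(1)
K, eps, n_users = 4, 2.0, 50000
true_p = [0.5, 0.25, 0.15, 0.1]
private_values = rng.choices(range(K), weights=true_p, k=n_users)
reports = [krr_randomize(x, K, eps, rng) for x in private_values]

estimate = krr_estimate(reports, K, eps)
print("estimated distribution:", estimate)
```

The estimate always sums to one but individual entries can be slightly negative for rare symbols or small `eps`; a practical pipeline would typically project the result back onto the probability simplex.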

artificial intelligence, discrete distribution estimation, machine learning, (13 more...)

arXiv:1602.07387

Country: North America > United States > Illinois (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
