Statistically Valid Information Bottleneck via Multiple Hypothesis Testing
Farzaneh, Amirmohammad; Simeone, Osvaldo
arXiv.org Artificial Intelligence
The information bottleneck (IB) problem is a widely studied framework in machine learning for extracting compressed features that are informative for downstream tasks. However, current approaches to solving the IB problem rely on heuristic tuning of hyperparameters and offer no guarantees that the learned features satisfy information-theoretic constraints. In this work, we introduce a statistically valid solution to this problem, referred to as IB via multiple hypothesis testing (IB-MHT), which ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset. The proposed methodology builds on Pareto testing and learn-then-test (LTT), and it wraps around existing IB solvers to provide statistical guarantees on the IB constraints. We demonstrate the performance of IB-MHT on classical and deterministic IB formulations, showing that it outperforms conventional methods in terms of statistical robustness and reliability.
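To make the testing-based idea in the abstract more concrete, the following is a minimal sketch of learn-then-test-style hyperparameter selection using fixed-sequence testing. It is not the authors' implementation. It assumes a generic loss bounded in [0, 1] standing in for the IB constraint, a simple Hoeffding p-value, and a candidate ordering fixed before the calibration data is seen (in Pareto testing, such an ordering would be derived from a separate data split). The names `hoeffding_p_value`, `fixed_sequence_test`, `alpha`, and `delta` are hypothetical and chosen for illustration only.

```python
import math


def hoeffding_p_value(mean_loss, n, alpha):
    """Hoeffding p-value for the null hypothesis 'true risk >= alpha'.

    Assumes n i.i.d. losses bounded in [0, 1] (an assumption of this sketch).
    """
    gap = max(alpha - mean_loss, 0.0)
    return math.exp(-2.0 * n * gap ** 2)


def fixed_sequence_test(candidates, losses_per_candidate, alpha, delta):
    """Test candidates in their given (pre-specified) order.

    A candidate is declared reliable when its null hypothesis is rejected
    at level delta; testing stops at the first candidate that is not rejected.
    """
    reliable = []
    for lam in candidates:
        losses = losses_per_candidate[lam]
        mean_loss = sum(losses) / len(losses)
        p = hoeffding_p_value(mean_loss, len(losses), alpha)
        if p > delta:
            break  # first failure to reject ends the sequence
        reliable.append(lam)
    return reliable


# Illustrative calibration losses (synthetic, not results from the paper).
candidates = [0.1, 0.2, 0.3]
calibration_losses = {
    0.1: [1.0] * 50 + [0.0] * 950,   # empirical risk 0.05
    0.2: [1.0] * 100 + [0.0] * 900,  # empirical risk 0.10
    0.3: [1.0] * 190 + [0.0] * 810,  # empirical risk 0.19
}
selected = fixed_sequence_test(candidates, calibration_losses, alpha=0.2, delta=0.05)
print(selected)
```

Under these assumptions, every hyperparameter returned by `fixed_sequence_test` has true risk below `alpha` simultaneously with probability at least `1 - delta`; a final choice among the returned candidates can then be made by any secondary criterion.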
Sep-11-2024
- Country:
  - Europe > United Kingdom > England > Greater London > London (0.04)
  - Asia > Middle East > Jordan (0.04)
- Genre:
  - Research Report (0.83)
- Technology: