Supplementary Materials for "Rashomon Capacity: A Metric for Predictive Multiplicity in Classification"
Neural Information Processing Systems
These supplementary materials include the omitted proofs of Propositions 1 and 2, additional explanations and discussions, details on the experimental setups and training, and additional experiments. For clarity, numbers with the prefix SM. refer to equations, figures, and tables in the supplementary material; numbers without the prefix refer to equations, figures, and tables in the main paper.

SM. 1 Omitted proofs

Proof of Proposition 1. By the information inequality [1, Theorem 2.6.3], the mutual information I(M; Y) between the random variables M and Y (defined in Section 3) is non-negative, i.e., I(M; Y) ≥ 0; this establishes the lower bound. For the upper bound, note that I(M; Y) = H(Y) − H(Y|M) ≤ H(Y) ≤ log c, since Y takes values among c classes and the uniform distribution maximizes entropy. The upper bound is attained when there exist c models in R(H, ϵ) whose output scores are the c "vertices" of the probability simplex Δ_c: placing a uniform P_M on these c models makes Y uniform over the c classes and deterministic given M, so H(Y) = log c, H(Y|M) = 0, and hence I(M; Y) = log c.

We now prove the converse statements. If I(M; Y) = 0, then M and Y are independent, i.e., every model in supp(P_M) outputs the same score; this is the equality condition for the lower bound. For the upper bound, suppose I(M; Y) = log c. Then, since I(M; Y) = H(Y) − H(Y|M), from the non-negativity of entropy and the fact that the uniform distribution maximizes entropy, H(Y) = log c and H(Y|M) = 0. Consequently, again from the non-negativity of entropy, H(Y|M = m) = 0 for all m ∈ supp(P_M), i.e., each such model assigns probability one to a single class, so its output score is a vertex of Δ_c. Since H(Y) = log c, Y is uniform over all c classes, which requires all c vertices to appear among the output scores; the result follows. (A numerical sketch for computing the capacity sup_{P_M} I(M; Y) is provided at the end of these supplementary materials.)

SM. 2.1 Predictive multiplicity: fairness, reproducibility, and security

Predictive multiplicity and the Rashomon effect are related to individual fairness [3, 4].
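As noted in the proof of Proposition 1 above, sup_{P_M} I(M; Y) is the capacity of a discrete memoryless channel whose m-th row is model m's output score on a fixed sample x, so it can be computed numerically with the standard Blahut-Arimoto iteration. The Python sketch below is illustrative rather than the paper's released implementation: the function name blahut_arimoto, the score matrix, and the tolerance defaults are our own choices, and we assume the Rashomon set is represented by a finite sample of k models.

# A minimal sketch (not the authors' released code): Blahut-Arimoto iteration
# for the channel capacity sup_{P_M} I(M; Y) on a fixed sample x, where
# W[m, y] is the score that model m assigns to class y.
import numpy as np

def blahut_arimoto(W, tol=1e-9, max_iter=1000):
    """Capacity (in bits) of a row-stochastic channel matrix W of shape (k, c)."""
    W = np.asarray(W, dtype=float)
    p = np.full(W.shape[0], 1.0 / W.shape[0])  # P_M: start from a uniform prior over models
    for _ in range(max_iter):
        q = p @ W  # output marginal: P_Y(y) = sum_m p(m) W(m, y)
        # D(m) = KL divergence between row W(m, .) and q, with the convention 0 log 0 = 0
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(W > 0, W / q, 1.0)
        D = np.sum(W * np.log2(ratio), axis=1)
        lower = np.log2(p @ np.exp2(D))  # lower bound on the capacity
        upper = D.max()                  # upper bound on the capacity
        if upper - lower < tol:
            break
        p = p * np.exp2(D)  # Blahut-Arimoto update of the input distribution
        p /= p.sum()
    return upper  # lower and upper bounds agree within tol at convergence

# Hypothetical example: three models whose output scores on x are the vertices
# of the simplex Δ_3, the equality case of Proposition 1.
scores = np.eye(3)
print(2 ** blahut_arimoto(scores))  # Rashomon Capacity, approximately 3.0

The iteration alternates between updating the output marginal and the input distribution P_M, and the gap between the two bounds certifies convergence for any finite channel. In the example above the three vertex models yield a capacity of log 3 bits, i.e., a Rashomon Capacity of 3, matching the equality condition in Proposition 1.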