linearity
bbc92a647199b832ec90d7cf57074e9e-Supplemental.pdf
Before defining our algorithm, at each iteration $t$ we first lighten our notation with the shorthand $b_a(X) = b(\hat{p}^{(t-1)}(X), a)$ (at different iterations $t$, $b_a$ denotes different functions), and $b(X)$ is the vector $(b_1(X), \dots, b_K(X))$. For the intuition of the algorithm, consider the $t$-th iteration where the current prediction function is $\hat{p}^{(t-1)}$. The statement of the theorem is identical; the proof is also essentially the same except for the use of some new technical tools. Conversely, if $\hat{p}$ is LB decision calibrated, then $\|\mathbb{E}[p^*(X) - \hat{p}(X) \mid U]\|_1 = 0$ almost surely (because if the expectation of a non-negative random variable is zero, the random variable must be zero almost surely), which implies that $\hat{p}$ is distribution calibrated. For BKa we use the VC dimension approach.
- North America > United States > Michigan (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Research Report > New Finding (0.92)
- Research Report > Experimental Study (0.67)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel (0.04)
SHAP-IQ: Unified Approximation of any-order Shapley Interactions
In explainable artificial intelligence (XAI) research, the Shapley value (SV) is predominantly applied to determine feature attributions for any black box model. Shapley interaction indices extend the SV to define any-order feature interactions. Defining a unique Shapley interaction index is an open research question and, so far, three definitions have been proposed, which differ in their choice of axioms. Moreover, each definition requires a specific approximation technique. Here, we propose SHAPley Interaction Quantification (SHAP-IQ), an efficient sampling-based approximator to compute Shapley interactions for arbitrary cardinal interaction indices (CII), i.e. interaction indices that satisfy the linearity, symmetry and dummy axioms.
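To make the quantities in this abstract concrete, the following is a minimal sketch of the exact Shapley value and the pairwise Shapley interaction index, computed by brute-force enumeration of coalitions. It is not SHAP-IQ itself, which is a sampling-based approximator; the function names and the toy game `toy_value` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, value):
    """Exact Shapley values for players 0..n-1 (enumerates all coalitions)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size that excludes player i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def pairwise_shapley_interaction(n, value, i, j):
    """Exact Shapley interaction index for the pair (i, j)."""
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(n - 1):
        weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        for coalition in combinations(others, size):
            s = frozenset(coalition)
            # Discrete second-order derivative of the value function at s.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total


# Hypothetical toy game: additive payoffs plus a bonus when 0 and 1 cooperate.
PAYOFF = [1.0, 2.0, 3.0]
BONUS = 4.0


def toy_value(coalition):
    base = sum(PAYOFF[p] for p in coalition)
    return base + (BONUS if {0, 1} <= coalition else 0.0)


phi = shapley_values(3, toy_value)
interaction_01 = pairwise_shapley_interaction(3, toy_value, 0, 1)
interaction_02 = pairwise_shapley_interaction(3, toy_value, 0, 2)
print(phi, interaction_01, interaction_02)
```

In this toy game the bonus is split evenly between players 0 and 1 in their Shapley values, while the interaction index attributes it to the pair as a whole. The enumeration is exponential in the number of players, which is the cost that sampling-based approximators aim to avoid.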
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Netherlands (0.04)
- Europe > Germany > North Rhine-Westphalia (0.04)
- Leisure & Entertainment (0.68)
- Media (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification
We study the problem of identifying the m arms with the largest means under a fixed error rate $\delta$ (fixed-confidence Top-m identification), for misspecified linear bandit models. This problem is motivated by practical applications, especially in medicine and recommendation systems, where linear models are popular due to their simplicity and the existence of efficient algorithms, but in which data inevitably deviates from linearity. In this work, we first derive a tractable lower bound on the sample complexity of any $\delta$-correct algorithm for the general Top-m identification problem. We show that knowing the scale of the deviation from linearity is necessary to exploit the structure of the problem. We then describe the first algorithm for this setting, which is both practical and adapts to the amount of misspecification. We derive an upper bound on its sample complexity that confirms this adaptivity and matches the lower bound when $\delta \rightarrow 0$. Finally, we evaluate our algorithm on both synthetic and real-world data, showing competitive performance with respect to existing baselines.
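As a small illustration of why the scale of the deviation matters, the sketch below assumes that each arm's mean is a linear function of its feature vector plus a bounded deviation term. The abstract does not spell out this form, so treat it as an assumption. All names and numbers are hypothetical, and the code only compares the Top-m set with and without the deviation; it is not the identification algorithm described above.

```python
def arm_means(features, theta, deviation):
    """Mean of each arm: linear part <x_a, theta> plus a per-arm deviation."""
    return [
        sum(x * t for x, t in zip(row, theta)) + d
        for row, d in zip(features, deviation)
    ]


def top_m(means, m):
    """Indices of the m arms with the largest means, in ascending index order."""
    order = sorted(range(len(means)), key=lambda a: means[a], reverse=True)
    return sorted(order[:m])


# Hypothetical instance: 4 arms with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
theta = [1.0, 0.8]
deviation = [-0.3, 0.0, 0.0, 0.2]  # assumed deviation from linearity

linear_means = arm_means(features, theta, [0.0] * len(features))
true_means = arm_means(features, theta, deviation)
epsilon = max(abs(d) for d in deviation)  # scale of the misspecification

print(top_m(linear_means, 2), top_m(true_means, 2), epsilon)
```

Here a deviation that is small relative to the means is still enough to change which arms belong to the Top-2 set, so a purely linear fit would target the wrong answer.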