computational tractability
On the Computational Tractability of the (Many) Shapley Values
Marzouk, Reda, Bassan, Shahaf, Katz, Guy, de la Higuera, Colin
Recent studies have examined the computational complexity of computing Shapley additive explanations (also known as SHAP) across various models and distributions, revealing their tractability or intractability in different settings. However, these studies primarily focused on a specific variant called Conditional SHAP, though many other variants exist and address different limitations. In this work, we analyze the complexity of computing a much broader range of such variants, including Conditional, Interventional, and Baseline SHAP, while exploring both local and global computations. We show that both local and global Interventional and Baseline SHAP can be computed in polynomial time for various ML models under Hidden Markov Model distributions, extending popular algorithms such as TreeSHAP beyond empirical distributions. On the downside, we prove intractability results for these variants over a wide range of neural networks and tree ensembles. We believe that our results emphasize the intricate diversity of computing Shapley values, demonstrating how their complexity is substantially shaped by both the specific SHAP variant, the model type, and the distribution.
IDSS conversations: Guy Bresler
Guy Bresler joined the MIT faculty in September 2015 as the Bonnie and Marty (1964) Tenenbaum Career Development Professor in the Department of Electrical Engineering and Computer Science (EECS). He also joined the Institute for Data, Systems, and Society (IDSS) -- which addresses complex societal challenges by advancing education and research at the intersection of statistics, data science, information and decision systems, and social sciences -- as a member of the Laboratory for Information and Decision Systems (LIDS). Bresler's research investigates the relationship between combinatorial structure and computational tractability of high-dimensional inference in graphical models and other statistical models. His current work focuses on learning graphical models from data, and explores how both data and computation requirements can be reduced if the model is subsequently used for a specific inference task. Bresler is also interested in applications of these methods, especially to recommendation systems and computational biology.
QuickNet: Maximizing Efficiency and Efficacy in Deep Architectures
We present QuickNet, a fast and accurate network architecture that is both faster and significantly more accurate than other fast deep architectures like SqueezeNet. Furthermore, it uses less parameters than previous networks, making it more memory efficient. We do this by making two major modifications to the reference Darknet model (Redmon et al, 2015): 1) The use of depthwise separable convolutions and 2) The use of parametric rectified linear units. We make the observation that parametric rectified linear units are computationally equivalent to leaky rectified linear units at test time and the observation that separable convolutions can be interpreted as a compressed Inception network (Chollet, 2016). Using these observations, we derive a network architecture, which we call QuickNet, that is both faster and more accurate than previous models. Our architecture provides at least four major advantages: (1) A smaller model size, which is more tenable on memory constrained systems; (2) A significantly faster network which is more tenable on computationally constrained systems; (3) A high accuracy of 95.7 percent on the CIFAR-10 Dataset which outperforms all but one result published so far, although we note that our works are orthogonal approaches and can be combined (4) Orthogonality to previous model compression approaches allowing for further speed gains to be realized.
Statistical Limits of Convex Relaxations
Wang, Zhaoran, Gu, Quanquan, Liu, Han
Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solve these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations. Particularly, we consider two problems: Mean estimation for sparse principal submatrix and edge probability estimation for stochastic block model. We exploit the sum-of-squares relaxation hierarchy to sharply characterize the limits of a broad class of convex relaxations. Our result shows statistical optimality needs to be compromised for achieving computational tractability using convex relaxations. Compared with existing results on computational lower bounds for statistical problems, which consider general polynomial-time algorithms and rely on computational hardness hypotheses on problems like planted clique detection, our theory focuses on a broad class of convex relaxations and does not rely on unproven hypotheses.