Not enough data to create a plot.
Try a different view from the menu above.
On Completeness-aware Concept-Based Explanations in Deep Neural Networks, Been Kim
Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior based on the assumption that complete concept scores are sufficient statistics of the model prediction. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, which addresses the limitations of existing methods on concept explanations. To define an importance score for each discovered concept, we adapt game-theoretic notions to aggregate over sets and propose ConceptSHAP. Via proposed metrics and user studies, on a synthetic dataset with apriori-known concept explanations, as well as on real-world image and language datasets, we validate the effectiveness of our method in finding concepts that are both complete in explaining the decisions and interpretable.
the questions raised by each reviewer separately. Layer Choice (R2,R3) The layer can be chosen depending on the size of the nearest neighbor patch the user would
We thank all reviewers for their constructive comments. We fixed the architecture g in our previous experiments to a two layer neural network. We discussed how to select the layer choice and its impact above. The computational cost is low since the pretrained model is fixed, and we only optimize for g and c. We didn't test on Imagenet since we can't visualize results for all 1000 classes.
A Limitation of the PAC-Bayes Framework
PAC-Bayes is a useful framework for deriving generalization bounds which was introduced by McAllester ('98). This framework has the flexibility of deriving distribution-and algorithm-dependent bounds, which are often tighter than VCrelated uniform convergence bounds. In this manuscript we present a limitation for the PAC-Bayes framework. We demonstrate an easy learning task which is not amenable to a PAC-Bayes analysis. Specifically, we consider the task of linear classification in 1D; it is well-known that this task is learnable using just O(log(1/δ)/ɛ) examples. On the other hand, we show that this fact can not be proved using a PAC-Bayes analysis: for any algorithm that learns 1-dimensional linear classifiers there exists a (realizable) distribution for which the PAC-Bayes bound is arbitrarily large.
Fair Sparse Regression with Clustering: An Invex Relaxation for a Combinatorial Problem
In this paper, we study the problem of fair sparse regression on a biased dataset where bias depends upon a hidden binary attribute. The presence of a hidden attribute adds an extra layer of complexity to the problem by combining sparse regression and clustering with unknown binary labels. The corresponding optimization problem is combinatorial, but we propose a novel relaxation of it as an invex optimization problem. To the best of our knowledge, this is the first invex relaxation for a combinatorial problem. We show that the inclusion of the debiasing/fairness constraint in our model has no adverse effect on the performance. Rather, it enables the recovery of the hidden attribute.
Teaching Language Model Agents How to Self-Improve
A central piece in enabling intelligent agentic behavior in foundation models is to make them capable of introspecting upon their behavior, reasoning, and correcting their mistakes as more computation or interaction is available. Even the strongest proprietary large language models (LLMs) do not quite exhibit the ability of continually improving their responses sequentially. In this paper, we develop RISE: Recursive IntroSpEction, an approach for fine-tuning LLMs to introduce this capability, despite prior work hypothesizing that this capability may not be possible to attain. Our approach prescribes an iterative fine-tuning procedure, which attempts to teach the model how to alter its response after having executed previously unsuccessful attempts to solve a hard test-time problem, with optionally additional environment feedback. RISE poses fine-tuning for a singleturn prompt as solving a multi-turn Markov decision process (MDP), where the initial state is the prompt. Inspired by principles in online imitation and offline reinforcement learning, we propose strategies for multi-turn data collection and training so as to imbue an LLM with the capability to recursively detect and correct its previous mistakes in subsequent iterations. Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on reasoning tasks, outperforming several single-turn strategies given an equal amount of inference-time computation. We also find that RISE scales well, often attaining larger benefits with more capable models, without disrupting one-turn abilities as a result of expressing more complex distributions.
March Madness TV setups for the ultimate viewing experience
March Madness season means it's time to upgrade your TV setup. Watching March Madness games is one of the highlights of basketball season, but if you don't have a great TV setup, you're not truly experiencing the games as much as you could. A better setup for March Madness means you can watch the games more clearly and hear everything the coaches, fans, players and refs are yelling. A new TV with crystal-clear picture and sound is a must. Or, you can upgrade the TV you currently have with a new soundbar system and a streaming device like Roku Sticks, Apple TV or an Amazon Fire Stick.
A Proofs Throughout this section, we use p(s =a) to denote the probability of the state-action pair at time step t being equal to (s, a), and the probability of a trajectory by p(τ) = p(s, a
Let's first consider the minimum for ˆV, Next, we prove the second part of the theorem regarding f. Note that, unlike the original PPO which samples mini-batches of frames, we sample on a trajectory-by-trajectory basis. For example, assume the batch size is 256 and n = 128 for the backup horizon, then each batch contains 2 128-step trajectories. C.1 Computational resources All the experiments were performed on an internal cluster of NVIDIA A100 GPUs. Training a MinAtar agent in a single environment takes less than 30 minutes (wall-clock time).
Direct Advantage Estimation Hsiao-Ru Pan Nico Gürtler 1 Alexander Neitz 2
The predominant approach in reinforcement learning is to assign credit to actions based on the expected return. However, we show that the return may depend on the policy in a way which could lead to excessive variance in value estimation and slow down learning. Instead, we show that the advantage function can be interpreted as causal effects and shares similar properties with causal representations. Based on this insight, we propose Direct Advantage Estimation (DAE), a novel method that can model the advantage function and estimate it directly from on-policy data while simultaneously minimizing the variance of the return without requiring the (action-)value function. We also relate our method to Temporal Difference methods by showing how value functions can be seamlessly integrated into DAE. The proposed method is easy to implement and can be readily adapted by modern actor-critic methods. We evaluate DAE empirically on three discrete control domains and show that it can outperform generalized advantage estimation (GAE), a strong baseline for advantage estimation, on a majority of the environments when applied to policy optimization.