Collaborating Authors

 Chen, Lijie


LatentQA: Teaching LLMs to Decode Activations Into Natural Language

arXiv.org Artificial Intelligence

Interpretability methods seek to understand language model representations, yet the outputs of most such methods -- circuits, vectors, scalars -- are not immediately human-interpretable. In response, we introduce LatentQA, the task of answering open-ended questions about model activations in natural language. Towards solving LatentQA, we propose Latent Interpretation Tuning (LIT), which finetunes a decoder LLM on a dataset of activations and associated question-answer pairs, similar to how visual instruction tuning trains on question-answer pairs associated with images. We use the decoder for diverse reading applications, such as extracting relational knowledge from representations or uncovering system prompts governing model behavior. Our decoder also specifies a differentiable loss that we use to control models, such as debiasing models on stereotyped sentences and controlling the sentiment of generations. Finally, we extend LatentQA to reveal harmful model capabilities, such as generating recipes for bioweapons and code for hacking.


Theoretical limitations of multi-layer Transformer

arXiv.org Artificial Intelligence

Transformers, especially the decoder-only variants, are the backbone of most modern large language models; yet we do not have much understanding of their expressive power except for the simple $1$-layer case. Due to the difficulty of analyzing multi-layer models, all previous work relies on unproven complexity conjectures to show limitations for multi-layer Transformers. In this work, we prove the first $\textit{unconditional}$ lower bound against multi-layer decoder-only transformers. For any constant $L$, we prove that any $L$-layer decoder-only transformer needs a polynomial model dimension ($n^{\Omega(1)}$) to perform sequential composition of $L$ functions over an input of $n$ tokens. As a consequence, our results give: (1) the first depth-width trade-off for multi-layer transformers, exhibiting that the $L$-step composition task is exponentially harder for $L$-layer models compared to $(L+1)$-layer ones; (2) an unconditional separation between encoder and decoder, exhibiting a hard task for decoders that can be solved by an exponentially shallower and smaller encoder; (3) a provable advantage of chain-of-thought, exhibiting a task that becomes exponentially easier with chain-of-thought. On the technical side, we propose the multi-party $\textit{autoregressive}$ $\textit{communication}$ $\textit{model}$ that captures the computation of a decoder-only Transformer. We also introduce a new proof technique that finds a certain $\textit{indistinguishable}$ $\textit{decomposition}$ of all possible inputs iteratively for proving lower bounds in this model. We believe our new communication model and proof technique will be helpful to further understand the computational power of transformers.
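
To make the phrase "sequential composition of $L$ functions" concrete, here is a minimal sketch. It assumes each function is given as a lookup table over a finite domain; this encoding, and the name `compose_sequentially`, are illustrative assumptions, since the abstract does not specify how the task is presented to the model.

```python
from typing import Sequence


def compose_sequentially(functions: Sequence[Sequence[int]], start: int) -> int:
    """Apply functions[0], then functions[1], and so on, to `start`.

    Each function is a lookup table over {0, ..., m-1} (an assumed encoding):
    table[x] is the value of the function at x.
    """
    value = start
    for table in functions:
        value = table[value]
    return value


# Hypothetical example with L = 2 functions over the domain {0, 1, 2}.
f1 = [1, 2, 0]
f2 = [2, 0, 1]
result = compose_sequentially([f1, f2], start=0)  # f2(f1(0)) = f2(1) = 0
```

Each step depends on the output of the previous one, which is the inherently sequential structure that the lower bound exploits.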


On Distributed Differential Privacy and Counting Distinct Elements

arXiv.org Machine Learning

We study the setup where each of $n$ users holds an element from a discrete set, and the goal is to count the number of distinct elements across all users, under the constraint of $(\epsilon, \delta)$-differential privacy:
- In the non-interactive local setting, we prove that the additive error of any protocol is $\Omega(n)$ for any constant $\epsilon$ and for any $\delta$ inverse polynomial in $n$.
- In the single-message shuffle setting, we prove a lower bound of $\Omega(n)$ on the error for any constant $\epsilon$ and for some $\delta$ inverse quasi-polynomial in $n$. We do so by building on the moment-matching method from the literature on distribution estimation.
- In the multi-message shuffle setting, we give a protocol with at most one message per user in expectation and with an error of $\tilde{O}(\sqrt{n})$ for any constant $\epsilon$ and for any $\delta$ inverse polynomial in $n$. Our protocol is also robustly shuffle private, and our error of $\sqrt{n}$ matches a known lower bound for such protocols.
Our proof technique relies on a new notion that we call dominated protocols, which can also be used to obtain the first non-trivial lower bounds against multi-message shuffle protocols for the well-studied problems of selection and learning parity. Our first lower bound for estimating the number of distinct elements provides the first $\omega(\sqrt{n})$ separation between global sensitivity and error in local differential privacy, thus answering an open question of Vadhan (2017). We also provide a simple construction that gives $\tilde{\Omega}(n)$ separation between global sensitivity and error in two-party differential privacy, thereby answering an open question of McGregor et al. (2011).
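
For context on the "global sensitivity" mentioned above: changing one user's element changes the distinct count by at most 1, so the count has global sensitivity 1. The sketch below uses the standard Laplace mechanism in the central (trusted curator) model. This is a baseline swapped in for illustration and is not one of the paper's local or shuffle protocols; the function names are hypothetical.

```python
import random
from typing import Hashable, Iterable, Optional


def count_distinct(elements: Iterable[Hashable]) -> int:
    """Exact (non-private) number of distinct elements."""
    return len(set(elements))


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two i.i.d.
    exponential variables, each with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def central_dp_count_distinct(
    elements: Iterable[Hashable],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Central-model baseline (an assumption, not the paper's protocol):
    add Laplace noise with scale sensitivity / epsilon, where sensitivity = 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return count_distinct(elements) + laplace_noise(1.0 / epsilon, rng)


noisy = central_dp_count_distinct(["a", "b", "a", "c"], epsilon=1.0,
                                  rng=random.Random(0))
```

With a trusted curator the error is on the order of $1/\epsilon$, which is the contrast behind the separations stated above: in the local setting the error is $\Omega(n)$ even though the sensitivity is 1.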


Nearly Optimal Sampling Algorithms for Combinatorial Pure Exploration

arXiv.org Machine Learning

We study the combinatorial pure exploration problem Best-Set in stochastic multi-armed bandits. In a Best-Set instance, we are given $n$ arms with unknown reward distributions, as well as a family $\mathcal{F}$ of feasible subsets over the arms. Our goal is to identify the feasible subset in $\mathcal{F}$ with the maximum total mean using as few samples as possible. The problem generalizes the classical best arm identification problem and the top-$k$ arm identification problem, both of which have attracted significant attention in recent years. We provide a novel instance-wise lower bound for the sample complexity of the problem, as well as a nontrivial sampling algorithm, matching the lower bound up to a factor of $\ln|\mathcal{F}|$. For an important class of combinatorial families, we also provide a polynomial-time implementation of the sampling algorithm, using the equivalence of separation and optimization for convex programs and approximate Pareto curves in multi-objective optimization. We also show that the $\ln|\mathcal{F}|$ factor is inevitable in general through a nontrivial lower bound construction. Our results significantly improve on previous results for several important combinatorial constraints, and provide a tighter understanding of the general Best-Set problem. We further introduce an even more general problem, formulated in geometric terms. We are given $n$ Gaussian arms with unknown means and unit variance. Consider the $n$-dimensional Euclidean space $\mathbb{R}^n$, and a collection $\mathcal{O}$ of disjoint subsets. Our goal is to determine the subset in $\mathcal{O}$ that contains the $n$-dimensional vector of the means. The problem generalizes most pure exploration bandit problems studied in the literature. We provide the first nearly optimal sample complexity upper and lower bounds for the problem.


Towards Instance Optimal Bounds for Best Arm Identification

arXiv.org Machine Learning

In the classical best arm identification (Best-$1$-Arm) problem, we are given $n$ stochastic bandit arms, each associated with a reward distribution with an unknown mean. We would like to identify the arm with the largest mean with probability at least $1-\delta$, using as few samples as possible. Understanding the sample complexity of Best-$1$-Arm has attracted significant attention over the last decade. However, the exact sample complexity of the problem is still unknown. Recently, Chen and Li made the gap-entropy conjecture concerning the instance sample complexity of Best-$1$-Arm. Given an instance $I$, let $\mu_{[i]}$ be the $i$th largest mean and $\Delta_{[i]}=\mu_{[1]}-\mu_{[i]}$ be the corresponding gap. $H(I)=\sum_{i=2}^n\Delta_{[i]}^{-2}$ is the complexity of the instance. The gap-entropy conjecture states that $\Omega\left(H(I)\cdot\left(\ln\delta^{-1}+\mathsf{Ent}(I)\right)\right)$ is an instance lower bound, where $\mathsf{Ent}(I)$ is an entropy-like term determined by the gaps, and there is a $\delta$-correct algorithm for Best-$1$-Arm with sample complexity $O\left(H(I)\cdot\left(\ln\delta^{-1}+\mathsf{Ent}(I)\right)+\Delta_{[2]}^{-2}\ln\ln\Delta_{[2]}^{-1}\right)$. If the conjecture is true, we would have a complete understanding of the instance-wise sample complexity of Best-$1$-Arm. We make significant progress towards the resolution of the gap-entropy conjecture. For the upper bound, we provide a highly nontrivial algorithm which requires \[O\left(H(I)\cdot\left(\ln\delta^{-1} +\mathsf{Ent}(I)\right)+\Delta_{[2]}^{-2}\ln\ln\Delta_{[2]}^{-1}\mathrm{polylog}(n,\delta^{-1})\right)\] samples in expectation. For the lower bound, we show that for any Gaussian Best-$1$-Arm instance with gaps of the form $2^{-k}$, any $\delta$-correct monotone algorithm requires $\Omega\left(H(I)\cdot\left(\ln\delta^{-1} + \mathsf{Ent}(I)\right)\right)$ samples in expectation.
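
The gaps $\Delta_{[i]}$ and the instance complexity $H(I)$ defined above can be computed directly when the means are known. The sketch below is only an illustration of those two definitions; the function names are hypothetical, and $\mathsf{Ent}(I)$ is omitted because the abstract does not define it.

```python
from typing import List, Sequence


def gaps(means: Sequence[float]) -> List[float]:
    """Return Delta_[i] = mu_[1] - mu_[i] for i = 2..n,
    where mu_[i] is the i-th largest mean."""
    ordered = sorted(means, reverse=True)
    return [ordered[0] - m for m in ordered[1:]]


def instance_complexity(means: Sequence[float]) -> float:
    """H(I) = sum_{i=2}^n Delta_[i]^{-2}; requires a unique best arm."""
    deltas = gaps(means)
    if any(d <= 0 for d in deltas):
        raise ValueError("the best arm must be unique (all gaps positive)")
    return sum(d ** -2 for d in deltas)


# Means (1.0, 0.5, 0.0) give gaps 0.5 and 1.0, so H(I) = 4 + 1 = 5.
h = instance_complexity([1.0, 0.5, 0.0])
```

Small gaps dominate $H(I)$: halving a gap quadruples its contribution, which is why instances with a close runner-up need many more samples.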


Bounded Rationality of Restricted Turing Machines

AAAI Conferences

Bounded rationality aims to understand how limited rationality affects decision-making. The traditional models in game theory and multiagent system research, such as finite automata or unrestricted Turing machines, fall short of capturing how intelligent agents make decisions in realistic applications. To address this problem, we model bounded rational agents as restricted Turing machines, with restrictions on running time and on storage space. We study our model in the context of two-person repeated games. In the case where the running time of the Turing machines is restricted, we show that computing the best response to a given strategy is much harder than computing the strategy itself. In the case where the storage space of the Turing machines is restricted, we show that the best response to a space-restricted strategy cannot be implemented by machines of the same size (up to a constant factor). Finally, we study how these restrictions affect the set of Nash equilibria in infinitely repeated games. We show that restricting the agent’s computational resources gives rise to new Nash equilibria.


Nearly Instance Optimal Sample Complexity Bounds for Top-k Arm Selection

arXiv.org Machine Learning

In the Best-$k$-Arm problem, we are given $n$ stochastic bandit arms, each associated with an unknown reward distribution. We are required to identify the $k$ arms with the largest means by taking as few samples as possible. In this paper, we make progress towards a complete characterization of the instance-wise sample complexity bounds for the Best-$k$-Arm problem. On the lower bound side, we obtain a novel complexity term to measure the sample complexity that every Best-$k$-Arm instance requires. This is derived by an interesting and nontrivial reduction from the Best-$1$-Arm problem. We also provide an elimination-based algorithm that matches the instance-wise lower bound within doubly-logarithmic factors. The sample complexity of our algorithm strictly dominates the state-of-the-art for Best-$k$-Arm (modulo constant factors).
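
As a point of reference for the problem statement, the sketch below implements a naive uniform-sampling baseline for Best-$k$-Arm. It is not the paper's elimination-based algorithm and has no instance-wise guarantee. The Gaussian unit-variance rewards, the simulated arms, and the function name are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def uniform_sampling_top_k(
    means: Sequence[float],
    k: int,
    samples_per_arm: int,
    rng: random.Random,
) -> List[int]:
    """Naive baseline: pull every simulated arm the same number of times and
    return the (sorted) indices of the k largest empirical means.

    `means` are the true means of the simulated arms; they are used only to
    draw rewards, never read directly by the selection step.
    """
    if not 0 < k <= len(means):
        raise ValueError("k must satisfy 0 < k <= number of arms")
    empirical = []
    for mu in means:
        total = sum(rng.gauss(mu, 1.0) for _ in range(samples_per_arm))
        empirical.append(total / samples_per_arm)
    ranked = sorted(range(len(means)), key=lambda i: empirical[i], reverse=True)
    return sorted(ranked[:k])


top2 = uniform_sampling_top_k([0.9, 0.8, 0.1, 0.0], k=2,
                              samples_per_arm=2000, rng=random.Random(0))
```

Uniform sampling spends the same budget on every arm, including those far from the top-$k$ boundary. Elimination-based methods aim to avoid that waste by discarding such arms early.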