 blackwell


A Unifying Framework for Unsupervised Concept Extraction

arXiv.org Machine Learning

Techniques for concept extraction, such as sparse autoencoders and transcoders, aim to extract high-level symbolic concepts from low-level nonsymbolic representations. When these extracted concepts are used for downstream tasks such as model steering and unlearning, it is essential to understand their guarantees, or lack thereof. In this work, we present a unified theoretical framework for unsupervised concept extraction, in which we frame the task of concept extraction as identifying a generative model. We present a general meta-theorem for identifiability, which reduces the problem of establishing identifiability guarantees to the problem of characterizing the intersection of two sets. As we demonstrate on a range of widely-used approaches, this meta-theorem substantially simplifies the task of proving such guarantees, thus paving the way for the development of new, principled approaches for concept extraction.



Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving

Neural Information Processing Systems

We develop new parameter-free and scale-free algorithms for solving convex-concave saddle-point problems. Our results are based on a new simple regret minimizer, the Conic Blackwell Algorithm+ (CBA+), which attains O(1/√T) average regret. Intuitively, our approach generalizes to other decision sets of interest ideas from the Counterfactual Regret minimization (CFR+) algorithm, which has very strong practical performance for solving sequential games on simplexes. We show how to implement CBA+ for the simplex, ℓ_p norm balls, and ellipsoidal confidence regions in the simplex, and we present numerical experiments for solving matrix games and distributionally robust optimization problems. Our empirical results show that CBA+ is a simple algorithm that outperforms state-of-the-art methods on synthetic data and real data instances, without the need for any choice of step sizes or other algorithmic parameters.
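To make the simplex setting mentioned above concrete, here is a minimal sketch of regret matching+ (the regret minimizer underlying CFR+) run in self-play on a small zero-sum matrix game. It is not the paper's CBA+ algorithm. The payoff matrix `GAME`, the function names, and the iteration count are illustrative assumptions. Like the methods the abstract describes, this update has no step size to tune.

```python
# Sketch only: regret matching+ on the simplex, NOT the paper's CBA+.
# GAME and all function names are hypothetical, chosen for illustration.

GAME = [  # row player's payoff; the column player receives the negative
    [0.0, -1.0, 2.0],
    [1.0, 0.0, -1.0],
    [-2.0, 1.0, 0.0],
]


def _strategy(clipped_regrets):
    """Play proportionally to positive regrets; uniform if there are none."""
    total = sum(clipped_regrets)
    if total <= 0.0:
        return [1.0 / len(clipped_regrets)] * len(clipped_regrets)
    return [r / total for r in clipped_regrets]


def regret_matching_plus(payoff, iterations=10000):
    """Return the uniformly averaged strategies of both players."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    q_row, q_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x, y = _strategy(q_row), _strategy(q_col)
        # Value of each pure action against the opponent's current mix.
        row_vals = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_vals = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                    for j in range(n_cols)]
        row_ev = sum(xi * v for xi, v in zip(x, row_vals))
        col_ev = sum(yj * v for yj, v in zip(y, col_vals))
        # The "+" step: accumulate instantaneous regret, then clip at zero.
        q_row = [max(0.0, q + v - row_ev) for q, v in zip(q_row, row_vals)]
        q_col = [max(0.0, q + v - col_ev) for q, v in zip(q_col, col_vals)]
        sum_row = [s + xi for s, xi in zip(sum_row, x)]
        sum_col = [s + yj for s, yj in zip(sum_col, y)]
    return ([s / iterations for s in sum_row],
            [s / iterations for s in sum_col])


def duality_gap(payoff, x, y):
    """Best-response value for the row player minus that for the column player."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col


if __name__ == "__main__":
    x_avg, y_avg = regret_matching_plus(GAME)
    print("average strategies:", x_avg, y_avg)
    print("duality gap:", duality_gap(GAME, x_avg, y_avg))
```

The duality gap of the averaged strategies is bounded by the sum of the two players' average regrets, so a shrinking gap is the practical sign that the saddle point is being approached.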


The Theorems of Dr. David Blackwell and Their Contributions to Artificial Intelligence

arXiv.org Machine Learning

Dr. David Blackwell was a mathematician and statistician of the first rank, whose contributions to statistical theory, game theory, and decision theory predated many of the algorithmic breakthroughs that define modern artificial intelligence. This survey examines three of his most consequential theoretical results: the Rao-Blackwell theorem, the Blackwell Approachability theorem, and the Blackwell Informativeness theorem (comparison of experiments). It traces their direct influence on contemporary AI and machine learning. We show that these results, developed primarily in the 1940s and 1950s, remain technically live across modern subfields including Markov Chain Monte Carlo inference, autonomous mobile robot navigation (SLAM), generative model training, no-regret online learning, reinforcement learning from human feedback (RLHF), large language model alignment, and information design. NVIDIA's 2024 decision to name their flagship GPU architecture (Blackwell) provides vivid testament to his enduring relevance. We also document an emerging frontier: explicit Rao-Blackwellized variance reduction in LLM RLHF pipelines, recently proposed but not yet standard practice. Together, Blackwell's theorems form a unified framework addressing information compression, sequential decision making under uncertainty, and the comparison of information sources, which are precisely the problems at the core of modern AI.
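As a small illustration of the variance reduction the Rao-Blackwell theorem provides, the sketch below works through a textbook case that is not taken from the survey: estimating the success probability p from n independent Bernoulli(p) draws. The crude unbiased estimator is the first draw alone; conditioning it on the sufficient statistic (the number of successes S) gives S/n. The function name and the choice p = 3/10, n = 5 are assumptions made for the example, and the moments are computed exactly by enumeration rather than by simulation.

```python
# Sketch only: exact comparison of a crude estimator with its
# Rao-Blackwellized version for i.i.d. Bernoulli(p) samples.
from fractions import Fraction
from itertools import product


def estimator_moments(p, n):
    """Return {name: (mean, variance)} for X_1 and for E[X_1 | S] = S / n."""
    first = {"crude": Fraction(0), "rao_blackwell": Fraction(0)}
    second = {"crude": Fraction(0), "rao_blackwell": Fraction(0)}
    for sample in product((0, 1), repeat=n):
        successes = sum(sample)
        # Probability of this exact sequence of outcomes.
        weight = p ** successes * (1 - p) ** (n - successes)
        values = {
            "crude": Fraction(sample[0]),               # uses one draw only
            "rao_blackwell": Fraction(successes, n),    # conditioned on S
        }
        for name, value in values.items():
            first[name] += weight * value
            second[name] += weight * value * value
    return {name: (first[name], second[name] - first[name] ** 2)
            for name in first}


if __name__ == "__main__":
    for name, (mean, var) in estimator_moments(Fraction(3, 10), 5).items():
        print(name, "mean =", mean, "variance =", var)
```

Both estimators have mean p, but the conditioned one has variance p(1 - p)/n instead of p(1 - p), which is the general pattern the theorem guarantees: conditioning on a sufficient statistic never increases variance.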






Jensen Huang Says Nvidia's New Vera Rubin Chips Are in 'Full Production'

WIRED

The chip giant says Vera Rubin will sharply cut the cost of training and running AI models, strengthening the appeal of its integrated computing platform. Nvidia CEO Jensen Huang says that the company's next-generation AI superchip platform, Vera Rubin, is on schedule to begin arriving to customers later this year. "Today, I can tell you that Vera Rubin is in full production," Huang said during a press event on Monday at the annual CES technology trade show in Las Vegas. Rubin will cut the cost of running AI models to about one-tenth of Nvidia's current leading chip system, Blackwell, the company told analysts and journalists during a call on Sunday. Nvidia also said Rubin can train certain large models using roughly one-fourth as many chips as Blackwell requires.


Evaluating the Ability of Large Language Models to Reason about Cardinal Directions, Revisited

arXiv.org Artificial Intelligence

We investigate the abilities of 28 Large Language Models (LLMs) to reason about cardinal directions (CDs) using a benchmark generated from a set of templates, extensively testing an LLM's ability to determine the correct CD given a particular scenario. The templates allow for a number of degrees of variation, such as the means of locomotion of the agent involved and whether the scenario is set in the first, second, or third person. Even the newer Large Reasoning Models are unable to reliably determine the correct CD for all questions. This paper summarises and extends earlier work presented at COSIT-24.