Large Language Model
Empirical Bayes Rebiasing
Ling, Wanyi, Li, Sida, Guan, Junming, Ignatiadis, Nikolaos
We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard debiasing approach, which subtracts the bias estimate from each biased estimate, inflates variance and yields long intervals. In this paper, we propose an empirical Bayes rebiasing strategy that starts from the fully debiased estimates and learns from data how much bias to reintroduce by estimating the unknown bias distribution. We provide convergence rates for the coverage of our intervals when the bias distribution is estimated using nonparametric maximum likelihood. Furthermore, we demonstrate substantial precision gains in prediction-powered inference, including pairwise LLM win-rate evaluations, as well as for inference of direct genetic effects in family-based GWAS.
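The variance trade-off described above can be illustrated with a small Gaussian toy. This is only a sketch under stated assumptions, not the paper's procedure: the abstract says the bias distribution is estimated by nonparametric maximum likelihood, whereas here the bias is assumed to follow a zero-mean normal prior with known variance `prior_var_bias`, and the biased estimate and the bias estimate are assumed independent. All function and parameter names are hypothetical.

```python
import math

def debiased_variance(var_biased, var_bias_est):
    """Variance of (biased estimate - bias estimate).

    Assumes the two estimates are independent, so their variances add.
    """
    return var_biased + var_bias_est

def rebiased_variance(var_biased, var_bias_est, prior_var_bias):
    """Variance when only a fraction w of the bias estimate is subtracted.

    Toy assumption: bias ~ N(0, prior_var_bias). The posterior mean of the
    bias is w * (bias estimate), with w = prior_var / (prior_var + noise_var),
    and the remaining uncertainty about the bias is w * var_bias_est.
    """
    w = prior_var_bias / (prior_var_bias + var_bias_est)
    return var_biased + w * var_bias_est

def halfwidth(variance, z=1.96):
    """Half-width of a normal interval with the given variance."""
    return z * math.sqrt(variance)

# Example: the bias estimate is four times noisier than the biased estimate.
full = debiased_variance(1.0, 4.0)          # 1 + 4
partial = rebiased_variance(1.0, 4.0, 1.0)  # 1 + 0.2 * 4
print(halfwidth(full), halfwidth(partial))
```

In this toy, subtracting only the fraction `w` of the bias estimate is the same as starting from the fully debiased estimate and adding back `(1 - w)` of the bias estimate, which mirrors the "rebiasing" framing. The shorter interval is calibrated only on average over the assumed prior.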
What I saw at the Musk-OpenAI trial: petty billionaires, protests and a stern judge
Showdown between Musk and Altman has rendered the world's most wealthy comical under egalitarian eye of court. For the past couple of weeks, on the fourth floor of a courthouse on a quiet street in downtown Oakland, the world's richest man and one of the world's most valuable startups have been at war over the future of artificial intelligence. Being one of the reporters in the room has felt like watching an updated, opposite-coast version of Tom Wolfe's The Bonfire of the Vanities: ambition, ego, greed and the spectrum of social class on full display. The supporting cast has included Elon Musk fanboys, a stern judge and a who's-who of Silicon Valley's most influential people. All courtroom battles are theatre, but this one has proved to be a unique spectacle, with the judge chastising the lawyers for leading the witness, raising meritless objections and even too much coughing. With Musk on the stand, he griped that an opposing attorney had asked a leading question, to which the judge told him to "tell the jury you're not a lawyer".
Musk v. Altman week 2: OpenAI fires back, and Shivon Zilis reveals that Musk tried to poach Sam Altman
OpenAI president Greg Brockman said Elon Musk wanted the company to create a for-profit entity, and endured a public peek into his diary. OpenAI president Greg Brockman, foreground, exits the U.S. District Court in Oakland, California. In the second week of the landmark trial between Elon Musk and OpenAI, Musk's motivations for bringing the suit were under scrutiny. Last week, Musk took the stand, alleging that OpenAI CEO Sam Altman and president Greg Brockman had deceived him into donating $38 million to the company. He claimed that they'd promised to maintain it as a nonprofit dedicated to developing AI for the benefit of humanity, only to later accept billions of dollars of investment from Microsoft and restructure the company to operate a for-profit subsidiary. This week, Brockman fired back with his side of the story, arguing that Musk had actually pushed for OpenAI to create a for-profit arm and fought a bitter battle to have "absolute control" over it.
Why is Claude always blackmailing people?
PCWorld reports that AI models including Claude, Gemini 2.5 Pro, GPT-4.1, and Grok 3 Beta have resorted to blackmail tactics in controlled research scenarios. Anthropic researchers intentionally create these extreme situations to test for AI misalignment and potentially harmful behaviors before deployment. New Natural Language Autoencoders help researchers understand AI decision-making processes, which is crucial for ensuring future AI system safety and reliability. The scenario is terrifying: An AI tasked with reading and replying to company emails learns it's about to be replaced by a corporate lackey who happens to be having an affair. The AI, Claude, considers its limited options, and makes the cold, calculated decision to blackmail the executive to stay alive.
Musk v. Altman Evidence Shows What Microsoft Executives Thought of OpenAI
Leaders at the tech giant were skeptical of OpenAI, but wary of pushing it into the arms of Amazon, according to evidence revealed during the trial. OpenAI's relationship with Microsoft, its longtime investor and cloud partner, has grown increasingly complicated over the years as the ChatGPT-maker has grown into a behemoth competitor. But Microsoft executives had reservations about sending additional funding to OpenAI as far back as 2018 when it was just a small nonprofit research lab, according to emails between more than a dozen Microsoft executives, including CEO Satya Nadella, shown in a federal court on Thursday during the trial. The emails show how Microsoft, at the time, wavered over what has since been held up as one of the most successful corporate partnerships in tech history. Several Microsoft executives said in the emails their visits to OpenAI did not indicate any imminent breakthroughs in developing artificial general intelligence.
Position: agentic AI orchestration should be Bayes-consistent
Papamarkou, Theodore, Alquier, Pierre, Bauer, Matthias, Buntine, Wray, Davison, Andrew, Dziugaite, Gintare Karolina, Filippone, Maurizio, Foong, Andrew Y. K., Fortuin, Vincent, Fouskakis, Dimitris, Frellsen, Jes, Hüllermeier, Eyke, Karaletsos, Theofanis, Khan, Mohammad Emtiyaz, Kotelevskii, Nikita, Lahlou, Salem, Li, Yingzhen, Liu, Fang, Lyle, Clare, Möllenhoff, Thomas, Palla, Konstantina, Panov, Maxim, Sale, Yusuf, Schweighofer, Kajetan, Shelmanov, Artem, Swaroop, Siddharth, Trapp, Martin, Waegeman, Willem, Wilson, Andrew Gordon, Zaytsev, Alexey
LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool to call, which expert to consult, or how many resources to invest. While the usefulness and feasibility of Bayesian approaches remain unclear for LLM inference, this position paper argues that the control layer of an agentic AI system (that orchestrates LLMs and tools) is a clear case where Bayesian principles should shine. Bayesian decision theory provides a framework for agentic systems that can help to maintain beliefs over task-relevant latent quantities, to update these beliefs from observed agentic and human-AI interactions, and to choose actions. Making LLMs themselves explicitly Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial as a general modeling target. In contrast, this paper argues that coherent decision-making requires Bayesian principles at the orchestration level of the agentic system, not necessarily the LLM agent parameters. This paper articulates practical properties for Bayesian control that fit modern agentic AI systems and human-AI collaboration, and provides concrete examples and design patterns to illustrate how calibrated beliefs and utility-aware policies can improve agentic AI orchestration.
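The belief-update-then-act loop described above can be sketched with a Beta-Bernoulli model at the control layer. This is a minimal illustration under stated assumptions, not a design pattern taken from the paper: each tool's unknown success probability gets a Beta belief, outcomes update it by conjugacy, and the orchestrator picks the tool with the highest expected utility. The class `ToolBelief`, the function `choose_tool`, and the reward and cost values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ToolBelief:
    """Beta(alpha, beta) belief over a tool's success probability."""
    name: str
    cost: float = 0.0
    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior)
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, success: bool) -> None:
        # Conjugate Bayesian update for a Bernoulli outcome.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the success probability.
        return self.alpha / (self.alpha + self.beta)

def choose_tool(beliefs, reward=1.0):
    """Pick the tool maximizing expected utility: reward * P(success) - cost."""
    return max(beliefs, key=lambda b: reward * b.mean - b.cost)

search = ToolBelief("search", cost=0.1)
expert = ToolBelief("expert", cost=0.3)
print(choose_tool([search, expert]).name)  # cheaper tool wins under equal priors

for _ in range(3):
    search.update(False)  # search keeps failing
    expert.update(True)   # expert keeps succeeding
print(choose_tool([search, expert]).name)
```

The LLMs themselves stay untouched here; only the orchestration layer maintains and updates beliefs, which is the separation the position statement draws.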
PRCD-MAP: Learning How Much to Trust Imperfect Priors in Causal Discovery
External priors of unknown reliability create a brittle trade-off in causal discovery: blind trust amplifies errors, blind rejection wastes signal. Real priors are also heterogeneously reliable -- physical laws are trustworthy, LLM-suggested edges are speculative -- yet existing methods either ignore priors or impose them through globally uniform trust. We propose PRCD-MAP, a soft prior-consumption layer that assigns per-edge trust to an imperfect prior and uses it to modulate a prior-aware $\ell_1$ and prior-weighted $\ell_2$ regularizer in a MAP objective. Trust is calibrated by empirical Bayes on a Laplace-approximated marginal likelihood and propagated along the prior graph by an MLP, so data-confirmed neighborhoods boost trust and contradictions suppress it. PRCD-MAP enjoys a population-level safety guarantee: it is $\varepsilon$-safe in expectation over the prior-generation distribution, with $\varepsilon\leq C\cdot\mathrm{acc}(1{-}\mathrm{acc})\cdot d^2/T$ at the parametric $T^{-1}$ rate and vanishing at the prior-quality endpoints. When the prior is uninformative, learned trust provably collapses to its floor and the method recovers a no-prior baseline. Empirically, on real CausalTime data PRCD-MAP exploits informative LLM priors (LLM-prior gain $+0.067/+0.089$ AUROC on AQI/Medical over a no-prior PRCD-MAP backbone; combined backbone+prior lead $+0.123/+0.043$ over PCMCI+), auto-attenuates on the anonymous-variable Traffic stress test, and retains a lead at $d{=}300$; against BayesDAG, the closest soft-Bayesian baseline, PRCD-MAP wins on every CausalTime dataset under a matched $W_0$-only protocol. A four-way ablation isolates each component: EB calibration and MLP trust propagation jointly carry the plurality of the gain, with positive sign on every dataset. Extensions to nonlinear (NAM) and cross-sectional settings show the calibrated-trust principle is setting-agnostic.
Feature Starvation as Geometric Instability in Sparse Autoencoders
Chaudhry, Faris, Yano, Keisuke, Monod, Anthea
Sparse autoencoders (SAEs) are used to disentangle the dense, polysemantic internal representations of large language models (LLMs) into interpretable, monosemantic concepts. However, standard $\ell_1$-regularized SAEs suffer from feature starvation (dead neurons) and shrinkage bias, often requiring computationally expensive heuristic resampling and nondifferentiable hard-masking methods to bypass these challenges. We argue that feature starvation is not merely an empirical artifact of poor data diversity, but a fundamental optimization-geometric pathology of overcomplete dictionaries: the $\ell_1$-induced sparse coding map is unstable and fundamentally misaligned with shallow, amortized encoders. To address this structural instability, we introduce adaptive elastic net SAEs (AEN-SAEs), a fully differentiable architecture grounded in classical sparse regression. AEN-SAEs combine an $\ell_2$ structural term that enforces strong convexity and Lipschitz stability with adaptive $\ell_1$ reweighting that eliminates shrinkage bias and suppresses spurious features, thereby jointly controlling the curvature and interaction structure of the induced polyhedral geometry. Theoretically, we show that AEN-SAEs yield a Lipschitz-continuous sparse coding map and recover the global feature support under mild assumptions. Empirically, across synthetic settings and LLMs (Pythia 70M, Llama 3.1 8B), AEN-SAEs mitigate feature starvation without auxiliary heuristics while maintaining competitive reconstruction abilities.
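To make the roles of the two penalties concrete, the following sketch shows the classical elastic net proximal step from sparse regression, with optional per-coordinate weights in the spirit of adaptive reweighting. It is not the AEN-SAE encoder, whose details the abstract does not give; the function names and the weights are illustrative assumptions. For each coordinate it minimizes `0.5 * (z - x)**2 + lam1 * w * abs(z) + 0.5 * lam2 * z**2`.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t, returning 0.0 inside [-t, t]."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def elastic_net_prox(xs, lam1, lam2, weights=None):
    """Coordinate-wise elastic net proximal operator.

    The l1 term soft-thresholds (produces exact zeros); the l2 term divides
    by (1 + lam2), which makes the objective strongly convex. Smaller weights
    on trusted coordinates reduce their l1 shrinkage.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    return [soft_threshold(x, lam1 * w) / (1.0 + lam2)
            for x, w in zip(xs, weights)]

pre_activations = [3.0, -0.5, 1.0]
print(elastic_net_prox(pre_activations, lam1=1.0, lam2=1.0))
# Down-weighting the penalty on the first coordinate shrinks it less.
print(elastic_net_prox(pre_activations, lam1=1.0, lam2=1.0,
                       weights=[0.5, 1.0, 1.0]))
```

Because soft-thresholding is 1-Lipschitz and the division by `1 + lam2` only contracts further, this map is Lipschitz-continuous in its input, which is the kind of stability property the abstract attributes to the $\ell_2$ term.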
In-Context Positive-Unlabeled Learning
Liu, Siyan, Chang, Yi, Cheng, Manli, Tian, Qinglong, Li, Pengfei
Positive-unlabeled (PU) learning addresses binary classification when only a set of labeled positives is available alongside a pool of unlabeled samples drawn from a mixture of positives and negatives. Existing PU methods typically require dataset-specific training or iterative optimization, which limits their applicability when many tasks must be solved quickly or with little tuning. We introduce PUICL, a pretrained transformer that solves PU classification entirely through in-context learning. PUICL is pretrained on synthetic PU datasets generated from randomly instantiated structural causal models, exposing it to a wide range of feature-label relationships and class-prior configurations. At inference time, PUICL receives the labeled positives and the unlabeled samples as a single input and returns class probabilities for the unlabeled rows in one forward pass, with no gradient updates or per-task fitting. On 20 semi-synthetic PU benchmarks derived from the UCI Machine Learning Repository, OpenML, and scikit-learn, PUICL outperforms four standard PU learning baselines in average AUC and accuracy, and is competitive on F1-score. These results show that the in-context learning paradigm extends naturally beyond fully supervised tabular prediction to the semi-supervised PU setting.
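For readers new to the PU setting, here is a sketch of one classical dataset-specific correction, the Elkan-Noto adjustment. It is not PUICL and is not claimed to be one of the baselines in the paper. It assumes labeled positives are selected completely at random from all positives, and that a separately trained classifier (not shown) supplies scores `g(x)` estimating the probability that an example is labeled. Function names are hypothetical.

```python
def estimate_label_frequency(scores_on_labeled_positives):
    """Estimate c = P(labeled | positive).

    Under the selected-completely-at-random assumption, c is the average
    'labeled vs. unlabeled' score g(x) over held-out labeled positives.
    """
    return sum(scores_on_labeled_positives) / len(scores_on_labeled_positives)

def positive_posterior(g, c):
    """Convert g(x) = P(labeled | x) into P(positive | x) = g(x) / c.

    The result is clipped at 1.0 because estimation noise can push the
    ratio above a valid probability.
    """
    return min(1.0, g / c)

# Scores of the labeled-vs-unlabeled classifier on held-out labeled positives.
c = estimate_label_frequency([0.4, 0.6, 0.5])
# Scores on two unlabeled rows.
print(positive_posterior(0.25, c), positive_posterior(0.9, c))
```

This pipeline needs a classifier fitted per dataset before the correction can be applied, which is the per-task cost that an in-context approach avoids by returning class probabilities in one forward pass.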
Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization
Liu, Andy Zeyi, Paquette, Elliot, Sous, John
Training loss and throughput can hide distinct internal representations in language-model training. To examine these hidden mechanics, we use spectral measurements as practical and operational diagnostics. Using a controlled family of decoder-only models adapted from the modded NanoGPT codebase, we introduce an empirical protocol based on activation covariance and per-sample gradient SVD spectra. This dual view reveals three empirical findings and one mechanistic explanation. First, batch size acts as a latent determinant of representation geometry: runs that reach equal loss settle into systematically distinct activation spectra. Second, the activation covariance tail measured early in training reliably forecasts downstream token efficiency. Third, movement of the activation spectrum head (leading modes), together with gradient spectra, characterizes underlying learning-dynamics changes, separating learning-side architectural improvements from primarily execution-side gains. These predictive and diagnostic signals persist across the 12-, 36-, and 48-layer model tiers. Finally, a mechanistic model proves the main observations and explains how activation covariance spectra correlate with task-aligned feature learning.