Price of Quality: Sufficient Conditions for Sparse Recovery using Mixed-Quality Data
Chaabouni, Youssef, Gamarnik, David
We study sparse recovery when observations come from mixed-quality sources: a small collection of high-quality measurements with small noise variance and a larger collection of lower-quality measurements with higher variance. For this heterogeneous-noise setting, we establish sample-size conditions for information-theoretic and algorithmic recovery. On the information-theoretic side, we show that it is sufficient for $(n_1, n_2)$ to satisfy a linear trade-off defining the Price of Quality: the number of low-quality samples needed to replace one high-quality sample. In the agnostic setting, where the decoder is completely agnostic to the quality of the data, the price of quality is uniformly bounded, and in particular one high-quality sample is never worth more than two low-quality samples for this sufficient condition to hold. In the informed setting, where the decoder is informed of per-sample variances, the price of quality can grow arbitrarily large. On the algorithmic side, we analyze the LASSO in the agnostic setting and show that the recovery threshold matches the homogeneous-noise case and only depends on the average noise level, revealing a striking robustness of computational recovery to data heterogeneity. Together, these results give the first conditions for sparse recovery with mixed-quality data and expose a fundamental difference between how the information-theoretic and algorithmic thresholds adapt to changes in data quality.
What should post-training optimize? A test-time scaling law perspective
Li, Muheng, Qian, Jian, Mou, Wenlong
Large language models are increasingly deployed with test-time strategies: sample $N$ responses, score them with a reward model or verifier, and return the best. This deployment rule exposes a mismatch in post-training: standard objectives optimize the mean reward of a single response, whereas best-of-$N$ performance is governed by the upper tail of the reward distribution. Recent test-time-aware objectives partly address this mismatch, but typically assume that training can use the same per-prompt rollout budget as deployment, which is impractical when post-training must cover many prompts while deployment can allocate much larger per-prompt test-time compute. We study this budget-mismatch regime, where only $m\ll N$ per-prompt rollouts are available during training but the target objective is best-of-$N$ deployment. Under structural assumptions on the reward tails, we show that the policy gradient of the best-of-$N$ objective can be approximated from a much smaller rollout group by extrapolating upper-tail statistics. This yields a family of Tail-Extrapolated estimators for best-of-$N$-oriented post-training: a simple direct estimator, Tail-Extrapolated Advantage (TEA), and a fixed-order debiased Prefix-TEA estimator based on moment cancellation. Experiments on instruction-following tasks show that TEA and Prefix-TEA improve best-of-$N$ performance across different language models, reward models and datasets under various training and test-time budget settings.
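The mismatch between mean reward and best-of-$N$ reward described above can be illustrated with a toy calculation. The sketch below is not the paper's estimator: the two reward distributions (`policy_a`, `policy_b`) and the helper `expected_best_of_n` are hypothetical, chosen only to show that a policy with a lower single-sample mean can win under best-of-$N$ when it has a heavier upper tail.

```python
from typing import List, Tuple


def expected_best_of_n(dist: List[Tuple[float, float]], n: int) -> float:
    """Exact E[max of n i.i.d. draws] for a discrete reward distribution.

    dist is a list of (reward, probability) pairs whose probabilities sum to 1.
    Uses P(max <= v) = F(v)**n, where F is the CDF of a single draw.
    """
    total = 0.0
    cdf_prev = 0.0
    for reward, prob in sorted(dist):
        cdf = cdf_prev + prob
        # Probability that the maximum of n draws equals this reward value.
        total += reward * (cdf ** n - cdf_prev ** n)
        cdf_prev = cdf
    return total


# Hypothetical policies (illustrative numbers, not from the paper).
policy_a = [(0.6, 1.0)]              # always a decent response
policy_b = [(0.4, 0.8), (1.0, 0.2)]  # usually worse, occasionally excellent

for n in (1, 16):
    print(n, expected_best_of_n(policy_a, n), expected_best_of_n(policy_b, n))
```

With a single sample, the hypothetical policy A has the higher expected reward (0.6 versus about 0.52). With $N = 16$, policy B reaches about 0.98 while A stays at 0.6, which is the sense in which best-of-$N$ performance depends on the upper tail rather than the mean.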
Factual recall in linear associative memories: sharp asymptotics and mechanistic insights
Giorlandino, Alessio, Goldt, Sebastian, Maillard, Antoine
Large language models demonstrate remarkable ability in factual recall, yet the fundamental limits of storing and retrieving input--output associations with neural networks remain unclear. We study these limits in a minimal setting: a linear associative memory that maps $p$ input embeddings in $\mathbb{R}^d$ to their corresponding $d$-dimensional targets via a single layer, requiring each mapped input to be well separated from all other targets. Unlike in supervised classification, this strict separation induces $p$ constraints per association and produces strong correlations between constraints that make a direct characterisation of the storage capacity difficult. Here, we provide a precise characterisation of this capacity in the following way. We first introduce a decoupled model in which each input has its own independent set of competing outputs, and provide numerical and analytical evidence that this decoupled model is equivalent to the original model in terms of storage capacity, spectra of the learnt weights, and storage mechanism. Using tools from statistical physics, we show that the decoupled model can store up to $p_c \log p_c / d^2 = 1 / 2$ associations, and generalise the computation of $p_c$ to linear two-layer architectures. Our analysis also gives mechanistic insight into how the optimal solution improves over a naïve Hebbian learning rule: rather than boosting input-output alignments with broad fluctuations, the optimal solution raises the correct scores just above the extreme-value threshold set by the competing outputs. These findings give a sharp statistical-physics characterisation of factual storage in linear networks and provide a baseline for understanding the memory capacity of more realistic neural architectures.
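To get a feel for the capacity relation $p_c \log p_c / d^2 = 1/2$, one can solve it numerically for a given embedding dimension. The sketch below makes two assumptions not stated in the abstract: that $\log$ is the natural logarithm, and that the asymptotic relation is applied at finite $d$ purely for illustration. The function name `capacity_estimate` is hypothetical.

```python
import math


def capacity_estimate(d: int) -> float:
    """Solve p * ln(p) = d**2 / 2 for p by bisection.

    Assumes natural log; the relation is asymptotic, so finite-d values
    are only indicative.
    """
    if d < 2:
        raise ValueError("d must be at least 2 for the bracket below to be valid")
    target = d * d / 2.0
    # p * ln(p) is increasing for p > 1/e, and [2, d**2] brackets the root for d >= 2.
    lo, hi = 2.0, max(4.0, float(d * d))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for d in (64, 128, 256):
    print(d, round(capacity_estimate(d)))
```

Under these assumptions, the estimate grows slightly more slowly than $d^2$ because of the logarithmic factor.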
Ilya Sutskever Stands by His Role in Sam Altman's OpenAI Ouster: 'I Didn't Want It to Be Destroyed'
The former OpenAI chief scientist may be estranged from the company, but he still came to its defense as he testified on Monday. Elon Musk's trial against OpenAI and Microsoft entered its final stretch on Monday, with testimony from Microsoft CEO Satya Nadella, former OpenAI chief scientist Ilya Sutskever, and current OpenAI chairman Bret Taylor. Sutskever drew the spotlight, revealing an ownership stake in OpenAI's $850-billion for-profit arm that is currently worth about $7 billion. That makes him one of the largest known individual shareholders of OpenAI. Earlier in the trial, OpenAI president Greg Brockman acknowledged for the first time that he has around $30 billion worth of OpenAI shares.
How the Trump-Xi summit could set superpower relations for many years to come
Security around Beijing's historic Tiananmen Square has been heightened for days, with rumours on social media swirling of a special parade or some big, choreographed event. Preparations for this major event have started with a whisper, but China appears ready to put on a show for US President Donald Trump. The visit will include talks, a banquet, and a visit to the Temple of Heaven, a complex of imperial temples where emperors would pray for a good harvest. And both Trump and Chinese President Xi Jinping will be hoping the visit will bear fruit. This summit between the world's two most powerful leaders is set to be one of the most consequential encounters for years.
Jackie and Shadow's eaglets can now see like their parents
They grow up so fast. Sandy and Luna pictured last week. High above Big Bear Lake in southern California, a pair of eaglets' eyesight is coming into focus.
This guy crammed a laptop into an Altoids tin
Yes, it works--if you have small fingers. Fitting everything inside resembles a game of 'Tetris.' Leftover Altoid tins are staple components in all types of handy, DIY projects. Once you eat the mints, the aluminum containers routinely house basic first aid kits, miniature speakers, sewing accessories, and even watercolor paints.
Testing for 'Bad Cholesterol' Doesn't Tell the Whole Story
So why don't more doctors use it? For decades, assessing cholesterol risk has been built around a simple idea: Lower "bad" cholesterol, lower your chance of a heart attack. The test at the center of that approach measures how much low-density lipoprotein, or LDL cholesterol, is circulating in part of the blood. It has shaped everything from clinical guidelines to the widespread use of statins, medications that reduce LDL. Lowering LDL cholesterol reduces heart attacks, strokes, and early death.
Google announces its first-ever discovery of a zero-day exploit made with AI
We can now add cybercrimes to the list of growing concerns associated with artificial intelligence. Google's Threat Intelligence Group (GTIG) said it discovered, for the first time ever, a threat actor using a zero-day exploit that it believes was developed by AI. Zero-day vulnerabilities are often the most dangerous since they're unknown to the targets, leaving them with zero days to prepare for the attack. Google said in the report the threat actor was planning to use it in a mass exploitation event, but its proactive discovery may have prevented its use. Google added that it doesn't believe its own Gemini models were used, but still has high confidence an AI model was part of discovering the vulnerability and weaponizing an exploit.