AITopics | Country

Price of Quality: Sufficient Conditions for Sparse Recovery using Mixed-Quality Data

arXiv.org Machine Learning, May-12-2026

We study sparse recovery when observations come from mixed-quality sources: a small collection of high-quality measurements with small noise variance and a larger collection of lower-quality measurements with higher variance. For this heterogeneous-noise setting, we establish sample-size conditions for information-theoretic and algorithmic recovery. On the information-theoretic side, we show that it is sufficient for $(n_1, n_2)$ to satisfy a linear trade-off defining the Price of Quality: the number of low-quality samples needed to replace one high-quality sample. In the agnostic setting, where the decoder is agnostic to the quality of the data, the price of quality is uniformly bounded; in particular, one high-quality sample is never worth more than two low-quality samples for this sufficient condition to hold. In the informed setting, where the decoder is informed of per-sample variances, the price of quality can grow arbitrarily large. On the algorithmic side, we analyze the LASSO in the agnostic setting and show that the recovery threshold matches the homogeneous-noise case and depends only on the average noise level, revealing a striking robustness of computational recovery to data heterogeneity. Together, these results give the first conditions for sparse recovery with mixed-quality data and expose a fundamental difference between how the information-theoretic and algorithmic thresholds adapt to changes in data quality.

artificial intelligence, data quality, machine learning, (19 more...)

arXiv.org Machine Learning

2605.10713

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Quality (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)


What should post-training optimize? A test-time scaling law perspective

Li, Muheng, Qian, Jian, Mou, Wenlong

arXiv.org Machine Learning, May-12-2026

Large language models are increasingly deployed with test-time strategies: sample $N$ responses, score them with a reward model or verifier, and return the best. This deployment rule exposes a mismatch in post-training: standard objectives optimize the mean reward of a single response, whereas best-of-$N$ performance is governed by the upper tail of the reward distribution. Recent test-time-aware objectives partly address this mismatch, but typically assume that training can use the same per-prompt rollout budget as deployment, which is impractical when post-training must cover many prompts while deployment can allocate much larger per-prompt test-time compute. We study this budget-mismatch regime, where only $m\ll N$ per-prompt rollouts are available during training but the target objective is best-of-$N$ deployment. Under structural assumptions on the reward tails, we show that the policy gradient of the best-of-$N$ objective can be approximated from a much smaller rollout group by extrapolating upper-tail statistics. This yields a family of Tail-Extrapolated estimators for best-of-$N$-oriented post-training: a simple direct estimator, Tail-Extrapolated Advantage (TEA), and a fixed-order debiased Prefix-TEA estimator based on moment cancellation. Experiments on instruction-following tasks show that TEA and Prefix-TEA improve best-of-$N$ performance across different language models, reward models and datasets under various training and test-time budget settings.
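
The mismatch between mean reward and best-of-$N$ reward described in this abstract can be illustrated with a toy sketch. This is not the paper's TEA or Prefix-TEA estimator; the helper names `best_of_n` and `estimate_best_of_n_reward`, and the two toy reward distributions, are assumptions made up for illustration. Both toy policies have mean reward 0.5, yet their best-of-8 rewards differ sharply because only one of them has an upper tail.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def best_of_n(responses: Sequence[T], reward: Callable[[T], float]) -> T:
    """Return the highest-reward response (the first one wins ties)."""
    if not responses:
        raise ValueError("need at least one response")
    return max(responses, key=reward)


def estimate_best_of_n_reward(
    sample_reward: Callable[[], float], n: int, trials: int = 2000
) -> float:
    """Monte Carlo estimate of E[max of n i.i.d. rewards]."""
    total = 0.0
    for _ in range(trials):
        total += max(sample_reward() for _ in range(n))
    return total / trials


# Two hypothetical policies with the same mean reward (0.5):
rng = random.Random(0)
flat_policy = lambda: 0.5                                  # no upper tail
spiky_policy = lambda: 1.0 if rng.random() < 0.5 else 0.0  # heavy upper tail

flat_bo8 = estimate_best_of_n_reward(flat_policy, n=8)
spiky_bo8 = estimate_best_of_n_reward(spiky_policy, n=8)
# Analytically, the spiky policy's best-of-8 expectation is 1 - 0.5**8.
print(f"flat best-of-8:  {flat_bo8:.3f}")
print(f"spiky best-of-8: {spiky_bo8:.3f}")
```

An objective that only tracks the single-response mean cannot tell these two policies apart, which is the gap that test-time-aware objectives aim to close.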

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2605.10716

Country:

North America (0.27)
Asia (0.27)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Factual recall in linear associative memories: sharp asymptotics and mechanistic insights

Giorlandino, Alessio, Goldt, Sebastian, Maillard, Antoine

arXiv.org Machine Learning, May-12-2026

Large language models demonstrate remarkable ability in factual recall, yet the fundamental limits of storing and retrieving input-output associations with neural networks remain unclear. We study these limits in a minimal setting: a linear associative memory that maps $p$ input embeddings in $\mathbb{R}^d$ to their corresponding $d$-dimensional targets via a single layer, requiring each mapped input to be well separated from all other targets. Unlike in supervised classification, this strict separation induces $p$ constraints per association and produces strong correlations between constraints that make a direct characterisation of the storage capacity difficult. Here, we provide a precise characterisation of this capacity in the following way. We first introduce a decoupled model in which each input has its own independent set of competing outputs, and provide numerical and analytical evidence that this decoupled model is equivalent to the original model in terms of storage capacity, spectra of the learnt weights, and storage mechanism. Using tools from statistical physics, we show that the decoupled model can store up to $p_c \log p_c / d^2 = 1 / 2$ associations, and generalise the computation of $p_c$ to linear two-layer architectures. Our analysis also gives mechanistic insight into how the optimal solution improves over a naïve Hebbian learning rule: rather than boosting input-output alignments with broad fluctuations, the optimal solution raises the correct scores just above the extreme-value threshold set by the competing outputs. These findings give a sharp statistical-physics characterisation of factual storage in linear networks and provide a baseline for understanding the memory capacity of more realistic neural architectures.
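
To get a feel for the capacity relation $p_c \log p_c / d^2 = 1/2$ quoted in this abstract, the sketch below solves it numerically for $p_c$ at a given embedding dimension $d$. Two assumptions are made here and are not confirmed by the abstract: the logarithm is taken to be natural, and the asymptotic relation is evaluated at finite $d$ purely for illustration. The function name `capacity` and its `alpha` parameter are hypothetical.

```python
import math


def capacity(d: int, alpha: float = 0.5) -> float:
    """Solve p * ln(p) = alpha * d**2 for p by bisection.

    Assumes a natural logarithm and treats the asymptotic relation
    as if it held exactly at finite d (illustration only).
    """
    if d < 1:
        raise ValueError("d must be a positive integer")
    target = alpha * d * d
    # p * ln(p) is increasing for p >= 1, equals 0 at p = 1,
    # and exceeds target at p = target + e, so the root is bracketed.
    lo, hi = 1.0, target + math.e
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for d in (32, 100, 1000):
    p_c = capacity(d)
    print(f"d = {d:5d}  ->  p_c ~ {p_c:,.0f}  (p_c / d^2 = {p_c / d**2:.4f})")
```

Because of the logarithmic factor, the ratio $p_c / d^2$ shrinks slowly as $d$ grows, so under this relation the number of storable associations scales slightly slower than $d^2$.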

cit, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2605.10795

Country: Europe (0.92)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Ilya Sutskever Stands by His Role in Sam Altman's OpenAI Ouster: 'I Didn't Want It to Be Destroyed'

WIRED, May-11-2026, 23:51:35 GMT

The former OpenAI chief scientist may be estranged from the company, but he still came to its defense as he testified on Monday. Elon Musk's trial against OpenAI and Microsoft entered its final stretch on Monday, with testimony from Microsoft CEO Satya Nadella, former OpenAI chief scientist Ilya Sutskever, and current OpenAI chairman Bret Taylor. Sutskever drew the spotlight, revealing an ownership stake in OpenAI's $850-billion for-profit arm that is currently worth about $7 billion. That makes him one of the largest known individual shareholders of OpenAI. Earlier in the trial, OpenAI president Greg Brockman acknowledged for the first time that he has around $30 billion worth of OpenAI shares.

large language model, machine learning, natural language, (16 more...)

WIRED

Country: North America > United States > California (0.14)

Industry:

Information Technology (1.00)
Law > Litigation (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)


How the Trump-Xi summit could set superpower relations for many years to come

BBC News, May-11-2026, 23:20:53 GMT

Security around Beijing's historic Tiananmen Square has been heightened for days, with rumours on social media swirling of a special parade or some big, choreographed event. Preparations for this major event have started with a whisper, but China appears ready to put on a show for US President Donald Trump. The visit will include talks, a banquet, and a visit to the Temple of Heaven, a complex of imperial temples where emperors would pray for a good harvest. And both Trump and Chinese President Xi Jinping will be hoping the visit will bear fruit. This summit between the world's two most powerful leaders is set to be one of the most consequential encounters for years.

artificial intelligence, china, social media, (14 more...)

BBC News

Country:

North America > United States (1.00)
Asia > Middle East > Iran (0.73)
Asia > China > Beijing > Beijing (0.28)

Industry:

Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Asia Government > China Government (1.00)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.68)


Jackie and Shadow's eaglets can now see like their parents

They grow up so fast. High above Big Bear Lake in southern California, a pair of eaglets' eyesight is coming into focus.

artificial intelligence, chick, jackie and shadow, (11 more...)

Popular Science

Country: North America > United States > California (0.73)

Technology: Information Technology > Artificial Intelligence (0.50)


This guy crammed a laptop into an Altoids tin

Yes, it works, if you have small fingers. Fitting everything inside resembles a game of 'Tetris.' Leftover Altoids tins are staple components in all types of handy DIY projects. Once you eat the mints, the aluminum containers routinely house basic first aid kits, miniature speakers, sewing accessories, and even watercolor paints.

artificial intelligence, exercising ingenuity, physics popular science video space, (7 more...)

Popular Science

Country: North America > United States (0.15)

Industry: Health & Medicine > Therapeutic Area (0.30)

Technology: Information Technology > Artificial Intelligence (0.70)


Testing for 'Bad Cholesterol' Doesn't Tell the Whole Story

WIRED, May-11-2026, 18:40:37 GMT

For decades, assessing cholesterol risk has been built around a simple idea: Lower "bad" cholesterol, lower your chance of a heart attack. The test at the center of that approach measures how much low-density lipoprotein, or LDL cholesterol, is circulating in part of the blood. It has shaped everything from clinical guidelines to the widespread use of statins, medications that reduce LDL. Lowering LDL cholesterol reduces heart attacks, strokes, and early death.

artificial intelligence, cholesterol, ldl cholesterol, (12 more...)

WIRED

Country:

Europe (0.47)
North America > United States (0.29)

Genre: Research Report > New Finding (0.70)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.96)

Technology: Information Technology > Artificial Intelligence (1.00)


Google announces its first-ever discovery of a zero-day exploit made with AI

Engadget, May-11-2026, 18:11:07 GMT

We can now add cybercrimes to the list of growing concerns associated with artificial intelligence. Google's Threat Intelligence Group (GTIG) said it discovered, for the first time ever, a threat actor using a zero-day exploit that it believes was developed by AI. Zero-day vulnerabilities are often the most dangerous since they're unknown to the targets, leaving them with zero days to prepare for the attack. Google said in the report the threat actor was planning to use it in a mass exploitation event, but its proactive discovery may have prevented its use. Google added that it doesn't believe its own Gemini models were used, but still has high confidence an AI model was part of discovering the vulnerability and weaponizing an exploit.

artificial intelligence, social media, transportation review smartphone laptop, (9 more...)

Engadget

Country: Asia (0.18)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.76)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Mobile (0.56)
Information Technology > Communications > Social Media (0.44)


Robotaxi drives off from airport with passenger's suitcase

FOX News, May-11-2026, 18:00:40 GMT


artificial intelligence, social media, waymo, (12 more...)

FOX News

Country: North America > United States > California (0.48)

Industry:

Transportation > Ground > Road (1.00)
Media > News (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (0.97)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.67)
