AITopics

We introduce a principled probabilistic framework for reward-guided decoding in large language models, addressing the limitations of standard decoding methods that optimize token-level likelihood rather than sequence-level quality. Our method defines a reward-augmented target distribution over complete sequences by combining model transition probabilities with prefix-dependent reward potentials. Importantly, the approach is training-free: it leaves model weights unchanged and instead modifies the inference distribution via reward potentials, with all gains arising purely from inference-time sampling. To sample from this distribution, we develop Sequential Monte Carlo algorithms, including a computationally efficient prefix-only variant and a lookahead variant whose intermediate targets match the exact marginals of the full sequence distribution. The framework also integrates resample-move updates with Metropolis-Hastings rejuvenation and supports block-wise generation, subsuming common decoding strategies such as temperature sampling and power-tempered objectives. Empirical results across three 7B models show significant gains. On code generation (HumanEval), our method improves base performance by up to 54.9% and surpasses the strongest sampling baselines by 9.1%-15.3%. On mathematical reasoning (MATH500), it achieves gains of up to 8.8%. Notably, it reaches 87.8% on HumanEval and 78.4% on MATH500 with Qwen2.5-7B, consistently outperforming the reinforcement learning method GRPO.
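As a rough illustration of the idea in this abstract, the sketch below runs a toy Sequential Monte Carlo sampler in the spirit of the prefix-only variant: particles are extended with a base model, reweighted by the ratio of prefix potentials, and resampled when the effective sample size drops. Everything here is an assumption made for illustration and is not the paper's implementation: `base_probs` is a uniform stand-in for a language model, `potential` is a made-up reward that favours the token "a", and the resampling threshold is arbitrary.

```python
import math
import random

VOCAB = ["a", "b"]

def base_probs(prefix):
    # Hypothetical stand-in for an LLM's next-token distribution (uniform here).
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def potential(prefix, beta=2.0):
    # Hypothetical prefix reward potential: favours prefixes with many "a" tokens.
    return math.exp(beta * prefix.count("a"))

def smc_decode(num_particles=200, length=5, seed=0):
    """Toy SMC targeting p(x) * potential(x) over full sequences."""
    rng = random.Random(seed)
    particles = [[] for _ in range(num_particles)]
    weights = [1.0] * num_particles
    for _ in range(length):
        for i, prefix in enumerate(particles):
            # Propose the next token from the base model.
            probs = base_probs(prefix)
            tok = rng.choices(list(probs), weights=list(probs.values()))[0]
            new_prefix = prefix + [tok]
            # Incremental weight: ratio of new to old prefix potential.
            weights[i] *= potential(new_prefix) / potential(prefix)
            particles[i] = new_prefix
        total = sum(weights)
        norm = [w / total for w in weights]
        # Effective sample size; resample when it falls below half the particles.
        ess = 1.0 / sum(w * w for w in norm)
        if ess < num_particles / 2:
            picked = rng.choices(particles, weights=norm, k=num_particles)
            particles = [list(p) for p in picked]
            weights = [1.0] * num_particles
    total = sum(weights)
    return particles, [w / total for w in weights]
```

Because the weights telescope, each particle's final weight is proportional to the potential of its full sequence, so the weighted particles approximate the reward-augmented target without touching the base model's parameters.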

large language model, machine learning, reinforcement learning, (18 more...)

2604.16453

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.54)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

FUSE: Ensembling Verifiers with Zero Labeled Data

Lee, Joonhyuk, Ma, Virginia, Zhao, Sarah, Nair, Yash, Spector, Asher, Cohen, Regev, Candès, Emmanuel J.

Verification of model outputs is rapidly emerging as a key primitive for both training and real-world deployment of large language models (LLMs). In practice, this often involves using imperfect LLM judges and reward models since ground truth acquisition can be time-consuming and expensive. We introduce Fully Unsupervised Score Ensembling (FUSE), a method for improving verification quality by ensembling verifiers without access to ground truth correctness labels. The key idea behind FUSE is to control conditional dependencies between verifiers in a manner that improves the unsupervised performance of a class of spectral algorithms from the ensembling literature. Despite requiring zero ground truth labels, FUSE typically matches or improves upon semi-supervised alternatives in test-time scaling experiments with diverse sets of generator models, verifiers, and benchmarks. In particular, we validate our method on both conventional academic benchmarks such as GPQA Diamond and on frontier, unsaturated benchmarks such as Humanity's Last Exam and IMO Shortlist questions.

large language model, machine learning, natural language, (21 more...)

2604.18547

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)
Asia > Middle East > Lebanon (0.04)
Asia > China (0.04)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Prior-Fitted Functional Flow: In-Context Generative Models for Pharmacokinetics

Ojeda, César, Hartung, Niklas, Huisinga, Wilhelm, Jahn, Tim, Kavwele, Purity Kamene, Klose, Marian, Kumar, Piyush, Sánchez, Ramsés J., Faroughy, Darius A.

We introduce Prior-Fitted Functional Flows, a generative foundation model for pharmacokinetics that enables zero-shot population synthesis and individual forecasting without manual parameter tuning. We learn functional vector fields, explicitly conditioned on the sparse, irregular data of an entire study population. This enables the generation of coherent virtual cohorts as well as forecasting of partially observed patient trajectories with calibrated uncertainty. We construct a new open-access literature corpus to inform our priors, and demonstrate state-of-the-art predictive accuracy on extensive real-world datasets.

large language model, machine learning, trajectory, (20 more...)

2604.1767

Country:

North America > United States (0.14)
Europe > Austria > Vienna (0.14)
Europe > Germany (0.05)

Genre: Research Report > Experimental Study (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Revisiting Active Sequential Prediction-Powered Mean Estimation

Sfyraki, Maria-Eleni, Wang, Jun-Kun

In this work, we revisit the problem of active sequential prediction-powered mean estimation, where at each round one must decide the query probability of the ground-truth label upon observing the covariates of a sample. If the label is not queried, the prediction from a machine learning model is used instead. Prior work proposed an elegant scheme that determines the query probability by combining an uncertainty-based suggestion with a constant probability that encodes a soft constraint on the query probability. We explored different values of the mixing parameter and observed an intriguing empirical pattern: the smallest confidence width tends to occur when the weight on the constant probability is close to one, thereby reducing the influence of the uncertainty-based component. Motivated by this observation, we develop a non-asymptotic analysis of the estimator and establish a data-dependent bound on its confidence interval. Our analysis further suggests that when a no-regret learning approach is used to determine the query probability and control this bound, the query probability converges to the maximum value permitted by the constraint whenever it is chosen obliviously to the current covariates. We also conduct simulations that corroborate these theoretical findings.
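The setup described in this abstract can be sketched in a few lines. The code below assumes the usual inverse-probability-weighted form of a prediction-powered mean estimator, with the query probability formed as a convex mix of an uncertainty-based suggestion and a constant. The function name, the parameters `lam` and `const_prob`, and the clipping floor are hypothetical choices for illustration, not taken from the paper.

```python
import random

def active_mean_estimate(xs, ys, predict, uncertainty,
                         lam=0.9, const_prob=0.3, seed=0):
    """Toy active prediction-powered mean estimate (illustrative assumptions)."""
    rng = random.Random(seed)
    total = 0.0
    queried = 0
    for x, y in zip(xs, ys):
        # Mix the uncertainty-based suggestion with a constant probability;
        # lam is the weight on the constant component.
        pi = (1.0 - lam) * uncertainty(x) + lam * const_prob
        pi = min(max(pi, 1e-6), 1.0)  # keep the probability valid and nonzero
        pred = predict(x)
        if rng.random() < pi:
            # Label queried: correct the prediction, reweighted by 1 / pi.
            total += pred + (y - pred) / pi
            queried += 1
        else:
            # Label not queried: fall back on the model prediction alone.
            total += pred
    return total / len(xs), queried
```

Setting `lam` close to one makes the query probability nearly constant, which is the regime where the abstract reports the smallest confidence width.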

large language model, machine learning, natural language, (16 more...)

2604.18569

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
Oceania > New Zealand (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Engadget Apr-20-2026, 18:39:49 GMT

LinkedIn's new Crosscheck feature lets premium subscribers test competing AI models for free

The feature is a blind taste test for AI models from Anthropic, Google, OpenAI and other companies. You can now use LinkedIn to test out some of the latest AI models from OpenAI, Anthropic, Google, Microsoft and other companies without having to worry about token limits or paying for an extra subscription. The professional network is experimenting with a new feature that allows people to test AI platforms' latest offerings within LinkedIn. It's called Crosscheck, and it's rolling out now to anyone with a LinkedIn Premium subscription in the United States. The feature is meant to be a kind of blind taste test for AI models, according to the company's Chief Product Officer Hari Srinivasan.

large language model, machine learning, natural language, (21 more...)

Engadget

Country: North America > United States (0.25)

Industry: Information Technology > Services (0.89)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

The Guardian Apr-18-2026, 14:00:15 GMT

How a fiery attack on Sam Altman's home unfolded

Sam Altman speaks during the BlackRock infrastructure summit on 11 March in Washington DC. Molotov cocktail attack on OpenAI CEO's home comes amid growing discontent against artificial intelligence. In the early hours of 10 April, a man approached the gate of OpenAI CEO Sam Altman's house in San Francisco and hurled a molotov cocktail at the building before fleeing. Federal and California state authorities have charged Moreno-Gama with a range of crimes including attempted arson and attempted murder. His parents issued a statement this week saying that their son had recently suffered a mental health crisis.

large language model, machine learning, natural language, (16 more...)

The Guardian

Country:

North America > United States > District of Columbia > Washington (0.45)
North America > United States > California > San Francisco County > San Francisco (0.38)
Europe > Ukraine (0.06)
(4 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.48)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.36)

Technology:

Information Technology > Communications > Social Media (0.99)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
(2 more...)

Stop asking AI for life advice

Recent studies confirm that you're better off finding a human therapist. Millions of people use AI systems every day, for all kinds of reasons. And it's hard to deny they can be useful at times.

ai system, large language model, natural language, (13 more...)

Popular Science

Country: North America > United States > Illinois (0.05)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Consumer Health (0.70)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.51)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)

arXiv.org Machine Learning Apr-17-2026

Generative Augmented Inference

Lu, Cheng, Wang, Mengxin, Zhang, Dennis J., Zhang, Heng

Data-driven operations management often relies on parameters estimated from costly human-generated labels. Recent advances in large language models (LLMs) and other AI systems offer inexpensive auxiliary data, but introduce a new challenge: AI outputs are not direct observations of the target outcomes, but could involve high-dimensional representations with complex and unknown relationships to human labels. Conventional methods leverage AI predictions as direct proxies for true labels, which can be inefficient or unreliable when this relationship is weak or misspecified. We propose Generative Augmented Inference (GAI), a general framework that incorporates AI-generated outputs as informative features for estimating models of human-labeled outcomes. GAI uses an orthogonal moment construction that enables consistent estimation and valid inference with a flexible, nonparametric relationship between LLM-generated outputs and human labels. We establish asymptotic normality and show a "safe default" property: relative to human-data-only estimators, GAI weakly improves estimation efficiency under arbitrary auxiliary signals and yields strict gains whenever the auxiliary information is predictive. Empirically, GAI outperforms benchmarks across diverse settings. In conjoint analysis with weak auxiliary signals, GAI reduces estimation error by about 50% and lowers human labeling requirements by over 75%. In retail pricing, where all methods access the same auxiliary inputs, GAI consistently outperforms alternative estimators, highlighting the value of its construction rather than differences in information. In health insurance choice, it cuts labeling requirements by over 90% while maintaining decision accuracy. Across applications, GAI improves confidence interval coverage without inflating width. Overall, GAI provides a principled and scalable approach to integrating AI-generated information.

information, large language model, machine learning, (22 more...)

2604.14575

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas (0.04)
North America > United States > California (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Robots can't replace guide dogs

Man's best friend shares an 'invisible care world' with humans that AI can't beat--yet. Guide dogs are highly trained and can help people with vision loss navigate the world, open doors, and more. On paper, few physical jobs seem as ripe for AI takeover as that of the loyal service dog.

large language model, natural language, physics popular science video space, (12 more...)

Popular Science

Country:

North America > United States > New York (0.05)
Europe > Finland > Southwest Finland > Turku (0.05)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Consumer Health (0.97)
Health & Medicine > Therapeutic Area (0.90)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)

arXiv.org Machine Learning Apr-15-2026

Discrete Flow Maps

Potaptchik, Peter, Yim, Jason, Saravanan, Adhi, Holderrieth, Peter, Vanden-Eijnden, Eric, Albergo, Michael S.

The sequential nature of autoregressive next-token prediction imposes a fundamental speed limit on large language models. While continuous flow models offer a path to parallel generation, they traditionally demand expensive iterative integration. Flow Maps bypass this bottleneck by compressing generative trajectories into single-step mappings, theoretically enabling the generation of full text sequences from noise in a single forward pass. However, standard formulations rely on Euclidean regression losses that are geometrically ill-suited for discrete data. In this work, we resolve this conflict with Discrete Flow Maps, a framework that reconciles trajectory compression with the geometry of the probability simplex. We recast standard flow map training for the discrete domain, aligning the training dynamics with the discrete nature of language. Empirically, this strict geometric alignment allows our method to surpass previous state-of-the-art results in discrete flow modeling.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

2604.09784

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Syria (0.04)
North America > United States > New York > Kings County > New York City (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)