Learning from failure to tackle extremely hard problems

AIHub

This blog post is based on the work BaNEL: Exploration Posteriors for Generative Modeling Using Only Negative Rewards. The ultimate aim of machine learning research is to push machines beyond human limits in critical applications, including the next generation of theorem proving, algorithmic problem solving, and drug discovery. A standard recipe involves: (1) pre-training models on existing data to obtain base models, and then (2) post-training them using scalar reward signals that measure the quality or correctness of the generated samples. For the hardest problems, however, this recipe runs into two obstacles. First, the probability of producing a positive-reward sample can be so low that the model may go through most of training without ever encountering a positive reward. Second, calls to the reward oracle can be expensive or risky, requiring costly simulations, computations, or even physical experiments.


BaNEL: Exploration Posteriors for Generative Modeling Using Only Negative Rewards

Lee, Sangyun, Amos, Brandon, Fanti, Giulia

arXiv.org Artificial Intelligence

Today's generative models thrive with large amounts of supervised data and informative reward functions characterizing the quality of the generation. They operate under the assumptions that the supervised data provides knowledge to pre-train the model, and the reward function provides dense information about how to further improve the generation quality and correctness. However, in the hardest instances of important problems, two obstacles arise: (1) the base generative model attains a near-zero reward signal, and (2) calls to the reward oracle are expensive. This setting poses a fundamentally different learning challenge than standard reward-based post-training. To address this, we propose BaNEL (Bayesian Negative Evidence Learning), an algorithm that post-trains the model using failed attempts only, while minimizing the number of reward evaluations (NREs). Our method is based on the idea that the problem of learning the regularities underlying failures can be cast as another, in-loop generative modeling problem. We then leverage this model to assess whether new data resembles previously seen failures and steer the generation away from them. We show that BaNEL can improve model performance without observing a single successful sample on several sparse-reward tasks, outperforming existing novelty-bonus approaches by up to several orders of magnitude in success rate, while using fewer reward evaluations.
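The core loop described in the abstract — fit a generative model to failed attempts, then downweight new candidates that the failure model finds likely — can be illustrated with a deliberately tiny sketch. Everything below is an assumption for illustration: a discrete toy problem with ten candidate "designs" (only one of which succeeds), a count-based empirical density standing in for BaNEL's learned failure model, and an exponential penalty standing in for the paper's exploration posterior. It is not the authors' implementation, only the steering idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (hypothetical): 10 discrete designs; only index 7 earns reward 1.
N = 10
success = 7
base = np.full(N, 1.0 / N)  # base generative model: uniform over designs

# Failure model: empirical counts over observed failed samples. A simple
# count-based density stands in for the in-loop generative model of failures.
fail_counts = np.zeros(N)

for _ in range(5):
    # Steered distribution: penalize designs that resemble past failures.
    fail_density = (fail_counts + 1e-9) / (fail_counts.sum() + 1e-9 * N)
    posterior = base * np.exp(-5.0 * fail_density)
    posterior /= posterior.sum()
    xs = rng.choice(N, size=20, p=posterior)
    for x in xs:
        if x != success:          # reward oracle returns 0 -> record the failure
            fail_counts[x] += 1

# After a few rounds, mass concentrates on designs unlike any observed failure.
fail_density = fail_counts / fail_counts.sum()
posterior = base * np.exp(-5.0 * fail_density)
posterior /= posterior.sum()
print(np.round(posterior, 3))
```

Because the successful design never enters the failure counts, its posterior mass is never penalized, so the model drifts toward it without ever seeing a positive reward — the qualitative behavior the abstract claims, demonstrated here only in a toy discrete setting.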