Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning

Hen Davidov, Nachshon Cohen, Oren Kalinsky, Yaron Fairstein, Guy Kushilevitz, Ram Yazdi, Patrick Rebeschini

arXiv.org Machine Learning

Large language models (LLMs) using chain-of-thought reasoning often waste substantial compute by producing long, incorrect responses. Abstention can mitigate this by withholding outputs unlikely to be correct. While most abstention methods decide to withhold outputs before or after generation, dynamic mid-generation abstention considers early termination of unpromising reasoning traces at each token position. Prior work has explored empirical variants of this idea, but principled guidance for the abstention rule remains lacking. We present a formal analysis of dynamic abstention for LLMs, modeling abstention as an explicit action within a regularized reinforcement learning framework. An abstention reward parameter controls the trade-off between compute and information. We show that abstaining when the value function falls below this reward strictly outperforms natural baselines under general conditions. We further derive a principled and efficient method to approximate the value function. Empirical results on mathematical reasoning and toxicity avoidance tasks support our theory and demonstrate improved selective accuracy over existing methods.
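The abstract's core rule — abstain as soon as the value function drops below the abstention reward — can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: `step_values` stands in for per-token value estimates, and the threshold plays the role of the abstention reward parameter.

```python
# Hedged sketch of a value-threshold abstention rule: at each token position,
# quit as soon as the estimated value of continuing falls below the fixed
# reward earned by abstaining now. Names and values are illustrative.

def generate_with_abstention(step_values, abstention_reward):
    """Return the step index at which to abstain, or None to generate fully.

    step_values: per-step estimates of the value function V(s_t), i.e. the
        expected reward of continuing from the current reasoning prefix.
    abstention_reward: the fixed reward for abstaining immediately.
    """
    for t, v in enumerate(step_values):
        if v < abstention_reward:  # continuing is worth less than quitting
            return t
    return None

# Example: value estimates decay as a reasoning trace goes off the rails.
print(generate_with_abstention([0.9, 0.7, 0.4, 0.1], abstention_reward=0.5))
```

Under this rule the trade-off between compute and information is governed entirely by the single reward parameter: raising it makes the model quit earlier and more often.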


Biodegradable wash keeps grapes fresh for 2 weeks at room temperature

Popular Science

While rinsing really does help clean fruits and vegetables of harmful pesticides and bacteria, washing produce with water alone doesn't ensure a longer shelf life or guard against decay. With millions of pounds of fresh food wasted annually in the United States alone, agricultural researchers at the University of British Columbia (UBC) in Canada are investigating new ways to extend freshness and rid produce of unwanted pesticides. The estimated commercial cost is also comparable to existing industry rinses.


One-Step Score-Based Density Ratio Estimation

Wei Chen, Qibin Zhao, John Paisley, Junmei Yang, Delu Zeng

arXiv.org Machine Learning

Density ratio estimation (DRE) is a useful tool for quantifying discrepancies between probability distributions, but existing approaches often involve a trade-off between estimation quality and computational efficiency. Classical direct DRE methods are usually efficient at inference time, yet their performance can seriously deteriorate when the discrepancy between distributions is large. In contrast, score-based DRE methods often yield more accurate estimates in such settings, but they typically require many repeated function evaluations and numerical integration. We propose One-step Score-based Density Ratio Estimation (OS-DRE), a partly analytic and solver-free framework designed to combine these complementary advantages. OS-DRE decomposes the time score into spatial and temporal components, representing the latter with an analytic radial basis function (RBF) frame. This formulation converts the otherwise intractable temporal integral into a closed-form weighted sum, thereby removing the need for numerical solvers and enabling DRE with only one function evaluation. We further analyze approximation conditions for the analytic frame, and establish approximation error bounds for both finitely and infinitely smooth temporal kernels, grounding the framework in existing approximation theory. Experiments across density estimation, continual Kullback-Leibler and mutual information estimation, and near out-of-distribution detection demonstrate that OS-DRE offers a favorable balance between estimation quality and inference efficiency.
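The key move in the abstract — turning an intractable temporal integral into a closed-form weighted sum via an analytic RBF frame — can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the OS-DRE method itself: it assumes the temporal component is a weighted mix of Gaussian RBFs, whose time integrals are analytic in terms of the error function, so the integral reduces to one erf term per basis function with no quadrature loop at inference time.

```python
import math

# Hedged sketch of the "analytic frame" idea: if a temporal component g(t)
# is represented as a weighted sum of Gaussian RBFs, its integral over [0, 1]
# is a closed-form weighted sum of erf terms, so no numerical solver is
# needed. Weights, centers, and bandwidth below are illustrative only.

def rbf(t, c, h):
    """Gaussian RBF centered at c with bandwidth h."""
    return math.exp(-((t - c) ** 2) / (2.0 * h * h))

def integral_rbf(c, h, t0=0.0, t1=1.0):
    """Closed-form integral of the Gaussian RBF over [t0, t1] via erf."""
    s = h * math.sqrt(2.0)
    return h * math.sqrt(math.pi / 2.0) * (math.erf((t1 - c) / s) - math.erf((t0 - c) / s))

def temporal_integral(weights, centers, h):
    """Integral of the RBF mixture over [0, 1]: one analytic term per basis."""
    return sum(w * integral_rbf(c, h) for w, c in zip(weights, centers))

# Cross-check the analytic sum against brute-force Riemann quadrature.
w, c, h = [0.5, -0.2, 1.0], [0.1, 0.5, 0.9], 0.2

def mix(t):
    return sum(wi * rbf(t, ci, h) for wi, ci in zip(w, c))

analytic = temporal_integral(w, c, h)
numeric = sum(mix(k / 10000) for k in range(10000)) / 10000
print(abs(analytic - numeric) < 1e-3)
```

The point of the comparison is the cost model: the quadrature baseline evaluates the mixture ten thousand times, while the analytic sum needs only one closed-form term per basis function.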


A New Game Turns the H-1B Visa System Into a Surreal Simulation

WIRED

Inspired by real immigrant stories, H1B.Life captures the uncertainty, trade-offs, and pure luck that shape the lives of people trying to build a future in the US. When Allison Yang moved to the US from China two years ago, she noticed that immigrants often talked about their visa status like they were playing cards. The former Chinese journalist and founder of the game studio Reality Reload was at an event in New York when she heard fellow Chinese immigrants talking in confusing terminology, as if playing a Queen, Knight, or Ace. Everyone introduced themselves with terms like H-1B, OPT, L-1, O-1, NIW--names of legal immigration categories in the US. With their cards on the table, they could start talking in greater depth about each person's immigration journey.


Good Semi-supervised Learning That Requires a Bad GAN

Neural Information Processing Systems

Semi-supervised learning methods based on generative adversarial networks (GANs) have obtained strong empirical results, but it is not clear 1) how the discriminator benefits from joint training with a generator, and 2) why good semi-supervised classification performance and a good generator cannot be obtained at the same time. Theoretically, we show that given the discriminator objective, good semi-supervised learning indeed requires a bad generator, and we propose the definition of a preferred generator. Empirically, we derive a novel formulation based on our analysis that substantially improves over feature-matching GANs, obtaining state-of-the-art results on multiple benchmark datasets.
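The discriminator objective the abstract refers to is commonly the (K+1)-class trick used in GAN-based semi-supervised learning: real data is classified into one of K classes, and generated samples into an extra (K+1)-th "fake" class, whose logit can be fixed at zero without loss of generality. The sketch below is a minimal, dependency-free illustration of that decision rule, with made-up logits; it is not the paper's training code.

```python
import math

# Hedged sketch of the (K+1)-class discriminator used in GAN-based
# semi-supervised learning: K real-class logits plus an implicit fake class
# whose logit is fixed at 0. A sample is judged "real" when the summed
# probability of the K real classes dominates. Logits below are illustrative.

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def prob_real(class_logits):
    """P(real) under the (K+1)-class head: append fake logit 0, sum K reals."""
    p = softmax(class_logits + [0.0])
    return sum(p[:-1])

print(round(prob_real([2.0, 1.0, 0.5]), 3))    # confident real-class logits
print(round(prob_real([-2.0, -3.0, -2.5]), 3))  # weak logits lean "fake"
```

The paper's analysis starts from exactly this kind of objective: if the generator matched the true data distribution perfectly, the fake class would carry no learning signal for the K real classes, which is why a deliberately imperfect ("bad") generator helps.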


Differentiable Learning of Logical Rules for Knowledge Base Reasoning

Neural Information Processing Systems

We study the problem of learning probabilistic first-order logical rules for knowledge base reasoning. This learning problem is difficult because it requires learning the parameters in a continuous space as well as the structure in a discrete space. We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model. This approach is inspired by a recently-developed differentiable logic called TensorLog [5], where inference tasks can be compiled into sequences of differentiable operations. We design a neural controller system that learns to compose these operations. Empirically, our method outperforms prior work on multiple knowledge base benchmark datasets, including Freebase and WikiMovies.
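The TensorLog-style compilation the abstract builds on can be shown concretely: each relation becomes an adjacency matrix over entities, and applying a rule body is a chain of matrix-vector products starting from a one-hot query vector. The toy below (entities and the grandparent rule are my own illustrative example, not from the paper) shows how the rule grandparent(X, Z) ← parent(X, Y) ∧ parent(Y, Z) reduces to two applications of the same matrix.

```python
# Hedged sketch of TensorLog-style differentiable inference, the machinery
# Neural LP composes with a learned controller: relations are adjacency
# matrices, and chaining a rule body is a sequence of matrix-vector products.
# Entities and facts are a made-up toy knowledge base.

def matvec(M, v):
    """Apply relation matrix M to an entity distribution v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Entities: 0 = Ann, 1 = Bob, 2 = Cat.
# parent[y][x] = 1 iff y is a parent of x: Bob is Ann's parent, Cat is Bob's.
parent = [[0, 0, 0],
          [1, 0, 0],
          [0, 1, 0]]

ann = [1, 0, 0]  # one-hot query vector for Ann

# grandparent(X, Z) <- parent(X, Y) AND parent(Y, Z): apply `parent` twice.
grandparent_of_ann = matvec(parent, matvec(parent, ann))
print(grandparent_of_ann)  # one-hot on entity 2 (Cat)
```

In Neural LP these hard 0/1 matrices become soft attention-weighted mixtures of relation operators, so both which relations appear in a rule (structure) and their weights (parameters) are learned end-to-end by gradient descent.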



Imagine Losing Your Job to the Mere Possibility of AI

The Atlantic - Technology

The technology may not be ready to replace workers, but that isn't stopping execs from pushing forward anyway. Late last month, at an event in Washington, D.C., Andrew Yang delivered a bleak message. "I have bad news, America," he told the crowd. The Fuckening is the name that Yang, a former presidential candidate, has given to AI's disembowelment of the workforce. As he sees it, millions of knowledge workers will soon lose their jobs, personal-bankruptcy rates will spike, and entire downtowns will turn vacant as offices hollow out.


This AI Tool Will Tell You to Stop Slacking Off

WIRED

Fomi watches you work, then scolds you when your attention wanders. It's helpful, but there are privacy issues to consider. I've tested a lot of software tools over the years designed to block distractions and keep you focused. None of them work perfectly, mostly because of context. Reddit, for example, is something I should generally avoid during the workday, so I tend to block it--this is a good decision for me overall.