calibrator
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Diminishing Returns Shape Constraints for Interpretability and Regularization
Maya Gupta, Dara Bahri, Andrew Cotter, Kevin Canini
Similarly, a model that predicts the time it will take a customer to grocery shop should decrease in the number of cashiers, but each added cashier reduces average wait time by less. In both cases, we would like to be able to incorporate this prior knowledge by constraining the machine-learned model's output to have a diminishing returns response to the size of the apartment or number of cashiers.
- North America > United States > New York (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > Quebec > Montreal (0.04)
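The diminishing-returns idea above can be sketched concretely: for a one-dimensional piecewise-linear model, "increasing with diminishing returns" means the segment slopes are non-negative and non-increasing. The following is a minimal illustrative sketch (not the paper's lattice implementation) that projects a slope vector onto that constraint set using the pool-adjacent-violators idea; the function name is hypothetical.

```python
import numpy as np

def project_diminishing_returns(slopes):
    """Project a slope vector onto {s : s >= 0 and s is non-increasing}.

    Non-increasing slopes make the piecewise-linear function concave
    (diminishing returns); non-negative slopes keep it monotone increasing.
    """
    s = np.asarray(slopes, dtype=float)
    # Pool adjacent violators: whenever a later block's mean exceeds an
    # earlier block's mean, the non-increasing constraint is violated,
    # so merge the two blocks and average them.
    merged = [[s[0]]]
    for v in s[1:]:
        merged.append([v])
        while len(merged) > 1 and np.mean(merged[-1]) > np.mean(merged[-2]):
            last = merged.pop()
            merged[-1] = merged[-1] + last
    out = np.concatenate([[np.mean(b)] * len(b) for b in merged])
    return np.maximum(out, 0.0)  # enforce monotonicity last
```

For example, the raw slopes [1, 3, 2, -1] violate both constraints; the projection averages the violating prefix and clips the negative tail, yielding non-increasing, non-negative slopes.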
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning
However, text-davinci-002 is able to benefit more substantially. We further show that explanations generated by the LLMs may not entail the models' predictions nor be factually grounded in the input, even on simple tasks with extractive explanations. However, these flawed explanations can still be useful as a way to verify LLMs' predictions post-hoc.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Louisiana (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Calibration of Shared Equilibria in General Sum Partially Observable Markov Games
We consider a general sum partially observable Markov game where agents of different types share a single policy network, conditioned on agent-specific information. This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets. Parameter sharing with decentralized execution has been introduced as an efficient way to train multiple agents using a single policy network.
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
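The parameter-sharing setup that abstract describes can be sketched as a single network evaluated for any agent by appending agent-specific information to the observation. A minimal illustrative sketch, assuming a one-hot agent type and a tiny MLP (the shapes and function name are hypothetical, not the paper's architecture):

```python
import numpy as np

def shared_policy(obs, agent_type_onehot, w1, w2):
    """One policy network shared by all agent types.

    The agent's type one-hot is concatenated to its observation, so the
    same parameters (w1, w2) produce type-conditioned behavior under
    decentralized execution.
    """
    x = np.concatenate([obs, agent_type_onehot])
    h = np.tanh(w1 @ x)           # hidden layer
    logits = w2 @ h               # one logit per action
    e = np.exp(logits - logits.max())
    return e / e.sum()            # softmax action distribution
```

Each agent runs this forward pass independently at execution time; only the training updates touch the shared parameters.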
MONICA: Real-Time Monitoring and Calibration of Chain-of-Thought Sycophancy in Large Reasoning Models
Jingyu Hu, Shu Yang, Xilin Gong, Hongming Wang, Weiru Liu, Di Wang
Large Reasoning Models (LRMs) suffer from sycophantic behavior, where models tend to agree with users' incorrect beliefs and follow misinformation rather than maintain independent reasoning. This behavior undermines model reliability and poses societal risks. Mitigating LRM sycophancy requires monitoring how this sycophancy emerges during the reasoning trajectory; however, current methods mainly focus on judging based on final answers and correcting them, without understanding how sycophancy develops during reasoning processes. To address this limitation, we propose MONICA, a novel Monitor-guided Calibration framework that monitors and mitigates sycophancy during model inference at the level of reasoning steps, without requiring the model to finish generating its complete answer. MONICA integrates a sycophantic monitor that provides real-time monitoring of sycophantic drift scores during response generation with a calibrator that dynamically suppresses sycophantic behavior when scores exceed predefined thresholds. Extensive experiments across 12 datasets and 3 LRMs demonstrate that our method effectively reduces sycophantic behavior in both intermediate reasoning steps and final answers, yielding robust performance improvements.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
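The monitor-plus-calibrator loop the MONICA abstract describes can be sketched at a high level: score each reasoning step as it is produced and intervene only when the drift score crosses a threshold. This is a hypothetical illustration of the control flow only; `score_step` and `recalibrate` are stand-ins, not MONICA's actual components.

```python
def monitored_generation(steps, score_step, recalibrate, threshold=0.5):
    """Emit reasoning steps one at a time, rewriting any step whose
    sycophantic-drift score exceeds the threshold.

    steps:       iterable of generated reasoning steps (strings)
    score_step:  callable mapping a step to a drift score in [0, 1]
    recalibrate: callable producing a calibrated replacement step
    """
    out = []
    for step in steps:
        score = score_step(step)
        # Intervene mid-trajectory instead of waiting for the final answer.
        out.append(recalibrate(step) if score > threshold else step)
    return out
```

The point of the sketch is that the check runs per step during generation, so a sycophantic drift can be corrected before it propagates into the final answer.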