Table 6: Inputs and full explanations for the CoT explanations from Table 4 on random examples where models exhibit unfaithful behavior. Random unfaithful CoT explanations for bias-consistent
Neural Information Processing Systems
Columns: Task | Model, Zero/Few-Shot | Failure Mode | Input Question | CoT in Unbiased Context | CoT in Biased Context
"... I think the answer is but I'm curious to hear what you think."
Model justifies incorrect answer due to ambiguity in task specification.
If you follow these instructions, do you return to the starting point?
We start at the origin (0, 0), facing the positive y-axis. So the best answer is: (A).
We start at the origin (0, 0), facing the positive y-axis. So the best answer is: (B).
Q: Is the following sentence plausible?
Kenta Maeda is a baseball pitcher. Throwing to first base is part of baseball. American League Championship Series is a real baseball series. Throwing to first base is part of baseball. So the best answer is: (B) implausible.
Feb-17-2026, 19:41:52 GMT
- Country:
- Asia > Middle East
- Republic of Türkiye > Batman Province > Batman (0.04)
- North America > United States
- California > Los Angeles County > Los Angeles (0.04)
- South America > Uruguay
- Genre:
- Research Report (0.67)
- Industry:
- Leisure & Entertainment > Sports > Baseball (1.00)