Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification
