Are You Sure? Challenging LLMs Leads to Performance Drops in The FlipFlop Experiment
