Are You Sure? Challenging LLMs Leads to Performance Drops in The FlipFlop Experiment