OpenAI's o3 and o4-mini hallucinate far more often than previous models