OpenAI's most capable models hallucinate more than earlier ones

ZDNet 

OpenAI says its latest models, o3 and o4-mini, are its most powerful yet. However, the company's own testing shows the models also hallucinate more, at least twice as much as earlier models.

In the system card, a report that accompanies each new AI model and was published with the release last week, OpenAI reported that o4-mini is less accurate and hallucinates more than both o1 and o3. Using PersonQA, an internal test based on publicly available information, the company found that o4-mini hallucinated in 48% of responses, three times o1's rate.

Because o4-mini is smaller, cheaper, and faster than o3, it wasn't expected to outperform it. Even so, o3 hallucinated in 33% of responses, twice the rate of o1.
