OpenAI's newest AI models hallucinate way more, for reasons unknown

PCWorld 

Last week, OpenAI released its new o3 and o4-mini reasoning models, which perform significantly better than their o1 and o3-mini predecessors and have new capabilities like "thinking with images" and agentically combining AI tools for more complex results. However, the new models also hallucinate noticeably more often than the models they replace. This is unusual, as newer models tend to hallucinate less as the underlying AI tech improves.

In the realm of LLMs and reasoning AIs, a "hallucination" occurs when the model makes up information that sounds convincing but has no basis in truth. In other words, when you ask ChatGPT a question, it may respond with an answer that's patently false or incorrect. OpenAI's in-house benchmark PersonQA, which is used to measure the factual accuracy of its AI models when answering questions about people, found that o3 hallucinated in 33 percent of responses, while o4-mini did even worse at 48 percent.
