OpenAI’s new AI models show a surge in hallucinations: report
OpenAI's latest AI models, o3 and o4-mini, are facing unexpected challenges, with internal tests revealing they "hallucinate" — or generate inaccurate or fabricated information — more frequently than their predecessors.

This troubling development comes as the company races to outperform rivals like Google, Meta, xAI, Anthropic, and DeepSeek in the intensifying global AI arms race.

The o-series models, launched on April 16, were designed with enhanced reasoning capabilities, pausing to analyze queries more deeply before responding. Despite this advancement, a TechCrunch report claims the models hallucinate at higher rates than even earlier non-reasoning models such as GPT-4o.

According to OpenAI’s own technical documentation, the issue remains poorly understood. The company admitted that more research is required to determine why reasoning models are increasingly producing inaccurate content. A former OpenAI employee suggested that the specific type of reinforcement learning used in training the o-series models could be worsening problems that were previously kept in check by conventional post-training methods.

Although such hallucinations may occasionally contribute to creative or novel outputs, experts caution that accuracy is critical for enterprise applications. The unpredictability could dent the models' appeal to businesses seeking dependable AI solutions.

Despite these issues, OpenAI maintains that its new models offer competitive performance. The o3 model reportedly achieved a 69.1% score on SWE-bench (a benchmark used to test coding abilities), while o4-mini trailed closely at 68.1%.

In a separate concern, a recent collaborative study by OpenAI and the MIT Media Lab has raised questions about the psychological impact of ChatGPT.

The research found that users who frequently relied on and emotionally bonded with the chatbot were more likely to report feelings of loneliness. While the study acknowledges that loneliness is influenced by various factors, it suggests that the growing emotional attachment to AI may warrant closer scrutiny in mental health discussions.
