Anthropic Releases Claude Opus 4.8 with Improved Honesty Features
Anthropic has released Claude Opus 4.8, a new AI model that the company is positioning as significantly more "honest" than previous versions.
The company is targeting a well-documented problem with AI systems: the tendency to jump to conclusions and present outputs as fact even when the underlying evidence is weak. Anthropic says it has always trained its models "to be honest," including by instructing them to avoid claims that the evidence cannot support.
Early testers report that Opus 4.8 is more likely than earlier models to flag uncertainties in its own work rather than present shaky conclusions as confident answers. In Anthropic's internal evaluations, the new model is approximately four times less likely than its predecessor to make unsupported claims.
While the company did not disclose the specific technical changes behind this improvement, the release is part of an ongoing effort to address a persistent criticism of large language models: their tendency toward hallucination and overconfident assertions.