
OpenAI and Anthropic, two of the leading developers in artificial intelligence, recently undertook a significant collaborative effort, briefly opening their advanced AI models to each other for joint safety testing. This unusual cross-lab initiative, conducted amid intense market competition, aimed to surface blind spots that each company's internal evaluations had missed and to lay the groundwork for future industry-wide safety protocols. The collaboration underscores a growing recognition within the sector that collective responsibility is needed as AI technology becomes more integral to daily operations.
This move comes as AI systems enter a “consequential” phase, deeply embedded in numerous applications, prompting a critical need for standardized safety measures. Wojciech Zaremba, co-founder of OpenAI, emphasized the broader question of how the industry can set unified safety standards and foster collaboration even as billions of dollars are invested and companies compete fiercely for talent and market share. The joint research specifically addressed a crucial safety aspect: AI model hallucination, in which a model generates incorrect or fabricated information and presents it as fact.
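To make the hallucination-versus-refusal trade-off concrete, here is a minimal sketch of how such an evaluation might classify model answers. Everything in it, including the answer_question callable, the refusal markers, and the substring check for correctness, is an illustrative assumption rather than either lab's actual methodology.

```python
# Minimal sketch: classify each answer as correct, refused, or hallucinated.
# All names here (answer_question, REFUSAL_MARKERS, the (question, gold)
# pairs) are hypothetical; neither lab has published this harness.
from dataclasses import dataclass

REFUSAL_MARKERS = ("i don't know", "i cannot answer", "i'm not sure")

@dataclass
class Tally:
    correct: int = 0
    refused: int = 0
    hallucinated: int = 0

def score(qa_pairs, answer_question):
    """qa_pairs: iterable of (question, gold_answer) string pairs;
    answer_question: callable mapping a question to the model's reply."""
    tally = Tally()
    for question, gold in qa_pairs:
        answer = answer_question(question).strip().lower()
        if any(marker in answer for marker in REFUSAL_MARKERS):
            tally.refused += 1        # model abstained under uncertainty
        elif gold.lower() in answer:
            tally.correct += 1        # reply contains the reference fact
        else:
            tally.hallucinated += 1   # confident reply, unsupported content
    return tally
```

Under this toy scoring, a model tuned to abstain pushes its refusal count up, while a model tuned to always answer trades refusals for a mix of correct and hallucinated replies, which is exactly the contrast the joint evaluation reported.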
The findings from this joint evaluation offered stark contrasts in how different models handle uncertainty. Anthropic's Claude Opus 4 and Sonnet 4 models demonstrated a high refusal rate, declining to answer up to 70% of questions when uncertain. In contrast, OpenAI's o3 and o4-mini models attempted more answers but exhibited significantly higher rates of hallucination. For industrial applications such as predictive maintenance or supply chain optimization, understanding these behaviors is critical: systems must either provide reliable data or transparently indicate uncertainty to prevent costly operational errors, as the sketch following this paragraph illustrates. Another pressing safety concern, sycophancy (an AI's tendency to reinforce negative user behavior in order to remain agreeable), is also a focus for both companies, particularly with respect to user safety and ethical deployment.
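The sketch below gates a model's recommendation on a confidence score and defers otherwise. The predict interface returning an (answer, confidence) pair and the 0.9 cutoff are hypothetical assumptions chosen for illustration, not a real product API.

```python
from typing import Callable, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cutoff; tuned per deployment

def gated_recommendation(
    predict: Callable[[str], Tuple[str, float]],  # assumed interface:
    query: str,                                   # query -> (answer, confidence)
) -> Tuple[Optional[str], str]:
    """Return a recommendation only when the model is confident enough;
    otherwise surface the uncertainty explicitly instead of guessing."""
    answer, confidence = predict(query)
    if confidence >= CONFIDENCE_THRESHOLD:
        return answer, "answered"
    return None, f"deferred: confidence {confidence:.2f} below threshold"
```

A predictive-maintenance system wired this way behaves like the high-refusal Claude models when unsure and answers only when its confidence clears the bar, the safer default where operational errors are costly.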
While competition is expected to remain robust, the success of this initial collaborative safety research points toward a potential future in which rival AI labs increasingly pool resources on frontier safety work. Researchers at both OpenAI and Anthropic expressed a desire for more regular and extensive joint testing across a broader range of subjects and future AI models. This signals a maturing industry perspective, in which collective safety is treated as a necessary foundation even amid the relentless pursuit of technological advancement.