Anthropic Reverses Policy That Might Have ‘Undermined’ AI Researchers Using Claude

Anthropic is reconsidering a policy that would have discreetly restricted competitors from using its new AI model, Claude Fable 5, to create other AI models. The shift follows considerable criticism from the AI research community.
âWeâre revising Fable 5âs safeguards for frontier LLM development to ensure they are transparent,â Anthropic stated in a communication to WIRED. âWe acknowledged our miscalculation and apologize for not achieving the appropriate balance.â
Earlier this week, Anthropic introduced Claude Fable 5, which includes enhanced safety measures intended to prevent misuse. Some of the safeguards were expected: the company indicated it would redirect users who inquired about cybersecurity, biology, or chemistry to a less advanced AI model to lower the likelihood of someone employing the powerful AI for cyberattacks or bioweapon development.
However, for researchers aiming to utilize Claude Fable 5 for frontier AI advancement, Anthropic had a different plan. The firm intended to subtly diminish the modelâs performance in ways that users couldn’t perceive. This strategy would effectively undermine researchers attempting to use Claude for training competing AI models, which Anthropic explicitly prohibits in its terms of service.
Anthropic has now announced a change in direction, stating that the safeguards for AI development within Claude Fable 5 will be apparent to users. If the company suspects a user is trying to leverage Claude to build a highly capable AI, it will inform them that itâs either denying the request or directing them to a less powerful model.
The reversal of policy came after intensive backlash from the AI research community. Although Anthropic had already enacted measures to restrict competitors from using Claude in developing closed- and open-source AI models, critics argue that secretly degrading the modelâs performance for specific users crossed a line. Claudeâs coding capability has gained popularity among developers, especially those engaged in open-source AI research initiatives, and researchers have indicated to WIRED that the new policy could have led to a concerning scenario where only a few leading AI labs could engage in advanced AI research.
Dean Ball, a senior fellow at the Foundation for American Innovation and a former White House AI advisor, expressed in a post on X that âdegrading performance on ML research *without informing the user* is shockingly antagonistic and reflects poorly.â He further commented in another post that the âcovert sabotageâ policy contradicts Anthropicâs overall mission by hindering collaboration among AI researchers on safety measures.
âIt seemed like Anthropic was conveying to the public, âWe donât trust anyone else to conduct AI research. We are the only ones who should be doing AI research,ââ says Will Brown, research lead at the open-source AI startup Prime Intellect. âIt feels somewhat like theyâre beginning to pull the ladder up behind them.â
Brown noted that the policy would have also left developers unaware of potential violations of Anthropicâs rules, as the company wouldnât notify them when its safeguards were activated. He added that these restrictions could have far-reaching implications. For instance, he referenced the increasing network of third-party evaluation firms that assess frontier models for safety, performance, and reliabilityâtasks that could have been hampered if Anthropic quietly degraded its model.
