The US Government Just Unplugged Anthropic's Best Models. Here's Why That's Absurd.

The US Government Just Unplugged Anthropic's Best Models. Here's Why That's Absurd.

Anthropic spent the better part of two years telling us it was different. Not like the other labs. Safety-first. Responsible. The AI company that actually cared whether the thing it built killed you. And for a while, you could almost believe it — if you squinted past the billion-dollar funding rounds and the slightly cultish deployment philosophy and the fact that their entire brand identity is essentially "we're building something that might end civilisation, but we promise we feel bad about it."

This week, that careful positioning collided head-on with the United States government, and the result is one of the more remarkable documents you will read in the AI space this year.


Here is what happened, in brief.

Earlier this week, Anthropic launched Claude Fable 5 and Claude Mythos 5. Both models descended from Claude Mythos Preview, the highly restricted frontier model Anthropic had been quietly running through something called Project Glasswing — a small cohort of trusted research partners given access to a model specifically designed to find security vulnerabilities. Mozilla alone, according to reports, resolved a significant number of previously unknown vulnerabilities using the system.

When Fable 5 launched publicly, it came loaded with guardrails. Heavy ones. So heavy, in fact, that Anthropic almost immediately started walking them back after developers complained the thing was too restricted to be useful. There is a very particular type of AI company embarrassment that comes from building a safety system so aggressive it renders the product useless, and Anthropic got to experience that in real time.

But none of that matters now, because at 5:21pm Eastern Time on June 12th, the Commerce Department dropped a letter in Anthropic's inbox. Commerce Secretary Howard Lutnick wrote to Dario Amodei informing him that Fable 5 and Mythos 5 would be subject to export controls — not just internationally, but to "foreign persons within the country," which is a phrase that does a lot of quiet work. As Axios reported, the trigger was a claim from another company that it had managed to jailbreak Mythos. The administration, apparently alarmed, acted.

Hours later, Anthropic announced it was disabling both models for every customer, everywhere, immediately, to ensure compliance. Not just Mythos — which was always restricted — but Fable, the publicly available one that hundreds of millions of people had access to. Gone.


Now, here is where it gets interesting.

Anthropic did not go quietly. Their statement is worth reading in full, because it is one of the more pointed things a major AI company has published about a government action. The company noted that the directive arrived with no specific details about the national security concern. Their own review of what they believe to be the jailbreak in question found that it involved asking a model to read a codebase and fix software flaws — a technique which, in their assessment, "is widely available from other models (including OpenAI's GPT-5.5), and is used every day by the defenders who keep systems safe."

Let that land for a moment. Anthropic is telling the US government, in a public statement, that the capability they are terrified of is already sitting in OpenAI's products, which are presumably not being pulled.

They go further. The company states plainly that no tester has found a universal jailbreak — one that broadly bypasses all safeguards. What has apparently been demonstrated is a narrow, non-universal technique. Anthropic's position is that perfect jailbreak resistance is not achievable for any model, that the industry knows this, and that their "defense in depth" strategy was specifically designed to account for the existence of narrow bypasses while containing any damage they could cause.

In other words: the government may have just forced a company to recall a product because it has a limitation that every comparable product also has.


I want to be careful here, because "the US government is wrong" is an easy take that lands differently depending on your priors. And there are people who will read this and think: good, frontier AI models capable of finding security vulnerabilities should face government scrutiny, full stop. That is not an unreasonable position. The question of who gets to use tools that can accelerate both attack and defence in the cyber domain is genuinely complicated.

But the mechanism here is what should trouble you, regardless of where you stand on AI regulation.

Anthropic was given a directive at 5:21pm with no specific disclosure of the national security concern. The company complied — because legally they had no choice — but published a statement saying they believe it is based on a misunderstanding. They provided a technical rebuttal. They named a competitor's model doing the same thing. And they said, clearly, that if this standard were applied across the industry, it would "essentially halt all new model deployments for all frontier model providers."

That last point is the one worth sitting with. Because this is not a case of a model being caught doing something genuinely dangerous that its peers cannot do. It appears, on Anthropic's account, to be a case of a model being caught doing something that its peers can do and do routinely, and being singled out for it.


Regular readers will remember that we covered the Mythos Preview launch back in April, and the curious tension at the heart of Anthropic's whole project: a company that built its brand on safety concerns finding itself in the business of deploying a model specifically designed to exploit vulnerabilities. The argument then, as now, was that the same capability that enables offence also enables defence — and that defenders need better tools than attackers, urgently.

That argument has not gotten any weaker. But it is now being adjudicated not by technical experts or security researchers, but by the Commerce Department, on the basis of a jailbreak demonstration that Anthropic says was benign and non-novel.

There is also an uncomfortable political layer here. The Trump administration previously tried to stop Anthropic from releasing Mythos and Fable before launch, and failed. This export control directive, arriving days after the public launch, reads — at minimum — as a second attempt. Whether that is regulatory overreach or appropriate national security caution probably depends on who you voted for, and you can make that call yourself.


What I keep coming back to is the banality of the mechanism.

Anthropic built serious safety infrastructure. They ran thousands of hours of red-teaming. They instituted data retention policies that cost them customers. They built a model that, by their account, has substantially stronger safeguards than anything previously deployed. And the commercial result of all of that, at least for now, is that their best products are offline, their customers are disrupted, and the company is writing public statements trying to explain to a government department that the thing they are afraid of is already widely available from a competitor in San Francisco.

The AI safety narrative always assumed the risk was the model going rogue, or being used by a sophisticated adversary with serious resources and intent. It did not account for the possibility that the more immediate threat to a well-intentioned AI deployment was a letter from a cabinet secretary who was told, by someone, that a jailbreak existed.

Anthropic says it is working to restore access as soon as possible. Given the speed at which these things tend to move in Washington, I wouldn't plan your afternoon around it.


Sources: 9to5Mac · Anthropic Statement