Srsly Risky Biz: Anthropic Lacks Emotional Intelligence

Your weekly dose of Seriously Risky Business news is written by Tom Uren and edited by Patrick Gray and Amberleigh Jack. This week's edition is sponsored by Ent AI.

You can hear a podcast discussion of this newsletter by searching for "Risky Business News" in your podcatcher or subscribing via this RSS feed.

Photo by Colton Duke on Unsplash

The stoush between Anthropic and the US government has erupted once again, this time over concerns about how the release of new AI models is being managed.

Early last week, Anthropic rolled out two new models, Mythos 5 and Fable 5. By Friday, they'd been pulled.

The Wall Street Journal reported their withdrawal was kicked off by conversations on Thursday last week between Amazon CEO Andy Jassy and US officials, including Treasury Secretary Scott Bessent. Jassy raised the possibility that the models could be jailbroken and by Friday evening the Commerce Department told Anthropic that its models would be subject to export controls. These controls prohibit the models from being used by any foreign national, regardless of whether they are inside or outside of the US. 

To ensure compliance, Anthropic cut off access to the two models for all of its customers.

The role of jailbreaks in all of this is important to understand. 

Mythos 5 and Fable 5 are actually the same underlying model. In essence, Anthropic slapped some stricter guardrails on Mythos 5, changed its name to Fable and made it available for general use. 

If a user submitted a prompt to Fable related to cyber security, chemistry, biology, or model distillation, the guardrails would kick the query down to Opus 4.8, a less-capable model. Mythos 5 has drastically fewer safeguards, but is only available to Project Glasswing participants, a select group of cyber defenders.

At first glance, this release strategy makes sense as a way to manage the risks of deploying advanced models. However, as Risky Business Technology Editor James Wilson points out in this (terrific) solo podcast, the strategy relies on these guardrails actually being effective. If Fable's guardrails were susceptible to a universal jailbreak, then all Mythos 5 capabilities would be available to anyone with access to Fable 5.
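The release strategy described above can be pictured as a topic router sitting in front of a single model. The Python sketch below is purely illustrative: the keyword lists, the `flagged_topic` and `route` functions, and the model labels are assumptions made up for this example, not a description of how Anthropic's guardrails actually work (a real system would presumably use trained classifiers rather than keyword matching).

```python
from typing import Optional

# Hypothetical keyword lists standing in for a real topic classifier.
RESTRICTED_TOPICS = {
    "cyber security": ("exploit", "malware", "vulnerability"),
    "chemistry": ("synthesis", "reagent"),
    "biology": ("pathogen", "toxin"),
    "model distillation": ("distill",),
}

# Placeholder labels, not real model identifiers.
STRONG_MODEL = "full-capability-model"
FALLBACK_MODEL = "less-capable-model"


def flagged_topic(prompt: str) -> Optional[str]:
    """Return the first restricted topic the prompt appears to touch, if any."""
    text = prompt.lower()
    for topic, keywords in RESTRICTED_TOPICS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def route(prompt: str) -> str:
    """Send flagged prompts to the fallback model, everything else to the strong one."""
    return FALLBACK_MODEL if flagged_topic(prompt) is not None else STRONG_MODEL
```

In this toy version, a request phrased without any flagged keyword goes straight to the full-capability model. That is the failure mode a universal jailbreak would exploit at scale: the guardrail only protects what it recognises.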

So, to reassure everyone, when Anthropic announced Fable 5, it described the model's guardrails as "cautious… and stricter than ideal". It also said the UK AI Safety Institute had not found a universal jailbreak in over 1,000 hours of testing, although it did concede "it is likely impossible to completely prevent universal jailbreaks".

As an additional control, Anthropic instituted a 30-day retention policy for all data sent to Mythos and Fable, saying the data would help them defend against novel and complex attacks, including jailbreaks. So far, so good.

Where things went wrong was that when Jassy raised concerns with the government, which in turn raised them with Anthropic, the company blew them off. An official also told Axios last week that the administration had tried to get Anthropic to delay releasing its new models, but Anthropic declined. In its statement about suspending access to Fable and Mythos, however, Anthropic said:

To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours. 

This doesn't seem like a sensible comms approach to us, especially when Anthropic's name is already mud in the White House. 

Anthropic can't seem to help itself from ignoring the Trump administration's preferences and making itself an enemy of the White House for absolutely no reason. Repeatedly. 

The company is already on a Pentagon blacklist for being a supply chain risk. Now it's also been slapped with a Commerce Department export ban because its models are considered too dangerous for foreign use.

This latest dispute is obviously not good for the company. Lawmakers are also concerned about the White House's (non)process, as are Anthropic's competitors, who worry that similar, quasi-arbitrary measures will be imposed on them. There are also concerns from cybersecurity experts that this ban will hinder defensive efforts by preventing defenders from using the latest and greatest models to improve software security.

In an ideal world, formal, process-driven assessments of new models and their safeguards would be used to weigh up whether the benefits of new releases outweigh risks before governments made big decisions like this. That seems like a better approach than relying on Amazon's CEO making phone calls as a system for managing serious national security risks. 

However, meticulous collection of data followed by sober analysis of risks and benefits isn't really this administration's style. So don't expect much to change on the government side of this.

On the Anthropic side, however, something can be done. Rather than fighting the administration, the company needs to learn how to manage upwards. Throwing data at administration officials, or expert testimony, even if it entirely supports Anthropic's argument, won't make a difference. That's just not what is important to key Trump administration officials. Here's an idea: If they ask you to reassess your guardrails, just say yes!

Anthropic also needs to shut up about how dangerous its models are. Anthropic's core messaging seems to be that its models will unleash a jobs armageddon and cyber apocalypse. Governments do not like jobs armageddons, nor cyber apocalypses. Zero stars, Anthropic! Zero stars!

The great irony here is that the state of AI is changing fast, and it's somewhat likely that we'll have open weight models sitting on our desks with similar capabilities to Mythos 5 in a year or two. At that point, debates about export controls and guardrails will seem quaint.

But for now at least, Anthropic needs to focus a bit less on artificial intelligence and a little more on emotional intelligence.

Pulte Nomination Risks America's Best Intel Source

Last week, the law that authorised what is known as Section 702 collection lapsed without reauthorisation for the first time since the statute was enacted in 2008.  

The intelligence collection authorised by this section of the Foreign Intelligence Surveillance Act (FISA) has been described as the "crown jewel" of US surveillance programs.  

Although the authorisation has lapsed, the FISA court has certified existing collection orders through to March 2027, so it seems intelligence gathering under 702 will continue until then. That's not a certainty, however, as some lawmakers fear that companies compelled by the law may challenge the intelligence collection in court.

Section 702 renewal has traditionally involved a bit of back and forth, and this year's attempt has already involved two extensions, one until April 30 and then a second one until Friday June 12.

Back in April we wrote that some of the difficulty in reauthorising Section 702 this time round came down to "a simple lack of trust in the administration".  

That wasn't helped at all by President Donald Trump naming close ally Bill Pulte as acting Director of National Intelligence after Tulsi Gabbard announced that she would be retiring at the end of this month. Pulte has no national security experience but does have a track record of attacking Trump's perceived political enemies. 

The possibility that Pulte could end up as DNI permanently was a showstopper. Per NPR:

In an interview with NPR's Morning Edition, Sen. Mark Warner, the top Democrat on the chamber's intel committee, said "he's extraordinarily unqualified, but the timing could also not be more of a mistake." Hakeem Jeffries, the top House Democrat, described Pulte as a "political hack" and "malignant clown."

Republicans were also concerned. Senate Majority Leader John Thune told reporters, "we don't need a weaponised DNI" and that "we need professionals there".

On Thursday last week, President Trump changed tack and nominated Jay Clayton, a former head of the Securities and Exchange Commission, to serve as Director of National Intelligence. Senator Warner described Clayton as "a capable public servant".

The Senate intended to fast-track Clayton's confirmation process but, in late-breaking news, President Trump kyboshed that plan by delaying Clayton's nomination hearings. This means that Pulte will necessarily act as DNI for at least several weeks.

So for now, it looks like the Section 702 intelligence collection will continue in an ok-but-far-from-ideal state. We're not holding our breath for Pulte to be dumped and Clayton confirmed, but assume that once that happens we'll go back to situation normal: a never-ending clown show of lawmaker arguments that end in temporary reauthorisations. 

Watch James Wilson and Tom Uren discuss this edition of the newsletter:

Three Reasons to Be Cheerful This Week:

  1. Outsider Enterprise Phishing-as-a-service takedown: The FBI, Google, and Lumen have collaborated on taking down Outsider Enterprise, a Chinese phishing-as-a-service platform. The Bureau says the service used over 8,000 phishing domains, and is estimated to have stolen more than 3.8 million credit cards, causing US$1.9 billion in losses. Google has further detail about the takedown and the laws it is advocating for to combat scams.
  2. Closer cyber security ties with Ukraine: The European Union has approved the addition of Ukraine to its EU Cybersecurity Reserve. That means the country will now be able to access emergency support from the European Union Agency for Cybersecurity (ENISA) to respond to large-scale cyber security incidents.  
  3. Russian emails quarantined by Estonia: The Estonian government announced that its public sector will quarantine all emails from .ru domains for additional security checks. The Minister for Justice and Digital Affairs, Liisa Pakosta, said that ".ru emails are increasingly being used in various cyberattacks". 

In this Risky Business sponsor interview, Catalin Cimpanu talks with Brandon Dixon, co-founder and CTO of Ent AI, about the company's innovative use of local LLMs to track user behavior on the endpoint and add context to suspicious events to detect or prevent malicious activity.

Risky Biz Talks

You can find the audio edition of this newsletter and other fine podcasts and interviews in the Risky Biz News feed (RSS, iTunes or Spotify).  

In our latest "Between Two Nerds" discussion, Tom Uren and The Grugq talk about how NATO is set up to deter conventional conflict, and how that approach is fundamentally unsuited to ongoing, everyday cyber operations that are intended to confound adversaries.

Or watch it on YouTube!

From Risky Bulletin:

China arrests members of Silver Fox cybercrime group: Chinese police have arrested 67 suspects linked to Silver Fox, the country's largest and most active cybercrime group targeting its domestic audiences.

Arrests took place across five provinces and targeted everyone from developers to phishing site operators and various affiliates.

Authorities identified a man named Ji Moufei as the main individual who wrote and sold the group's malware, the eponymous Silver Fox trojan. Ji and four associates were arrested in Zhejiang.

Twenty-eight others were also arrested in Jilin province, including a man named Chen, described as having developed a variant of the group's trojan.

[more on Risky Bulletin]

Arch Linux supply chain attack spreads to 1,900+ AUR packages: More than 1,900 Arch Linux packages were hijacked over the weekend as part of a massive supply chain attack designed to infect users with a rootkit and a credentials harvester.

The attacker(s) targeted Arch Linux packages hosted on the AUR portal, an unofficial repository of Arch packages created by the community. The portal hosts a massive 100,000 entries, but almost a tenth have been abandoned by their maintainers in what AUR calls "orphaned packages."

The attack exploited an AUR mechanism that allowed the hacker to "adopt" the abandoned packages and become a maintainer.

[more on Risky Bulletin]

In the age of AI, CISA changes federal patching rules: The US Cybersecurity and Infrastructure Security Agency (CISA) issued a new binding operational directive (BOD) this week that updates the patching rules for federal civilian agencies.

The new order cites the rise of AI-automated attacks as the main reason to prioritize bugs based on the risk they pose to federal networks and shorten patching deadlines.

The order introduces a new decision tree (pictured in the linked article) that will prioritize vulnerabilities that are exploited in the wild, are easy to exploit and automate, and grant broad access to a system if they have been exploited.
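To make the three criteria concrete, here is a minimal Python sketch of how a patch queue could be ordered by them. It is not CISA's actual decision tree: the `Vulnerability` class, the placeholder identifiers, and the choice to rank the criteria in the order they are listed above are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Vulnerability:
    identifier: str            # placeholder label, not a real CVE ID
    exploited_in_wild: bool
    easy_to_automate: bool
    grants_broad_access: bool


def priority_key(vuln: Vulnerability) -> Tuple[bool, bool, bool]:
    """Assumed ordering: in-the-wild exploitation first, then ease, then access."""
    return (vuln.exploited_in_wild, vuln.easy_to_automate, vuln.grants_broad_access)


def prioritize(vulns: List[Vulnerability]) -> List[Vulnerability]:
    """Return vulnerabilities with the most urgent first."""
    return sorted(vulns, key=priority_key, reverse=True)


queue = prioritize([
    Vulnerability("EXAMPLE-A", False, True, True),
    Vulnerability("EXAMPLE-B", True, False, False),
    Vulnerability("EXAMPLE-C", True, True, False),
])
ordered_ids = [v.identifier for v in queue]
```

Under this assumed ordering, anything already exploited in the wild jumps ahead of bugs that are merely easy to exploit, regardless of how much access they grant.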

[more on Risky Bulletin]