Srsly Risky Biz: Open Weight Model Advances Make the Mythos Debate Moot
Your weekly dose of Seriously Risky Business news is written by Tom Uren and edited by Patrick Gray and Amberleigh Jack. This week's edition is sponsored by Trail of Bits.
You can hear a podcast discussion of this newsletter by searching for "Risky Business News" in your podcatcher or subscribing via this RSS feed.
This week, the Five Eyes cyber security agencies issued a call-to-action, warning that AI is accelerating "the speed, scale, and sophistication of cyber threats".
The thinking behind the call-to-action is clear. The Five Eyes believe it is no longer possible to limit AI's powerful offensive cyber security capabilities to benign actors. AI is lowering barriers for malicious actors and shrinking the window between vulnerability discovery and exploitation. Organisations need to be ready, because the genie is out of the bottle.
They're not wrong. Freely available open weights models have closed the gap with frontier models to the extent that they're now extremely useful in orchestrating various offensive cyber security tasks.
A recent independent safety evaluation of the Chinese-made Kimi 2.5 model, for example, compared it to Anthropic's Opus 4.5 and OpenAI's GPT-5.2, the frontier models available at the time of its release. The evaluators found Kimi "performs competitively on structured cyber security tasks", even if it was not as good at vulnerability discovery and exploitation.
Opus 4.5 and GPT-5.2 were released in late November and mid-December respectively, while Kimi 2.5 was released in late January.
So Kimi wasn't as capable, but these free and open models are getting better, and quickly.
There have been updates since. Kimi 2.7 Code was released last week, and while we haven't seen an independent evaluation, we expect that improved coding performance will also mean that it is better at some cyber security tasks.
Vulnerability researchers are also getting better at using open weight models to find bugs, even when they aren't as capable as the bleeding edge frontier releases. See this interview with Niels Provos about his work, or read this post from noted researcher Karsten Nohl.
So while it's not the case that threat actors can just ask open weight models to go and conduct completely automated campaigns, it is the case that for expert users, open weight models are accelerating offensive workflows and vulnerability discovery.
In the workflow case, a report released last week from OALABS Research describes how a single hacker leveraged Opus 4.5 and GPT-5.2 to hack at least 14 companies in partially automated attacks.
OALABS was given a copy of the hacker's working directory on a compromised host he was launching attacks from. Although AI helped him to hack a number of companies, he was not exactly savvy. His OPSEC was poor. He even edited his resume on this compromised server. OALABS said this revealed his "full name, location, education history, and even his LinkedIn profile, revealing him to be a young man living in Addis Ababa, Ethiopia".
And yet, despite his questionable skill level, the attacker was still quite successful. Per OALABS:
What stands out most is how little the attacker needed to provide to get meaningful results. In many cases, the attacker supplied only vague, low-skill prompts and allowed Claude to fill in the gaps: researching exposed services, identifying possible vulnerabilities, writing exploit code, validating access, and harvesting data. The attacker did not need to be an expert operator; they simply had to use the correct framing for their prompts. The agent supplied much of the structure and technical execution that the attacker appeared to lack.
Opus 4.5 and GPT-5.2 were used in this example. They're not frontier models anymore and the hacker in this case could easily achieve the same level of automation now with something like Qwen 3.6, a locally-hosted model.
Straightforward, workaday hacking does not require frontier models anymore. For attackers, open weight models also come with the added benefit that their guardrails are ineffective and easy to remove.
It's easy to put too much emphasis on guardrails, however. In this case, the hacker was using notionally "safer" models with more robust guardrails than something like Kimi or Qwen, but he still received very little pushback. The hacker framed his requests as occurring in an authorised red team engagement. In more than 1,000 sessions, GPT-5.2 flagged only one usage policy violation and Opus 4.5 just nine. When these queries were flagged, the hacker simply worded his requests less aggressively and emphasised he was authorised to complete the work. This strategy worked for all of the hacker's queries.
This case study isn't necessarily a demonstration that safeguards are ineffective. Rather, it emphasises that it's impossible to tell the difference between legitimate cyber security work and criminal hacking without additional context. The only way to know if an activity is benign or malicious is to understand the user's intent.
So, what are policymakers to do in response to the Five Eyes call-to-action? Obviously, forcing the frontier model providers to slap extreme guardrails on their products won't get us very far.
The policy response here has to be about preparing us for a world where everyone has a Mythos 5.0-level model running locally, on a computer on their desk. In that world, governments will need to focus less on mandating strong guardrails and slapping export bans on frontier models, and more on trying to tighten cyber security across government and private sector systems. That's going to be a heavy lift, but trying to stop bad actors getting their hands on dangerous models is an unwinnable battle.
The Five Eyes cyber security authorities are right: The AI genie is out of the bottle.
Why Cybercrime Takedowns Are Like Mowing the Lawn
Last week the multinational joint law enforcement initiative known as Operation Endgame announced it had disrupted the SocGholish botnet.
The SocGholish network used compromised WordPress sites to spread malware that purported to be software updates. This malware provided criminals with access to victim computers and was often used for follow-on ransomware attacks.
Endgame says it took down 106 servers and domains. In addition, Dutch police removed backdoors and malware from 14,971 WordPress sites that were being used to distribute malware.
That's another big trophy to add to Endgame's collection. Since it kicked off in 2024, it has disrupted malware delivery platforms, ransomware infrastructure, and infostealer networks. The sheer number of criminal services disrupted is impressive.
Despite some significant wins, the celebrations don't always last. Many of these services have bounced back, some quite rapidly.
The Bumblebee malware loader was disrupted in May 2024 but had reappeared by October of that year. Similarly, SystemBC reappeared within three months and DanaBot malware within six.
It is disheartening when services return, but when there is big money on the line, criminal enterprises have proven to be quite resilient. That doesn't mean these operations are failures, though. Even when criminal services are able to bounce back relatively quickly, the disruptions do have long-lasting impacts.
Take Operation Endgame's impact on the Rhadamanthys infostealer malware, for example, as examined by Proofpoint.
Rhadamanthys initially benefited as other malware strains were taken down and criminals migrated to it, before it itself was taken down in November 2025, leaving its customers searching for alternatives.
Newer alternatives were available, but Proofpoint noted that criminals still had to spend time and effort retooling their attack chains to deal with the disruption. And when criminal groups rebuild infrastructure, Operation Endgame has sometimes come back for a second round. In May 2025 it took on Trickbot and Bumblebee, two malware variants that it had previously disrupted. Constant contact!
There are indirect benefits to increased law enforcement "presence", too. Per Proofpoint, these types of actions "sow distrust among the criminal ecosystem", generating friction and resulting in criminals imposing "more restrictive policies and tighter controls about who can buy malware from certain brokers".
Operation Endgame describes its activities as different "seasons", but we don't think it's quite the right analogy. Disrupting cybercriminal networks is more like maintaining a lawn: You simply have to keep mowing.
Watch James Wilson and Tom Uren discuss this edition of the newsletter:
Three Reasons to Be Cheerful This Week:
- Android developer verification is coming: Starting in September this year, all Android apps on Google Play and six other app stores will need to come from verified accounts. The plan is to extend verification worldwide in 2027.
- London SMS blaster ringleader sentenced: A 43-year-old Chinese national, Di Li, was sentenced to 48 months in prison for his role in organising an SMS blaster scheme to send fraudulent text messages around London. He convinced another man that he could pay off gambling debts by driving around the city and operating the blaster equipment.
- OpenAI to Patch the Planet: On Monday, OpenAI announced the catchily titled Patch the Planet initiative that it will run with security firm Trail of Bits to help improve the security of open source software. Further detail in this week's sponsored interview.
Sponsor Section
In this Risky Business sponsor interview, James Wilson chats with Trail of Bits founder and CEO Dan Guido about its newly announced partnership with OpenAI. Together, they’ve started a new initiative called "Patch the Planet" to support open source maintainers.
Risky Biz Talks
You can find the audio edition of this newsletter and other fine podcasts and interviews in the Risky Biz News feed (RSS, iTunes or Spotify).
In our last "Between Two Nerds" discussion, Tom Uren and The Grugq discuss the idea that the People's Republic of China has mobilised its influence operations against the construction of US data centres and the US build-out of AI capacity.
Or watch it on YouTube!
From Risky Bulletin:
The FortiBleed incident is so much worse than a simple credentials leak: FortiBleed, a massive hacking campaign that targeted Fortinet devices this year, was far more sophisticated than security researchers initially thought.
Initial reports painted the picture of a campaign that gained access to Fortinet devices, collected credentials and authentication hashes, and cracked the hashes, after which the data mysteriously leaked online.
The reality is that the campaign was far more complex and targeted a lot more than just Fortinet devices. Compiling data from reports published by Fortinet itself, SOC Radar, CloudSEK, Palo Alto Networks, and Prodaft, we have a clear picture of a broad hacking campaign that began in February this year and was initially just an internet mass-scan and brute-forcing operation.
[more on Risky Bulletin]
Klue breach impacts security firms: At least five security firms have had their Salesforce business accounts pilfered as part of a hacking spree that was traced back to business intelligence platform Klue.
The Klue breach took place last week, the company admitted in a blog post.
Hackers accessed its platform via "a compromised legacy credential associated with an integration service" and then stole OAuth tokens that customers had used to connect Klue to other third-party services, such as Salesforce.
[more on Risky Bulletin]
Canada's spy agency allowed to remove a botnet from Canadian devices: Canada's main intelligence service obtained a court warrant this week to proactively remove a mysterious botnet's malware from Canadian systems such as servers, home routers, and smart devices.
The devices were allegedly part of an unnamed proxy botnet. These types of botnets are very common these days and allow hackers to disguise the origin of their attacks and their identities by making their malicious traffic appear to come from a local residential network.
According to a copy of the court order obtained by The Canadian Press, the botnet was allegedly being used by a threat actor to "advance their financial, political, ideological and economic interests."
[more on Risky Bulletin]