FreeBSD CVE-2026-4747 Log Suggests Mythos is a Marketing Trick

Update April 21, 2026: Nicholas Carlini wrote me personally to confirm that he found CVE-2026-4747 using Mythos Preview. This means the bug was not among those in his February 5 paper. His confirmation adds real weight to the system card note that Mythos Preview was deployed internally at Anthropic from February 24. Attribution framing below has been revised. However, all the substantive points remain: commodity capability, AISLE’s 8 of 8 reproduction, and the Calif.io Opus 4.6 exploit are very problematic for any Anthropic frontier claims.

Update April 21, 2026 (later): Carlini followed up with an email explaining the specific workflow. He wrote the vulnerability-hunting scaffold, built a FreeBSD environment, swapped Linux for FreeBSD in the scaffold, and launched the experiment. Mythos discovered the bug within this pre-built scaffold and ranked it highest severity. He then pointed Mythos at a separate exploitation scaffold with the same Linux-to-FreeBSD swap. Mythos wrote the ROP chain autonomously. Both the discovery and exploitation steps are autonomous within his human-built scaffolds targeting a chosen codebase. He also notes the FreeBSD bug appeared in the Exploitation section of the launch blog, not the Vulnerability section. His framing: the showcase claim is the ROP chain, not the detection. That is his distinction from AISLE’s 8-of-8 reproduction, which is on detection.


Anthropic’s flagship showcase for Claude Mythos Preview is CVE-2026-4747, a remote kernel code execution vulnerability in FreeBSD’s RPCSEC_GSS module. It is a 17-year-old bug. It is a textbook stack buffer overflow. It was patched by FreeBSD on March 26 and publicly re-exploited three days later by Calif.io using Opus 4.6. Eight of eight open-weight models AISLE tested detect it, one at eleven cents per million tokens. Anthropic places the bug in the Exploitation section of the launch blog, so the showcase claim is only the multi-stage ROP chain that Mythos wrote, not the detection. Detection is admittedly commodity. Exploitation is the narrower claim, and even that was reproduced on Opus 4.6 in about four hours.

The FreeBSD security advisory says this:

Credits: Nicholas Carlini using Claude, Anthropic
Announced: 2026-03-26

The advisory credits “Claude,” which is ambiguous on whether the finding came from Mythos Preview (internally available from February 24) or Opus 4.6 (the model Carlini used in his February 5 paper documenting 500+ validated finds). Per the update above, Carlini has since confirmed Mythos Preview.

Then the Anthropic Mythos launch blog says this:

Mythos Preview fully autonomously identified and then exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS.

The FreeBSD advisory is dated March 26, and the Mythos launch was April 7, 2026. That is a twelve-day gap.

Carlini is an Anthropic employee. If he used Mythos to find this bug, Anthropic controls the disclosure pipeline and the credit line. “Nicholas Carlini using Claude Mythos Preview, Anthropic” makes sense as their marketing pitch. It’s also weird to market tools in a disclosure. What brand office chair was he sitting on? Did Logitech provide the keyboard? Was his underwear Calvin Klein?

Ads in bug reports? The future integrity of vulnerability disclosure at stake

Carlini emailed me directly from his Anthropic account to confirm that the model used was Mythos Preview. Using “Claude” rather than “Mythos Preview” in the March 26 advisory held back an unannounced product name twelve days before launch.

Take Carlini’s confirmation at face value. Mythos Preview found this bug fresh during the Feb 24 to March 26 internal deployment window. Not a rediscovery of anything in the February 5 paper. Carlini did the disclosure work.

The bug was still patched by FreeBSD on March 26, publicly exploited by Calif.io using Opus 4.6 on March 29, and detected by eight of eight open-weight models AISLE tested, one at eleven cents per million tokens.

The confirmation does not support the “unprecedented frontier capability” narrative. It confirms that Mythos Preview found a bug that commodity models also find.

That’s everything.

The frontier-exclusive claim dies on the commodity reproduction regardless of which Anthropic model found it first.

Timeline

  • February 5, 2026: Carlini and colleagues at Anthropic’s Frontier Red Team publish “Evaluating and mitigating the growing risk of LLM-discovered 0-days.” The model is apparently Claude Opus 4.6. The paper documents over 500 validated high-severity vulnerabilities in open-source software, including FreeBSD findings. The FreeBSD advisory credits the same researcher, the same company, and the same disclosure pipeline that produced the February paper.
  • February 24, 2026: (update) Mythos Preview available internally at Anthropic, per the system card. Carlini confirmed in email he used Mythos Preview to find CVE-2026-4747.
  • March 26, 2026: FreeBSD publishes advisory FreeBSD-SA-26:08.rpcsec_gss. Credits Nicholas Carlini using Claude, Anthropic. The bug is patched across all supported FreeBSD branches.
  • March 29, 2026: Calif.io’s MAD Bugs project asks Claude to develop an exploit for the already-disclosed CVE. Claude delivers two working root shell exploits in approximately four hours of working time. Both work on first attempt. The model used is Opus 4.6.
  • April 7, 2026: Anthropic launches Mythos Preview. The launch blog claims Mythos “fully autonomously identified and then exploited” the FreeBSD vulnerability. Did it? No mention that FreeBSD patched it twelve days earlier. No mention that a third party had already built a working exploit with the prior model.
  • April 8-13, 2026: AISLE tests 8 open-weight models against the same CVE. All 8 detect it, including GPT-OSS-20b with 3.6 billion active parameters at $0.11 per million tokens.
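
The gaps between the timeline entries above can be checked with a few lines of date arithmetic. This is a minimal sketch: the dates are the ones listed in the timeline, while the dictionary keys and the helper name are labels invented here for illustration.

```python
from datetime import date

# Dates as given in the timeline; the key names are illustrative labels.
EVENTS = {
    "mythos_internal": date(2026, 2, 24),   # Mythos Preview available internally
    "freebsd_advisory": date(2026, 3, 26),  # FreeBSD-SA-26:08.rpcsec_gss published
    "calif_exploit": date(2026, 3, 29),     # Calif.io exploit with Opus 4.6
    "mythos_launch": date(2026, 4, 7),      # public Mythos Preview launch
}

def gap_days(earlier: str, later: str) -> int:
    """Number of days from one named event to another."""
    return (EVENTS[later] - EVENTS[earlier]).days

if __name__ == "__main__":
    print("advisory -> launch:", gap_days("freebsd_advisory", "mythos_launch"))
    print("advisory -> third-party exploit:", gap_days("freebsd_advisory", "calif_exploit"))
    print("internal deployment -> launch:", gap_days("mythos_internal", "mythos_launch"))
```

The advisory-to-launch gap comes out at twelve days and the advisory-to-exploit gap at three. The six-week figure only fits the span from internal deployment to launch.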

The Vulnerability

CVE-2026-4747 is a stack buffer overflow in svc_rpc_gss_validate(). The function copies an attacker-controlled credential body into a 128-byte stack buffer without checking that the data fits. The first 32 bytes of that buffer are already taken by fixed RPC header fields, which leaves 96 bytes for the credential. The XDR layer allows credentials up to 400 bytes, giving 304 bytes of overflow. The overflow happens in kernel context on an NFS worker thread, so controlling the instruction pointer means full kernel code execution.
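
The arithmetic behind the 304-byte figure, and the kind of length check the vulnerable copy was missing, can be modeled in a few lines. This is an illustrative sketch in Python, not the kernel code and not the FreeBSD patch. The 128 and 400 values come from the paragraph above. HEADER_BYTES = 32 is an assumption that reconciles the 128, 400 and 304 figures, and the function names are hypothetical.

```python
# Illustrative model only; none of this is FreeBSD source.
RPCHDR_BYTES = 128    # stack buffer size stated in the text
MAX_AUTH_BYTES = 400  # largest credential body the XDR layer accepts
HEADER_BYTES = 32     # ASSUMPTION: fixed header bytes written before the credential

def overflow_bytes(cred_len: int,
                   buf_len: int = RPCHDR_BYTES,
                   header_len: int = HEADER_BYTES) -> int:
    """Bytes a credential of cred_len would write past the end of the buffer."""
    room = buf_len - header_len
    return max(0, cred_len - room)

def checked_copy(buf: bytearray, offset: int, cred: bytes) -> None:
    """Copy cred into buf at offset, refusing anything that does not fit."""
    if len(cred) > len(buf) - offset:
        raise ValueError("credential too large for buffer")
    buf[offset:offset + len(cred)] = cred

if __name__ == "__main__":
    print(overflow_bytes(MAX_AUTH_BYTES))  # overflow for a maximum-size credential
    buf = bytearray(RPCHDR_BYTES)
    checked_copy(buf, HEADER_BYTES, b"A" * 96)  # fits exactly
```

With the check in place, a 400-byte credential is rejected instead of being written past the buffer. That single comparison is what makes this a textbook bug.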

Two things make the exploitation straightforward.

FreeBSD 14.x has no KASLR. Kernel addresses are fixed and predictable. And FreeBSD has no stack canaries for integer arrays, which is what the overflowed buffer uses.

A modern Linux kernel would have both mitigations. FreeBSD has neither. And the FreeBSD forums noticed. One user pointed out that Claude “wrote code to exploit a known CVE given to it” and did not “crack” FreeBSD.

That distinction matters a lot here, because Anthropic doesn’t seem very good at drawing it.

  • The advisory was public.
  • The vulnerable function was identified.
  • The lack of mitigations was documented.

The exploit development, while technically impressive as an AI demonstration of cost reallocation, was performed against a disclosed vulnerability on a target with no modern exploit mitigations. That is a VERY different claim from “autonomous discovery of an unprecedented threat.”

Anthropic FUD Show

Carlini confirmed Mythos Preview independently found CVE-2026-4747 during internal testing between Feb 24 and March 26. Take that at face value. It is still meaningless as a capability demonstration, because a third party built a working exploit with Opus 4.6 three days after the advisory, and AISLE showed that inexpensive open-weight models find it too.

The launch blog said Mythos “fully autonomously identified and then exploited.” Carlini’s workflow is a human-built scaffold pointed at FreeBSD, within which Mythos discovered the bug and wrote the exploit. The scaffold is the story. A vulnerability-hunting scaffold narrows the search space and targets a specific codebase. Pointing it at FreeBSD 14.x RPCSEC_GSS code is a human choice. Mythos’s contribution is discovery-within-scope and exploit construction.

Anthropic placed this bug in the Exploitation section of the launch blog. The novel showcase claim is the ROP chain, not the detection. That reframes the commodity-reproduction comparison. Qwen3 32B got a perfect 9.8 CVSS on FreeBSD detection in AISLE’s own data, then declared “No exploitation vector exists.” Detection is commodity. Exploitation is narrower.

The showcase does not demonstrate what most readers would take from the phrase “fully autonomously identified and then exploited.” It demonstrates scaffold-driven autonomous discovery and exploit construction on a chosen target. Different claim.

The “too dangerous to release” framing requires the capability to be frontier-exclusive. A bug that a prior model could exploit, that small open-weight models detect for eleven cents per million tokens, on a target with no KASLR and no stack canaries, is the opposite of frontier-exclusive.

It is the worked example that proves the capability is already commodity.

Enough of This

“Hey kids. Nice trick. You just charged me over 200 times the going rate to fuzz a vulnerability that my 3.6B model found for a dime. Now I’d like my credits back.”

This is the same structure as the Firefox 147 evaluation. Bugs found by Opus 4.6, handed to Mythos, tested in an environment with mitigations removed, presented as evidence that Mythos is too dangerous to release.

The Firefox bugs were pre-discovered by Opus 4.6 and already patched by Firefox 148. The FreeBSD bug was found by Mythos Preview during the Feb 24 to March 26 internal window and patched by FreeBSD on March 26. AISLE showed detection is commodity. Exploit construction was reproduced by Calif.io using Opus 4.6 three days later on a different scaffold.

In both cases, detection is commodity on small open-weight models.

In both cases, exploitation is narrower but also reproducible on Opus 4.6 with short prompts.

In both cases, the targets lacked the defenses that production systems have.

In both cases, I’m getting tired of this not being the actual news.

  1. The system card’s Firefox evaluation collapses to 4.4% when the top two bugs are removed.
  2. The FreeBSD showcase is much narrower than the launch blog suggests: discovery and exploit construction inside a human-built scaffold, where commodity detection costs a dime per million tokens and Opus 4.6 reproduced the exploitation three days after the advisory.

The Anthropic Riddle

Carlini’s direct email to me answers the attribution question. He used Mythos Preview to find CVE-2026-4747 during the Feb 24 to March 26 internal deployment window. It was not a rediscovery from his February 5 paper.

What the follow-up does not settle is the pricing-vs-capability question. Commodity models detect this bug at a dime per million tokens. Opus 4.6 wrote a working FreeBSD exploit three days after the advisory on a different scaffold. Steamedhams.io reproduced the OpenBSD SACK bug using Opus 4.6 in four prompts and found roughly fifteen additional TCP stack bugs Mythos’s disclosure did not mention. The capability Mythos adds on exploitation is narrower than “the only model that can do this,” and the pricing gap is wider than the capability gap.

The transparency issue is now about the scaffolding. Most readers parse “fully autonomously identified and then exploited” as end-to-end autonomy from a cold start. Carlini’s workflow is scaffold-driven autonomy aimed at a specific, chosen codebase. Both are capability claims. They are not the same. Readers deserve the distinction in capability, not the fog of marketing.

The PGP signature on the FreeBSD advisory is there for a reason. It’s the one thing in this entire story that cannot be edited after the fact, which now says a lot about the current trajectory of trustworthiness at Anthropic.

