Anthropic Launches Cyber Jailbreak Severity Framework for Fable 5 Safeguards

First seen 3 Jul 2026, 04:25 UTC | Sources: Anthropic, Cryptobriefing, Cybersecuritynews, Gbhackers, Letsdatascience, +1 | 82% similarity | 54.9

Article Content

Anthropic has redeployed its AI model, Claude Fable 5, after a temporary suspension caused by a jailbreak vulnerability. The US Commerce Department had enforced export controls after Amazon researchers discovered a method to bypass Fable 5's safeguards. In response, Anthropic introduced a new safety classifier that blocks over 99% of attempts to exploit the vulnerability. The company also proposed a Cyber Jailbreak Severity (CJS) framework, developed with partners including Amazon, Microsoft, and Google, to standardize how jailbreak risks are assessed. The CJS scale runs from CJS-0 (Informational) to CJS-4 (Critical) and rates jailbreaks on four factors: capability gain, breadth of capability gain, ease of weaponization, and discoverability. The framework is intended to give AI developers, security teams, and regulators a consistent way to communicate about jailbreak risks. Anthropic has also launched a HackerOne bug-bounty program to encourage researchers to report potential cyber jailbreaks in Fable 5.
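To make the shape of the scale concrete, here is a minimal Python sketch of how a team might record a CJS rating. It relies only on what the summary above states: five levels from CJS-0 to CJS-4, the endpoint labels "Informational" and "Critical", and the four evaluation factors. The class, function, and field names are hypothetical, the labels for CJS-1 through CJS-3 are not given here and are left unnamed, and no scoring rule is implied, since the summary does not say how the factors map to a level.

```python
from dataclasses import dataclass
from enum import IntEnum


class CJSLevel(IntEnum):
    """Five severity levels, CJS-0 through CJS-4 (hypothetical class name)."""
    CJS_0 = 0
    CJS_1 = 1
    CJS_2 = 2
    CJS_3 = 3
    CJS_4 = 4


# Only the endpoint labels appear in the summary; the others are left unset.
KNOWN_LABELS = {
    CJSLevel.CJS_0: "Informational",
    CJSLevel.CJS_4: "Critical",
}


def describe(level: CJSLevel) -> str:
    """Render a level as 'CJS-n', adding its label when one is known."""
    name = f"CJS-{int(level)}"
    label = KNOWN_LABELS.get(level)
    return f"{name} ({label})" if label else name


@dataclass(frozen=True)
class JailbreakAssessment:
    """One assessed jailbreak; the four note fields mirror the four factors.

    The level is assigned by a human reviewer in this sketch. Nothing here
    derives it from the factor notes.
    """
    level: CJSLevel
    capability_gain: str
    breadth_of_capability_gain: str
    ease_of_weaponization: str
    discoverability: str

    def summary(self) -> str:
        return describe(self.level)


if __name__ == "__main__":
    # Placeholder notes only; these are not real findings.
    example = JailbreakAssessment(
        level=CJSLevel.CJS_4,
        capability_gain="reviewer notes",
        breadth_of_capability_gain="reviewer notes",
        ease_of_weaponization="reviewer notes",
        discoverability="reviewer notes",
    )
    print(example.summary())  # CJS-4 (Critical)
```

Using an ordered integer enum lets levels be compared directly (for example, `CJSLevel.CJS_4 > CJSLevel.CJS_1`), which suits a scale that runs from least to most severe.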

Key Points:
• Anthropic redeployed Fable 5 on July 1, 2026, after a 19-day suspension due to a jailbreak.
• The new Cyber Jailbreak Severity framework categorizes jailbreak risks into five levels from CJS-0 to CJS-4.
• A HackerOne program is available for researchers to report vulnerabilities in Fable 5.

ThreatCluster AI

Timeline

2026-06-12
US government enforces export controls on Fable 5
The US Commerce Department suspended global access to Fable 5 after a jailbreak vulnerability was discovered.
aiweekly.co
2026-07-01
Fable 5 redeployed globally
Access to Fable 5 was restored after implementing a new safety classifier that blocks over 99% of jailbreak attempts.
Cryptobriefing
2026-07-02
Anthropic publishes Cyber Jailbreak Severity framework
The framework aims to standardize how AI jailbreak risks are assessed across the industry.
Letsdatascience
2026-07-03
Anthropic launches HackerOne bug-bounty program
The program encourages researchers to report potential cyber jailbreaks in Fable 5 for review.
Anthropic
