Anthropic Launches Bug Bounty for AI Safety Defenses

Severity: Medium (Score: 51.9)

Sources: www.anthropic.com

Summary

Anthropic has initiated a new bug bounty program to identify vulnerabilities in its AI safety measures, particularly universal jailbreaks that could bypass safety protocols. Run in partnership with HackerOne, the program aims to stress-test the Constitutional Classifiers system designed to prevent misuse related to CBRN (chemical, biological, radiological, and nuclear) topics. Participants can earn rewards of up to $25,000 for verified vulnerabilities. This initiative follows a previous bug bounty program and is invite-only, running from May 10 to May 18, 2026. The focus is on ensuring that AI models meet the AI Safety Level-3 (ASL-3) Deployment Standard under Anthropic's Responsible Scaling Policy. The program invites experienced researchers to apply, with plans to expand participation in the future.

Key Points

  • Anthropic's new bug bounty program targets vulnerabilities in AI safety measures.
  • Rewards of up to $25,000 are offered for verified universal jailbreaks related to CBRN topics.
  • The program runs from May 10 to May 18, 2026, and is currently invite-only.

Key Entities

  • anthropic.com (domain)
  • Financial (industry)