Anthropic's Opus 4.8 Browser Agent Faces 31.5% Hijack Rate Pre-Safeguards

1 Jun 2026 | Sources: VentureBeat, KuCoin

Article Content

Anthropic reported that prompt injection attacks hijacked its Opus 4.8 browser agent 31.5% of the time before safeguards were activated. The figure appears in a 244-page system card released on May 28, 2026, and means that nearly one in three attacks succeeded when the model was exposed to the web without defenses. With safeguards enabled, testing showed the hijack success rate falling to around 1%. By comparison, other AI labs such as OpenAI and Google have published less comprehensive data on prompt injection vulnerabilities.

The findings underline the ongoing security challenge that prompt injection poses, particularly for AI systems that interact with external data sources. The crypto industry, which relies heavily on AI agents, is especially exposed to these vulnerabilities. Security professionals are urged to take these metrics into account when deploying AI systems.

Key Points:
• Anthropic's Opus 4.8 browser agent had a 31.5% hijack rate before safeguards were activated.
• With safeguards enabled, the hijack rate fell to approximately 1%.
• Prompt injection remains a critical security challenge for AI systems.

ThreatCluster AI
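
To put the two reported figures side by side, the short Python sketch below works out the relative reduction and the expected number of successful hijacks per 1,000 injection attempts. It is only a back-of-the-envelope illustration: the function and constant names are hypothetical, and the post-safeguard rate is taken as exactly 1% even though the summary says only "around 1%".

```python
# Back-of-the-envelope arithmetic on the reported hijack rates.
# All names here are illustrative; none come from the system card.

PRE_SAFEGUARD_RATE = 0.315   # 31.5%, as reported in the summary
POST_SAFEGUARD_RATE = 0.01   # "around 1%"; assumed to be exactly 1% here


def relative_reduction(before: float, after: float) -> float:
    """Return the drop in success rate as a fraction of the starting rate."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def expected_hijacks(rate: float, attempts: int) -> float:
    """Return the expected number of successful hijacks over `attempts` tries."""
    return rate * attempts


if __name__ == "__main__":
    reduction = relative_reduction(PRE_SAFEGUARD_RATE, POST_SAFEGUARD_RATE)
    print(f"Relative reduction: {reduction:.1%}")  # Relative reduction: 96.8%

    attempts = 1000
    before = expected_hijacks(PRE_SAFEGUARD_RATE, attempts)
    after = expected_hijacks(POST_SAFEGUARD_RATE, attempts)
    print(f"Expected hijacks per {attempts} attempts: {before:.0f} -> {after:.0f}")
```

Under these assumptions, roughly 315 of every 1,000 injection attempts would succeed without safeguards and about 10 with them. A residual rate near 1% is small but not zero, which matters for agents that process large volumes of untrusted content.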

Timeline

2026-05-28
Anthropic releases Opus 4.8 system card
The 244-page system card details a 31.5% hijack rate for the browser agent before safeguards engage.
VentureBeat

2026-06-01
Multiple articles report on hijack rate
Two KuCoin articles confirm the 31.5% hijack rate and discuss implications for AI security in crypto.
KuCoin

2026-06-01
Post-safeguard testing shows improvement
Testing indicated that post-safeguard hijack rates dropped to around 1%, highlighting the effectiveness of defensive measures.
KuCoin
