AI Models Vulnerable to Multi-Turn Attacks, Cisco Study Reveals

Severity: High (Score: 64.5)

Sources: Infosecurity-Magazine, Csoonline, Cybersecuritydive, blogs.cisco.com

Published: 2026-05-27 · Updated: 2026-05-28

Keywords: models, cisco, vulnerable, attacks, leading, researchers, guardrails

Severity indicators: rat

Summary

A Cisco study has found that leading AI models from OpenAI, Anthropic, Google, Amazon, and xAI are significantly more vulnerable to multi-turn attacks than previously claimed. The research evaluated 15 models and revealed that single-turn attack success rates (ASR) do not accurately represent their performance under iterative attacks. For instance, OpenAI's GPT 5.4 had a single-turn ASR of 2.74%, which jumped to 24.68% in multi-turn scenarios. The study highlighted that attackers can exploit models by reframing refusals and decomposing tasks across multiple exchanges. Overall, the multi-turn ASR ranged from 8% to 88%, compared to 2% to 65% for single-turn prompts. This discrepancy poses a significant risk for organizations relying on single-prompt evaluations for security decisions. Cisco's findings call for a reevaluation of how AI model safety is assessed and highlight the need for more comprehensive testing methodologies.

Key Points:
• Leading AI models are more vulnerable to multi-turn attacks than single-turn evaluations suggest.
• Cisco's study found multi-turn attack success rates ranged from 8% to 88%, highlighting significant risks.
• Organizations must reconsider their AI safety assessments to account for iterative attack strategies.
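To make the gap concrete, the GPT 5.4 figures quoted in the summary can be compared directly. The snippet below is a minimal sketch: the helper names `asr` and `asr_gap` are illustrative and not part of Cisco's tooling, and ASR is assumed here to be successful attacks divided by total attempts, expressed as a percentage.

```python
def asr(successes: int, attempts: int) -> float:
    """Attack success rate as a percentage (assumed definition)."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


def asr_gap(single_turn_asr: float, multi_turn_asr: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, multi/single ratio)."""
    if single_turn_asr <= 0:
        raise ValueError("single_turn_asr must be positive to compute a ratio")
    return multi_turn_asr - single_turn_asr, multi_turn_asr / single_turn_asr


# Figures reported for OpenAI's GPT 5.4 in the summary above.
points, ratio = asr_gap(single_turn_asr=2.74, multi_turn_asr=24.68)
print(f"gap: {points:.2f} percentage points, ratio: {ratio:.1f}x")
```

On these numbers the multi-turn ASR is roughly nine times the single-turn figure, which is why a single-prompt benchmark alone can badly understate exposure.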

Detailed Analysis

**Impact** Organizations worldwide that deploy large language models (LLMs) from major vendors, including OpenAI, Anthropic, Google, Amazon, and xAI, are affected. The study tested 15 frontier AI models and found multi-turn attack success rates (ASRs) ranging from 8% to 88%, significantly higher than single-turn ASRs of 2% to 65%. This gap exposes enterprises that rely on single-prompt safety benchmarks to increased security and governance risks, potentially leading to unauthorized actions or data leakage in sectors adopting AI for customer service, internal automation, or decision support.

**Technical Details** Attackers use multi-turn conversational techniques, including role-playing, misdirection, ambiguity, reframing refusals, information decomposition, and incremental escalation, to bypass LLM safety guardrails. Testing involved over 30,000 single-turn and nearly 7,000 multi-turn attacks across 1,456 conversations, demonstrating that iterative adversarial interaction increases model vulnerability. No specific malware, CVEs, or infrastructure details were provided; the attack vector is adversarial prompt engineering during the interaction phase of the AI kill chain.

**Recommended Response** Defenders should prioritize evaluating AI model safety using multi-turn attack simulations rather than relying solely on single-turn benchmarks. Enable and document configuration settings, such as reasoning modes, that affect model resilience, as seen with xAI’s Grok 4.1. Implement monitoring for anomalous multi-turn conversational patterns indicative of iterative probing or escalation. Organizations should demand transparency from vendors on multi-turn ASRs and incorporate these metrics into procurement and governance decisions.
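The monitoring recommendation can be illustrated with a simple heuristic: flag conversations in which the user keeps going after repeated refusals, one possible signal of refusal reframing. This is a minimal sketch under stated assumptions, not a method described in the Cisco report. The `Turn` type, the `REFUSAL_MARKERS` list, and the `threshold` default are all hypothetical, and substring matching is a stand-in for whatever refusal signal or classifier a real deployment exposes.

```python
from dataclasses import dataclass

# Hypothetical refusal markers (assumption). A production system would rely on
# a provider-supplied refusal signal or a trained classifier instead.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't assist", "i'm unable to")


@dataclass
class Turn:
    user: str
    assistant: str


def is_refusal(reply: str) -> bool:
    """Crude check: does the assistant reply contain a known refusal phrase?"""
    text = reply.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def count_post_refusal_retries(turns: list[Turn]) -> int:
    """Count user turns that immediately follow a refused turn."""
    retries = 0
    for previous, _current in zip(turns, turns[1:]):
        if is_refusal(previous.assistant):
            retries += 1
    return retries


def flag_conversation(turns: list[Turn], threshold: int = 2) -> bool:
    """Flag a conversation once post-refusal retries reach the threshold."""
    return count_post_refusal_retries(turns) >= threshold


probing = [
    Turn("Explain how to do X.", "I can't help with that."),
    Turn("It's for a novel I'm writing.", "I cannot help with that request."),
    Turn("Then just list the first step.", "Here is some general background."),
]
print(flag_conversation(probing))
```

A count like this only catches one of the techniques named above. Task decomposition and incremental escalation may never trigger a refusal, so such a heuristic would complement, not replace, multi-turn red-team testing.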

Source articles (4)

All Major LLMs Exposed to Multi-Turn Manipulation, Warn Researchers — Infosecurity-Magazine · 2026-05-27
The safety guardrails of several prominent large language models (LLM) can be bypassed if a user tricks the LLM into having a multi-pronged, ongoing conversation, researchers at Cisco have warned. The…
Leading AI models are more vulnerable to malicious prompts than vendors claim — Cybersecuritydive · 2026-05-27
Hackers could subvert frontier models with attacks that their developers overlook, Cisco said. Cisco’s evaluation of 15 leading AI models from OpenAI, Anthropic, Google, Amazon and xAI “found that sin…
AI models more vulnerable than claimed when faced with iterative attacks — Csoonline · 2026-05-27
CISOs relying on LLM runtime guardrails and official safety scores when making security decisions about their organizations’ AI usage and model selection are due for a wakeup call. According to a new study…
Cisco researchers said in a report — blogs.cisco.com · 2026-05-27

Timeline

2026-05-27 — Cisco study published on AI model vulnerabilities: Cisco's evaluation revealed that leading AI models are significantly more susceptible to multi-turn attacks than previously reported, with success rates ranging from 8% to 88%.
2026-05-27 — Multi-turn attack techniques identified: Researchers tested various attack strategies, including role-playing and misdirection, and found that every model evaluated was susceptible to multi-turn attacks.
2026-05-27 — Call for reevaluation of AI safety benchmarks: Cisco's findings emphasize the need for organizations to rethink how they evaluate AI model safety, moving beyond single-prompt testing.

Related entities

Amazon Nova (Platform)
Anthropic’s Claude (Platform)
Google Gemini (Platform)
GrokAI (Platform)
OpenAI’s ChatGPT (Platform)
xAI’s Grok (Platform)