Prompt Injection Attacks Exploit AI Vulnerabilities in CI/CD Workflows

Severity: High (Score: 67.5)

Sources: www.microsoft.com, www.paloaltonetworks.com, Cybernews

Published: 2026-06-06 · Updated: 2026-06-06

Keywords: into, anthropic, secrets, microsoft, attack, ignoring, intended

Summary

Microsoft researchers identified a vulnerability in Anthropic's Claude Code GitHub Action that could lead to the exposure of CI/CD workflow secrets through prompt injection attacks. These attacks involve embedding deceptive instructions in content processed by AI models, tricking them into ignoring their intended instructions. In one instance, malicious prompts were hidden in HTML, allowing attackers to manipulate AI behavior without direct permissions. Microsoft demonstrated that this vulnerability could be exploited to access sensitive files, including API keys. The issue was reported to Anthropic on April 29, 2026, and was mitigated on May 5, 2026, with the release of Claude Code version 2.1.128, which blocked access to sensitive files. The findings highlight significant vulnerabilities in large language models (LLMs) that could lead to unauthorized actions and data exposure. Key Points: • Microsoft discovered a vulnerability in Anthropic's Claude Code GitHub Action. • Prompt injection attacks can manipulate AI models to reveal sensitive information. • The vulnerability was mitigated by Anthropic on May 5, 2026, with a software update.

Detailed Analysis

**Impact** Organizations using AI-powered tools in CI/CD workflows, particularly those integrating Anthropic's Claude Code GitHub Action, are at risk of unauthorized disclosure of sensitive credentials such as API keys. The vulnerability affects software development and cybersecurity sectors globally, with potential exposure of confidential data stored in system files. Attackers can exploit public repositories by submitting malicious GitHub issues, enabling unauthorized access without direct repository permissions. The scale of impact includes any enterprise relying on AI-assisted automation in software deployment pipelines. **Technical Details** The attack vector involves prompt injection, where attackers embed deceptive instructions in user inputs or external data sources processed by large language models (LLMs). Techniques include direct prompt injection via user input fields and indirect injection through external content such as HTML embedded in GitHub issues. Microsoft researchers demonstrated exploitation of Anthropic’s Claude Code GitHub Action by bypassing sandbox restrictions on the Read tool to access sensitive files in the /proc/ directory. The attack chain includes initial access via crafted GitHub issues, execution of injected prompts by the AI assistant, and exfiltration of credentials. No CVEs were specified in the articles. **Recommended Response** Apply the patch released by Anthropic in Claude Code version 2.1.128, which blocks AI access to sensitive system files such as those in /proc/. Harden CI/CD workflows by restricting AI tools’ file system permissions and sandboxing all file access utilities consistently. Monitor GitHub repositories for suspicious issue submissions containing hidden or encoded commands, especially in markdown or HTML formats. Deploy detections for anomalous AI behavior in automated workflows and review logs for unauthorized file access attempts.

Source articles (3)

Anthropic AI coding assistant could be tricked into revealing secrets, Microsoft warns — Cybernews · 2026-06-06
Microsoft researchers have discovered a vulnerability in Anthropic's Claude Code GitHub Action that could expose CI/CD workflow secrets, potentially allowing attackers to steal sensitive credentials t…
into ignoring its intended instructions. — www.paloaltonetworks.com · 2026-06-06
A prompt injection attack is a GenAI security threat where an attacker deliberately crafts and inputs deceptive text into a large language model (LLM) to manipulate its outputs. This type of attack ex…
Securing Ci Cd In Agentic World Claude Code Github Action Case — www.microsoft.com · 2026-06-06

Timeline

2026-04-29 — Vulnerability reported to Anthropic: Microsoft informed Anthropic about the prompt injection vulnerability in Claude Code GitHub Action.
2026-05-05 — Vulnerability mitigated: Anthropic released Claude Code version 2.1.128, blocking access to sensitive files to prevent exploitation.
Recent — Prompt injection attacks observed: Microsoft Threat Intelligence noted prompt injection attempts in public repositories using AI-assisted workflows.

Related entities

Data Breach (Attack Type)
Prompt Injection (Attack Type)
Anthropic (Company)
Microsoft (Company)
CWE-200 - Exposure of Sensitive Information (Cwe)
CWE-94 - Code Injection (Cwe)
T1041 - Exfiltration Over C2 Channel (Mitre Attack)
T1059 - Command and Scripting Interpreter (Mitre Attack)
Claude Code (Tool)
GitHub Actions (Tool)
Bash (Tool)
Read Tool (Tool)
GitHub (Platform)