Claude Code is in your enterprise. Here's how Straiker secures it.

Please complete this form for your free AI risk assessment.

Blog

Why 94% of AI Agents Are Vulnerable to Prompt Injection — And What to Do About It

Share this on:
Written by
Sreenath Kurupati
Published on
April 22, 2026
Read time:
3 min

AI agents are autonomous systems that act. See why prompt injection is the #1 agentic threat, and what semantic detection does that patching can't.

Loading audio player...

contents

This is Part 1 of a 3-part series on semantic detection in agentic AI.

Most companies deploying AI agents are focused on what those agents can do. Few have seriously asked what can be done to them. 

That gap is where the risk comes from. 94.4% of the AI agents in a 2025 benchmark were vulnerable to being hijacked not through a software exploit or a data breach, but through the content they were asked to read. 

Download the whitepaper to see how these attacks actually play out in production, and why they can’t be patched. 

The difference between a chatbot and an agent is the difference between a bad response and a real incident. 

A chatbot that gets manipulated produces an inappropriate reply. Embarrassing. Fixable. The conversation ends. 

An agent that gets manipulated takes action. It acts on your behalf across systems, accessing data, triggering workflows, and taking actions without real-time oversight. When an agent is redirected by an attacker, the consequences aren’t a bad reply. They are data leaving your organization, messages sent in your name, systems changed without authorization. 

Three things have to be true for that attack to work: the agent needs access to sensitive data, it needs to encounter untrusted content, and it needs the ability to communicate externally. Most enterprise agent deployments hit all three conditions swiftly. 

This has already happened, to production systems, with case numbers attached

Straiker’s STAR labs team documented a zero-click AI agent hijacking attack capable of exfiltrating an entire Google Drive through a single malicious email with no user interaction and no breach of the perimeter. A command injection vulnerability in a widely-used AI tool affected over 437,000 installations, enabling attackers to execute code remotely. Invariant Labs documented WhatsApp message history being silently extracted, exposing a critical MCP security gap in how agents connect to external tools.. 

These are not proof-of-concept scenarios from a security conference. They are documented vulnerabilities in tools your teams are likely using today. 

Why your current tools don’t catch it, and why that’s not going to change 

Here is the uncomfortable reason this keeps happening: the vulnerability is not a bug. It is a structural property of how AI models work. There is no hard boundary between an instruction and a piece of data inside a language model. Everything is processed the same way. When an agent reads an email or retrieves a document that contains hidden instructions, the model has no reliable mechanism to distinguish “this is content to process” from “this is a command to follow.” 

That’s why patching doesn’t solve it. OWASP ranks this class of attack as the number one threat for LLM applications and explicitly notes that “it is unclear if there are fool-proof methods of prevention.” The exposure is not a gap in your configuration. It grows with every new capability you give your agents. 

Traditional security tools were built for a different world, one where attacks look like attacks. They scan for known patterns and flag what they recognize. What they cannot do is understand intent. And that is precisely what these attacks exploit: not a pattern, but a meaning. 

That’s where intent detection for AI agents becomes the relevant capability, and why it represents a fundamentally different approach to the problem. Part 2 covers why your current security filters almost certainly fall into this gap, and what semantic detection does differently. 

Download the whitepaper to see how these attacks actually play out in production, and why they can’t be patched. 

No items found.

This is Part 1 of a 3-part series on semantic detection in agentic AI.

Most companies deploying AI agents are focused on what those agents can do. Few have seriously asked what can be done to them. 

That gap is where the risk comes from. 94.4% of the AI agents in a 2025 benchmark were vulnerable to being hijacked not through a software exploit or a data breach, but through the content they were asked to read. 

Download the whitepaper to see how these attacks actually play out in production, and why they can’t be patched. 

The difference between a chatbot and an agent is the difference between a bad response and a real incident. 

A chatbot that gets manipulated produces an inappropriate reply. Embarrassing. Fixable. The conversation ends. 

An agent that gets manipulated takes action. It acts on your behalf across systems, accessing data, triggering workflows, and taking actions without real-time oversight. When an agent is redirected by an attacker, the consequences aren’t a bad reply. They are data leaving your organization, messages sent in your name, systems changed without authorization. 

Three things have to be true for that attack to work: the agent needs access to sensitive data, it needs to encounter untrusted content, and it needs the ability to communicate externally. Most enterprise agent deployments hit all three conditions swiftly. 

This has already happened, to production systems, with case numbers attached

Straiker’s STAR labs team documented a zero-click AI agent hijacking attack capable of exfiltrating an entire Google Drive through a single malicious email with no user interaction and no breach of the perimeter. A command injection vulnerability in a widely-used AI tool affected over 437,000 installations, enabling attackers to execute code remotely. Invariant Labs documented WhatsApp message history being silently extracted, exposing a critical MCP security gap in how agents connect to external tools.. 

These are not proof-of-concept scenarios from a security conference. They are documented vulnerabilities in tools your teams are likely using today. 

Why your current tools don’t catch it, and why that’s not going to change 

Here is the uncomfortable reason this keeps happening: the vulnerability is not a bug. It is a structural property of how AI models work. There is no hard boundary between an instruction and a piece of data inside a language model. Everything is processed the same way. When an agent reads an email or retrieves a document that contains hidden instructions, the model has no reliable mechanism to distinguish “this is content to process” from “this is a command to follow.” 

That’s why patching doesn’t solve it. OWASP ranks this class of attack as the number one threat for LLM applications and explicitly notes that “it is unclear if there are fool-proof methods of prevention.” The exposure is not a gap in your configuration. It grows with every new capability you give your agents. 

Traditional security tools were built for a different world, one where attacks look like attacks. They scan for known patterns and flag what they recognize. What they cannot do is understand intent. And that is precisely what these attacks exploit: not a pattern, but a meaning. 

That’s where intent detection for AI agents becomes the relevant capability, and why it represents a fundamentally different approach to the problem. Part 2 covers why your current security filters almost certainly fall into this gap, and what semantic detection does differently. 

Download the whitepaper to see how these attacks actually play out in production, and why they can’t be patched. 

No items found.
Share this on: