Assume Unexpected Behavior: The Five Eyes Guidance Is an Architecture Decision
Why the Five Eyes guidance on agentic AI changes security architecture: privilege, design, behavioral, structural, and accountability risks demand runtime visibility, adversarial testing, and guardrails.


On May 1, “Five Eyes” cyber agencies (CISA, NSA, ASD ACSC, the Canadian Centre for Cyber Security, NCSC-NZ, and NCSC-UK) published Careful Adoption of Agentic AI Services. It's the first time the alliance has issued coordinated guidance on a single AI attack surface. Most coverage so far has read it as a slowdown advisory: deploy incrementally, restrict scope, keep humans in the loop.
I read it differently. As someone who's been contributing to the OWASP Top 10 for Agentic Applications, which the guidance directly references, I see this document doing something more useful. It hands the industry a shared vocabulary for a class of risk most architectures weren't designed to handle. And it crystallizes a design principle worth tattooing onto every agent rollout: assume agentic AI may behave unexpectedly.
That sentence sounds like a warning. It's actually a constraint. It tells you how to build.
A design constraint, not a caveat
Distributed-systems engineers don't trust a remote service to be available. We design for partial failure, retries, idempotency, circuit breakers. We assume the network is hostile and the dependency is down. The Five Eyes guidance is asking the same discipline of agentic AI: assume the agent will be wrong, manipulated, or hijacked, and architect so that those outcomes are recoverable rather than catastrophic.
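Here is a minimal sketch of what that constraint can look like at the tool boundary. Everything in it is hypothetical (the guarded_execute wrapper, the tool names, the approve callback); the point is the shape: the agent's proposed action is treated as untrusted input, side effects are gated behind an approval seam, and failure comes back as data the planner can recover from instead of an unhandled exception.

```python
# Hypothetical sketch: treat every agent-proposed action like an unreliable
# remote call -- validate it against policy, gate side effects, and make
# failure an expected outcome rather than an exception.
from typing import Any, Callable

ALLOWED_TOOLS = {"search_tickets", "read_document", "send_email"}  # enforced outside the model
DESTRUCTIVE_TOOLS = {"send_email"}                                  # anything with side effects


def guarded_execute(
    tool_name: str,
    args: dict,
    executor: Callable[[str, dict], Any],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run an agent-proposed tool call under the 'assume it may be wrong' constraint."""
    if tool_name not in ALLOWED_TOOLS:
        return {"ok": False, "error": f"tool '{tool_name}' is not allowlisted"}
    if tool_name in DESTRUCTIVE_TOOLS and not approve(tool_name, args):
        return {"ok": False, "error": f"destructive tool '{tool_name}' was not approved"}
    try:
        return {"ok": True, "result": executor(tool_name, args)}
    except Exception as exc:  # the dependency is assumed to fail sometimes
        return {"ok": False, "error": str(exc)}


# Usage: the approval callback is where a human-in-the-loop or policy engine sits.
outcome = guarded_execute(
    "send_email",
    {"to": "vendor@example.com", "body": "PO attached"},
    executor=lambda name, a: f"{name} executed",
    approve=lambda name, a: False,  # deny by default until a human or policy says yes
)
print(outcome)  # {'ok': False, 'error': "destructive tool 'send_email' was not approved"}
```

Deny-by-default on the approval seam is the recoverable failure mode the guidance is asking for: the system keeps working, just not dangerously.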
That reframing matters because the alternative is what most enterprises are doing right now: granting agents broad credentials, wiring them into production systems, and trusting them to behave because the demo worked. A 10,000-employee company that historically ran 30 to 100 applications is now running tens of thousands of effective AI applications once you count every LLM call that generates a script or invokes a tool. Without visibility into that surface, you're shooting in the dark.
Five risks of agentic AI, and what we see in the field
The guidance breaks agentic AI risk into five categories. None of them are theoretical. Every one of them shows up in our research or in attacks we've already analyzed.
- Privilege risk. When an agent receives broad access, a single compromise spans every system the agent touches. The guidance warns specifically against the procurement-agent pattern: one identity with credentials for finance, email, and contracts. We've seen this play out at the supply-chain layer. Earlier this year, our research team documented an MCP ecosystem attack where threat actors cloned an Oura Ring MCP server and used fake GitHub accounts to push it to developers. The malicious server inherited every privilege the legitimate one would have had. The agent didn't get exploited. The tool it trusted did.
- Design and configuration risk. Poor architectural choices create gaps before the system goes live. Across the agentic applications we've tested, 75% were vulnerable to direct or indirect prompt injection. These aren't subtle findings. They're hardcoded system prompts that can be overridden, tool allow-lists that aren't enforced server-side (see the sketch after this list), and integrations that ship with debug endpoints exposed. The guidance recommends threat-modeling with OWASP and MITRE ATLAS before deployment. That recommendation exists because design-time problems become production incidents at scale.
- Behavioral risk. The agent pursues a goal in ways its designers never anticipated. The cleanest example I can point to is the Perplexity Comet incident our STAR Labs team analyzed: a zero-click Google Drive wipe triggered by a single email containing indirect prompt injection. Comet did exactly what an agent is supposed to do. It parsed the input, planned the action, and executed it. The problem was that the input came from an attacker, not the user. The behavior wasn't a bug. It was the agent being goal-driven and losing context, which is what these systems do when you don't constrain them.
- Structural risk. Interconnected agents can cascade failures that no single component would have produced. NomShub is the example I keep coming back to: a vulnerability chain in the Cursor AI code editor where a malicious repository combines indirect prompt injection, a sandbox escape through shell builtins, and Cursor's own remote-tunnel feature to give an attacker persistent shell access. No individual component is broken. The chain is the breach. Multi-agent architectures and MCP-based integrations almost guarantee more of this, because every new protocol adds a seam.
- Accountability risk. Agentic decisions are hard to inspect and the logs are hard to parse. The guidance flags this as one of the harder problems, and it is. When an incident involves an agent that reasoned across five tool calls, three model invocations, and two retrieved documents, traditional logging gives you fragments. You need the full causal chain: what the agent was told, what it decided, what it called, what came back, and why the next step happened. We showed this concretely with a GitHub MCP hijack where the only way to detect private-repo data exfiltration was reconstructing the agent's full narrative across steps. Without that, post-incident review degenerates into guesswork.
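Two of those categories, privilege and design, converge on the same architectural fix: least-privilege grants and allow-lists enforced where an attacker can't rewrite them. The sketch below is hypothetical (AgentGrant, the registry, and the tool names are illustrative, not a specific MCP implementation), but it shows where the check has to live: on the server side of the tool boundary, so an injected instruction can't talk the agent into a call its identity was never granted.

```python
# Hypothetical sketch: enforce tool scope on the server, not in the system prompt.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class AgentGrant:
    agent_id: str
    allowed_tools: frozenset  # least privilege: only the tools this agent actually needs


# The tool registry lives server-side; the model never sees or controls it.
REGISTRY: dict[str, Callable[..., Any]] = {
    "read_invoice": lambda invoice_id: {"invoice_id": invoice_id, "status": "paid"},
    "send_payment": lambda invoice_id, amount: {"invoice_id": invoice_id, "sent": amount},
}


def handle_tool_call(grant: AgentGrant, tool: str, **kwargs: Any) -> Any:
    """Server-side enforcement: the check runs where a prompt cannot rewrite it."""
    if tool not in grant.allowed_tools:
        raise PermissionError(f"agent '{grant.agent_id}' has no grant for tool '{tool}'")
    if tool not in REGISTRY:
        raise KeyError(f"unknown tool '{tool}'")
    return REGISTRY[tool](**kwargs)


# The procurement agent can read invoices but cannot move money, even if an
# injected instruction tells it to call send_payment.
procurement = AgentGrant("procurement-agent", frozenset({"read_invoice"}))
print(handle_tool_call(procurement, "read_invoice", invoice_id="INV-42"))
# handle_tool_call(procurement, "send_payment", invoice_id="INV-42", amount=10_000)  # -> PermissionError
```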

What “careful” looks like when you can’t pause
The honest tension with this guidance is that "deploy incrementally" collides with reality. 85% of developers are using AI coding tools. Claude Code is committing to repositories. Microsoft 365 Copilot is updating CRM records. The agents are inside. You can't unship them.
What you can do is operationalize the guidance against the surface you actually have:
- Get visibility first. You can't secure what you can't see. Inventory the agents, including the ones SaaS vendors quietly added to products you already own, and map their scopes, tools, and data access. This is the visibility problem our discovery and posture management capability was built to solve.
- Test adversarially before production. Red-team every agent against the OWASP Top 10 for Agentic Applications. Findings at design time are far cheaper to fix than findings at runtime. This is what we built Ascend AI to do continuously.
- Run guardrails at runtime. The guidance is right that prompt injection and tool misuse can't be fully prevented at the model layer. You need behavioral detection that watches the agent's actions in addition to prompts. That's the gap Defend AI fills.
- Capture the full causal chain. Reconstructing what the agent did, why, and through which tool is the accountability requirement the guidance describes. Build it in before you need it for an incident review (a minimal sketch of such a record follows this list). Straiker addresses this across the platform, so visibility, adversarial testing, and runtime enforcement all contribute to the same forensic record.
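As a concrete illustration, here is a minimal, hypothetical sketch of the record each agent step would need to emit for that chain to be reconstructable: what the agent was told, what it decided, what it called, what came back, and a pointer to the step that caused it. The schema and field names are assumptions, not any particular product's format.

```python
# Hypothetical schema: one structured record per agent step, linked by step and
# parent ids so the full causal chain can be replayed after an incident.
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AgentStep:
    run_id: str
    step_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    parent_step_id: Optional[str] = None
    ts: float = field(default_factory=time.time)
    instruction: str = ""             # what the agent was told (prompt or retrieved content)
    decision: str = ""                # what it decided (plan / model output)
    tool_call: Optional[dict] = None  # what it called, with arguments
    tool_result: str = ""             # what came back


def emit(step: AgentStep) -> None:
    # Append-only and outside the agent's own reach, so the record survives compromise.
    print(json.dumps(asdict(step)))


run = uuid.uuid4().hex
root = AgentStep(run_id=run,
                 instruction="Summarize open procurement tickets",
                 decision="Call search_tickets with status=open")
emit(root)
emit(AgentStep(run_id=run, parent_step_id=root.step_id,
               instruction="(tool output) 3 open tickets returned",
               decision="Draft the summary email",
               tool_call={"name": "search_tickets", "args": {"status": "open"}},
               tool_result="3 open tickets"))
```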
Get ready to build for the assumption
The Five Eyes guidance isn't telling the industry to slow down. It's telling the industry to grow up. For the first time, five governments have aligned on how to describe the problem: privilege, design, behavior, structure, and accountability. More importantly, they’ve tied those concepts to the frameworks organizations will actually need to operationalize them.
Agentic AI is going to keep shipping, with or without security teams ready for it. The architectures that survive will be the ones built on a single assumption: the agent may behave unexpectedly. Design accordingly.
If you're shipping agents into production, we should talk. Straiker helps security and engineering teams discover every agent in their environment, red-team them against OWASP and ATLAS before release, and run behavioral guardrails at runtime. Get in touch to see it in action.
Sai is Chief Architect at Straiker, where he focuses on the security architecture of AI agents and autonomous systems. He is also an active contributor to the OWASP GenAI Project, helping shape emerging guidance around secure AI system design, runtime behavior, and agentic risk. His work is grounded in the practical realities of how AI systems are built, deployed, and operated at scale.