1. The Response Taking Shape
In March 2026, an autonomous agent breached McKinsey’s internal AI platform in two hours. No credentials, no insider knowledge, access to millions of internal messages.[15] The governance gap that made this possible is structural. AI agents operate across company systems with broad permissions and no runtime oversight. In The New Frontier in Identity Security: AI Agent Access, we mapped why this is inevitable. Agents inherit human credentials, receive standing permissions at setup and accumulate tool connections over time. The governance gap widens with every tool an agent can reach. The responses are here. The layer that governs them is not.
Five Responses, One Missing Layer
Model Context Protocol (MCP) gives agents a universal way to reach tools, but the protocol has no opinion on who may use them or under what conditions. Five responses are now trying to answer that question, spanning standards bodies, protocol authors and identity vendors.
A2A, the Agent-to-Agent protocol (Google → Linux Foundation). Google describes A2A as “designed to support enterprise-grade authentication and authorization, with parity to OpenAPI’s authentication schemes.”[2] In practice, A2A lets one agent delegate a task to another. The requesting agent reads the target’s Agent Card, a JSON file listing what the target can do. It then sends the work as a structured task object. The target processes the task and updates its state as it goes: submitted on arrival, working during processing, auth-required when an action needs permission. At this point execution pauses. Because A2A specifies no authorizer, the protocol marks the pause without defining who should answer it.
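The pause can be sketched as a tiny state machine. The state names come from the A2A lifecycle described above; the class shape and field names are illustrative, not the protocol's wire format.

```python
from enum import Enum

class TaskState(Enum):
    SUBMITTED = "submitted"          # task arrived at the target agent
    WORKING = "working"              # target is processing
    AUTH_REQUIRED = "auth-required"  # an action needs permission; execution pauses
    COMPLETED = "completed"

class Task:
    """Minimal A2A-style task object (shape is illustrative)."""
    def __init__(self, task_id, payload):
        self.id = task_id
        self.payload = payload
        self.state = TaskState.SUBMITTED

    def start(self):
        self.state = TaskState.WORKING

    def require_auth(self):
        # A2A marks the pause; it does not say who resolves it.
        self.state = TaskState.AUTH_REQUIRED

task = Task("t-1", {"skill": "summarize"})
task.start()
task.require_auth()  # the protocol stops here, with no designated authorizer
```

Note that nothing in the model names an authorizer: `auth-required` is a signal, and every system that consumes it must invent its own answer.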
IETF Agent Auth (AWS, Zscaler, Ping Identity). A resource server is the app or service an agent is trying to reach. This draft defines how an agent identifies itself to that server when acting on a user’s behalf. Standard OAuth issues a token that names only the user. This draft binds two identities into the same token: the agent goes into the client_id claim and the delegating user into the sub claim. A resource server reads both and authorizes the request against the pair, not the user alone. To do this, the draft composes existing standards rather than inventing new ones. Its authors acknowledge the limit: “additional specification or design work may be needed to define how out-of-band interactions with the User occur at different stages of execution.”[3] The framework binds the agent’s identity at token issuance but defers the runtime authorization question to future specifications.
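In decoded form, the dual-identity token the draft describes might look like the claims below: the agent in `client_id`, the delegating user in `sub`, and a resource server that authorizes the pair rather than the user alone. Claim values and the check itself are invented for illustration.

```python
# Decoded access-token claims in the dual-identity shape (values are invented).
claims = {
    "iss": "https://auth.example.com",
    "sub": "user:alice",              # the delegating user
    "client_id": "agent:report-bot",  # the agent acting on her behalf
    "scope": "crm.read",
}

def authorize(token_claims, required_scope, allowed_pairs):
    """Authorize against the (user, agent) pair, not the user alone."""
    pair = (token_claims["sub"], token_claims["client_id"])
    return pair in allowed_pairs and required_scope in token_claims["scope"].split()

allowed = {("user:alice", "agent:report-bot")}
authorize(claims, "crm.read", allowed)  # permitted for this (user, agent) pair
```

A different agent presenting a token for the same user would be refused, which is exactly the granularity plain OAuth lacks. The runtime question — what the agent does after the token is issued — stays out of scope, as the draft's authors note.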
AAuth (IETF Draft). AAuth extends OAuth 2.1 with a reason parameter. The specification defines it as a “concise, human-readable explanation provided by the agent.”[4] The authorization server must echo this reason verbatim and display it to the user. It does not verify that the reason is accurate, proportionate or consistent with policy. Any claim passes. Once the token is issued, AAuth has no further role. It grants access once, at the start of a session. What the agent does with that access is not governed by the framework. AAuth addresses transparency at first access. It does not address control during execution.
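A sketch of the mechanism, under stated assumptions: the request parameters follow standard OAuth plus the draft's `reason`, and the consent prompt is a hypothetical rendering. The point the code makes is that the reason is echoed verbatim, with no verification step anywhere.

```python
# Hypothetical AAuth-style authorization request: ordinary OAuth parameters
# plus the draft's `reason`. The authorization server never checks it.
request = {
    "response_type": "code",
    "client_id": "agent:report-bot",
    "scope": "mail.send",
    "reason": "Sending your weekly summary",  # any claim passes
}

def consent_prompt(req):
    # The server must display the reason exactly as the agent supplied it.
    return f"{req['client_id']} requests {req['scope']}: \"{req['reason']}\""

consent_prompt(request)  # the agent's own words, unverified, shown to the user
```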
SCIM Agent Extension (Okta). SCIM is an existing standard for provisioning and managing user accounts across enterprise systems. Okta’s IETF draft extends it with two new resource types, Agent and AgenticApplication, that register AI agents as directory objects alongside human users. An agent with a SCIM entry has an official identity in the system: a name, an ID, a lifecycle that can be managed and revoked. The extension is designed as complementary infrastructure, “intended to provide greater interoperability… while reducing the responsibilities assumed by… new protocols for agents.”[5] It manages whether an agent exists. The framework does not address what the agent does at runtime.
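What a directory entry under the new `Agent` resource type could look like, sketched as a plain dict. The schema URN and attribute names beyond SCIM core are assumptions, not the draft's actual schema; the point is the lifecycle, not the fields.

```python
# Illustrative SCIM entry for an agent. The URN and extra attributes are
# assumptions; only the lifecycle idea (provision, manage, revoke) is the point.
agent_entry = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Agent"],  # URN assumed
    "id": "agent-7",
    "displayName": "report-bot",
    "active": True,           # lifecycle flag: exists and is manageable
    "owner": "user:alice",    # hypothetical attribute linking agent to a human
}

def deprovision(entry):
    """SCIM governs whether the agent exists; nothing here governs runtime."""
    entry["active"] = False
    return entry
```

Deactivating the entry removes the agent from the directory's point of view, but as the text notes, it says nothing about what the agent was doing while active.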
OWASP ANS. OWASP describes the Agent Name Service as “a framework for secure discovery and registration of AI agents in multi-agent systems.”[6] ANS functions like DNS for agents: an agent registers by publishing an Agent Card that contains its name, capabilities and endpoint. Other systems query the registry to confirm the agent exists and is who it claims to be. The framework answers the question of identity. It does not answer the question of authorization. A verified agent is not a permitted agent.
All five efforts solve questions that come before an agent acts: is it real, can it prove its identity, is it registered. These are prerequisites for execution, not constraints on it. Once resolved, the agent holds a valid identity, proven credentials and a place in the directory. Nothing in that stack evaluates what happens next.
The missing question is runtime governance: should this agent perform this specific action right now, given the data it has already accessed and the tools it has already called? “Who” and “how” have answers. “Should” does not.
2. Four Architectures for Enforcement
The agent is about to act. It has a name, a token and a place in the registry. Someone needs to decide whether this action should go through. That decision has to happen somewhere between the agent and the tools it connects to. Where depends on what your organization controls. You might own the infrastructure the agent runs on, which gives you access to the code and the network. A cloud-hosted agent leaves you with neither. The downstream app might let you enforce rules inside it, or it might not.
These constraints produce four enforcement architectures. Each places a checkpoint at a different layer: in the network, in the agent’s code, at the operating system or inside the vendor’s app. Herman Errico’s Autonomous Action Runtime Management (AARM) specification (February 2026) maps these four approaches.[7] Here is what each one does, where it sits and what it trades away:
1. Protocol Gateway
A protocol gateway places the enforcement boundary in the network, between the agent and the tools it connects to. When an agent is about to send an email, query a database or delete a file, the request passes through the gateway first. The gateway intercepts every tool call, checks it against policy and decides: allow, deny, escalate to a human or wait for more context. Because the gateway speaks the same protocol the agent already uses (MCP, A2A or any standard protocol), nothing changes on either side. The agent does not know the gateway is there. The downstream app does not know either.
The gateway remembers everything the agent has done in a session: which tools it called, what data it accessed, what parameters it used and what the tools returned. Each new action is evaluated not in isolation but against the entire chain. An agent with permission to query a customer database and permission to send emails can do both individually. But if it reads customer records and then sends an email to an external address, a protocol gateway recognizes the composition attack: data exfiltration. The composition violates policy in a way that neither action reveals on its own.
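The composition check can be shown in miniature. This toy gateway (all names and the single rule are invented) judges each call against the session chain, so two individually permitted actions combine into a denial.

```python
class Gateway:
    """Toy protocol gateway: each call is judged against the whole session chain."""
    def __init__(self):
        self.chain = []  # (tool, params) history for this session

    def evaluate(self, tool, params):
        # Individually allowed actions can compose into exfiltration:
        # a customer-data read earlier plus an external send now.
        read_customers = any(t == "crm.query" for t, _ in self.chain)
        external_send = (tool == "email.send"
                         and not params["to"].endswith("@corp.example"))
        decision = "deny" if (read_customers and external_send) else "allow"
        self.chain.append((tool, params))
        return decision

gw = Gateway()
gw.evaluate("crm.query", {"table": "customers"})           # allow: a plain read
gw.evaluate("email.send", {"to": "out@attacker.example"})  # deny: composition
```

Run either action alone and it passes; run them in sequence and the second is refused. That chain-aware judgment is what per-call evaluation without history cannot give you.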
The tradeoff is visibility at the boundary. The gateway sees every action that passes through the protocol. It cannot see the agent’s internal reasoning, its memory or actions that bypass the protocol (local file reads, shell commands, in-memory operations). It controls the border between the agent and its tools, not what happens inside the agent.
2. SDK (Software Development Kit) Instrumentation
SDK Instrumentation places governance checkpoints inside the agent’s own runtime. Before the agent reads a customer record, a hook (an extension point the platform exposes) fires inside the runtime. The SDK checks it against policy and records what happened. Because it operates inside the agent, it has the deepest visibility of any architecture. It sees the original user request, the reasoning chain, prior actions in the session and what data the agent has already accessed.
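A hook-based check might look like the sketch below; the hook name, policy function and context fields are all assumptions. Because the check runs inside the agent, it sees session context no gateway can: what data the agent has already touched.

```python
# Sketch of SDK instrumentation: a before-tool-call hook fired inside the
# agent runtime. Hook and field names are illustrative, not any real SDK's API.
def policy_check(tool, args, context):
    # The hook sees full in-process context: prior actions, data touched.
    if context["data_accessed"] and tool == "email.send":
        return "escalate"
    return "allow"

class InstrumentedAgent:
    def __init__(self):
        self.context = {"data_accessed": False, "actions": []}

    def call_tool(self, tool, args):
        decision = policy_check(tool, args, self.context)  # the hook fires here
        self.context["actions"].append((tool, decision))
        if decision != "allow":
            return decision
        if tool.endswith(".read"):
            self.context["data_accessed"] = True
        return "allow"
```

The bypass risk is visible in the structure itself: nothing forces the agent through `call_tool`. Compromised code can invoke a tool directly and the hook never fires.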
The limitation: every agent framework needs its own integration. There is no universal SDK. If you run five different agent frameworks, you need five separate integrations. And because the SDK runs inside the same program as the agent, it can be bypassed. A compromised agent can call tools directly without passing through the hooks. A protocol gateway, by contrast, runs outside the agent as a separate service, and the agent has no path to its tools except through it. The SDK is only as trustworthy as the agent it governs.
3. Kernel-Level Monitoring
The third architecture operates below the application, at the operating system itself. Every action an agent takes on a machine, whether it opens a network connection, writes a file or starts a process, passes through the kernel. Monitoring at this level means intercepting those system calls before they execute. The kernel can allow the call, block it or log it for review. Tools like Falco and Sysdig already do this for general security. The same approach applies to agent governance.
The tradeoff is visibility without context. The kernel knows the agent opened a network connection. It does not know the agent was trying to delete a production database, or whether the actor who triggered it had permission to do so. Kernel-level monitoring catches forbidden actions like connections to known malicious endpoints, but it cannot replace a governance layer that understands what the agent is doing and why.[7]
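The "visibility without context" limit fits in a few lines. The sketch below stands in for a kernel-level rule (endpoints and the rule are invented): it can match syscall metadata, and nothing more.

```python
# Miniature of a kernel-level rule: it matches syscall metadata, never intent.
BLOCKLIST = {"203.0.113.9"}  # hypothetical known-bad endpoint

def on_connect(pid, dst_ip):
    # The kernel knows process `pid` opened a connection to `dst_ip`.
    # It does not know the agent was trying to delete a production database,
    # or whether the actor who triggered it was allowed to ask for it.
    return "block" if dst_ip in BLOCKLIST else "allow"

on_connect(101, "203.0.113.9")   # blocked: endpoint is on the list
on_connect(102, "198.51.100.7")  # allowed, whatever the agent meant by it
```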
4. Vendor App Integration
The fourth approach pushes enforcement into the downstream apps themselves. Each tool vendor (e.g., GitHub, Slack, Salesforce) implements its own governance hooks. Before the agent reaches any app, the identity provider (IdP) authenticates it and issues a scoped token. What happens after that depends entirely on what the tool vendor chose to enforce. Its appeal is independence from the agent side of the chain. The approach works whether the agent runs on a cloud platform you do not control (such as ChatGPT, Claude.ai, Microsoft Copilot), inside a self-hosted runtime or anywhere in between. It does not require the agent platform to expose any instrumentation hooks. The cost is that enforcement only exists where vendors have built it.
Once the agent holds that token, the IdP has no visibility into what happens next. Okta and CyberArk are building runtime controls for this layer, but both remain limited: Okta’s Agent Relay is in early access, not general availability,[9] and CyberArk’s AI Agent Gateway works only within its own identity stack.[10] Both depend on every downstream tool vendor cooperating. Getting hundreds of apps to implement governance hooks takes years, not months. And even when apps cooperate, neither approach gives you per-action visibility across tools. No tool-vendor approach today offers per-action governance that spans multiple applications.
3. Cakewalk’s Path from Gateway to Governance
Each architecture makes a tradeoff. Cakewalk chose the protocol gateway. Every tool call routes through it, whether the agent runs self-hosted or on a cloud platform. Governance must operate at machine speed, or it becomes the bottleneck that defeats the purpose of agent delegation. The gateway evaluates each action against policy deterministically, using rule matching rather than inference, so every decision is auditable and reproducible before the action executes.
But a gateway alone is not governance. Four things separate Cakewalk from every other approach:
- The gateway knows who your user is and what they are allowed to delegate.
- Your agent’s context grows dynamically as the task unfolds rather than being fixed at setup.
- Every tool call is evaluated against policy in real time.
- Every decision is captured in a structured audit trail.
User Context in the Gateway
Other gateways see tool calls. They evaluate the action but not who the agent is acting for. Cakewalk’s gateway evaluates three inputs on every call: the action (Read, Write, Destructive or External), the user behind the agent (department, seniority and role, pulled automatically from HR systems) and the target app (risk level, data classification and category).
That context changes the outcome. An agent reading internal documentation for an engineering lead is routine. The same action for an external contractor requires escalation. Identical tool call, different governance decision, because the user context is different. This extends beyond your own org. When a partner’s contractual restrictions apply to your data, the gateway needs to know who is asking, not just what is being asked.
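The three-input evaluation can be sketched as a single function. The rule, field names and user records below are illustrative, not Cakewalk's actual policy model; the point is that an identical call diverges on user context alone.

```python
# Sketch of the three-input evaluation: action, user, target app.
# The rule and all field names are illustrative.
def evaluate(action, user, app):
    if action == "Destructive":
        return "deny"
    if action == "External":
        return "escalate"
    if user["employment"] == "contractor" and app["classification"] == "internal":
        return "escalate"  # routine for an employee, escalated for a contractor
    return "approve"

lead = {"role": "engineering_lead", "employment": "employee"}
contractor = {"role": "engineer", "employment": "contractor"}
docs = {"classification": "internal", "risk": "low"}

evaluate("Read", lead, docs)        # approve
evaluate("Read", contractor, docs)  # escalate: same call, different user
```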
This is not a feature added on top of the gateway. It requires an identity governance platform underneath: users, roles, departments, app risk classifications, permission models, approval workflows and audit infrastructure. Cakewalk already operates that platform. A gateway without an identity platform underneath would need to build it.
Dynamic Agent Context
Most gateways give an agent a fixed set of tools at setup time. The agent’s knowledge boundary is decided before the task begins and does not move. This is static agent context. It is the default model in MCP gateways, agent SDKs and vendor app integrations. If the task needs a tool the agent was not configured for, the task either fails or completes with degraded output.
Cakewalk inverts this. Your agent starts each task with no context at all. As the task progresses, every approved access request expands the agent’s working context by one tool. Cakewalk calls this Dynamic Agent Context.
Two boundaries hold the model together. The outer boundary is the total information surface of your company: every app, every dataset, every tool your org runs. It is the ceiling for what any agent could possibly reach. The inner boundary is your agent’s Dynamic Agent Context: what it currently knows about and can act on. It starts at zero on every task, grows one tool at a time and collapses back to zero when the task ends. The outer boundary does not move.
Each expansion follows a different path depending on how far the tool is from the agent’s current reach. A tool the agent already holds credentials for requires only a policy check. If the user can access a tool but the agent cannot yet, the user authenticates. A tool the user does not have access to triggers a full request through the company’s approval chain. On approval, Cakewalk’s provisioning agent handles the rest: account created, permissions assigned, agent connected. Just-in-time provisioning at runtime, not a ticket in an IT queue.
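The three paths reduce to a dispatch on how far the tool is from the agent's current reach. Names below are illustrative placeholders, not real integrations.

```python
# The three expansion paths, sketched as a dispatch. Tool names are placeholders.
def expansion_path(tool, agent_creds, user_access):
    if tool in agent_creds:
        return "policy-check"        # agent already holds credentials
    if tool in user_access:
        return "user-authenticates"  # user has it, the agent does not yet
    return "approval-chain"          # neither does: full JIT provisioning

agent_creds = {"notion"}
user_access = {"notion", "hubspot"}

expansion_path("notion", agent_creds, user_access)      # "policy-check"
expansion_path("hubspot", agent_creds, user_access)     # "user-authenticates"
expansion_path("salesforce", agent_creds, user_access)  # "approval-chain"
```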
Inside Every Tool Call
What decides each expansion? A product manager asks their agent to analyze customer churn. The first time the agent reaches for a CRM tool, the call passes through the gateway. The gateway evaluates the call against the user, the action and the target app. The engine returns one of three outcomes. Escalate if the data is classified as sensitive. Deny if it violates policy. Approve if the user has CRM access and the action is a read.
Escalations trigger Suspend-and-Resume. The gateway holds the connection while a different human reviews the request: a manager, a security admin, an app owner. Not the user who initiated the task. To the agent, this looks like a slow tool call. Denials return a structured response so the agent can adapt or surface the reason to the user. Every decision is captured in the Decision Trace.
If approved, the gateway reads the credential from your vault and injects it into the outbound call. The agent does not see real credentials, only a temporary reference that expires with the task. The agent’s Dynamic Agent Context has grown by one tool. When the task ends, the inner boundary collapses to zero, while the outer boundary stays unchanged.
Decision Traces, Not Log Files
Every tool call produces a Decision Trace: a structured, immutable record of who delegated, which policy fired, what inputs matched, who approved, what executed and when access ended. One call, one trace.
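A trace of that shape might be sketched as an immutable record; the field names mirror the list above but are assumptions about the actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # immutable: one tool call, one trace, never edited
class DecisionTrace:
    delegating_user: str   # who delegated
    policy_id: str         # which policy fired
    matched_inputs: tuple  # what inputs matched
    approver: str          # who (or what) approved
    executed_action: str   # what executed
    access_ended_at: str   # when access ended

trace = DecisionTrace(
    delegating_user="user:alice",
    policy_id="pol-crm-read",
    matched_inputs=("Read", "engineering_lead", "crm"),
    approver="policy:auto",
    executed_action="crm.query",
    access_ended_at="2026-03-01T12:00:05Z",
)
# "Who approved this access?" is a field lookup, not a grep across log files.
```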
A marketing manager asks their agent to pull campaign performance from HubSpot. The gateway intercepts the call, evaluates it against the manager’s role and HubSpot’s risk classification, then approves. The gateway injects the credential from the vault and the agent pulls the data. The trace lands before the call leaves.
The point is queryability. Ask “who approved this agent’s access to customer data?” and the answer is a name, a policy and a timestamp. Ask “who owns this agent?” and the answer traces to the delegating user, their department and their current employment status. Log files would require parsing free text across systems. Because the trace is tied to identity, lifecycle changes propagate automatically. When an employee in your org is offboarded, their agents lose access the moment they do. Role changes adjust agent access without manual intervention.
Trust Through Architecture
Every tool call, every credential exchange and every policy decision passes through the gateway. A component with that much control over your security posture needs to earn trust through architecture, not promises.
The gateway is stateless. It reads credentials from your vault at the moment of each tool call and discards them when the call completes. Agents never see real tokens. Nothing is cached, nothing is stored and nothing persists beyond the action.
Governance and Autonomy
Our previous article, The New Frontier in Identity Security: AI Agent Access, laid out the bind every security team faces. Block agents outright or allow them without controls. Companies are not stuck because they distrust agents. They are stuck because no governance layer lets them deploy agents at full speed. The first option destroys the productivity gain agents promise. The second opens risk that compounds with every tool they reach.
The third path exists. Policy-driven governance that lets agents operate autonomously with real-time evaluation at every action. Low-risk actions execute at machine speed. Sensitive actions escalate to the right human. External actions wait for the human in the loop. Destructive actions are denied outright. Access is provisioned when needed and revoked when the task completes. Every decision traces back to a human.
Governance and autonomy. Not governance or autonomy.