AI Agents Have Too Much Access: Why Static Permissions Are a Security Risk [Webinar Summary]
Your CRM. Your ERP. Your codebase. AI agents are already operating inside them, executing real actions, not just generating text suggestions. And the security model most organizations are using to manage that? It was designed for humans.
That mismatch is the core problem Johannes Keienburg, CEO and Founder of Cakewalk, sat down to explore with Herman Errico, Senior Product Manager at Vanta, in a recent practical session on AI agent authorization.
Herman recently published a 24-page specification called Autonomous Action Runtime Management, known as AARM. In the weeks since it went public, over 600 people reached out directly, 12 startups listed themselves on the project site, and six CISOs jumped in to give feedback. The paper clearly touched a nerve.
Below is a full breakdown of what they covered: why existing IAM frameworks fall short for agents, what AARM actually proposes, and what security teams should be watching for in the months ahead.
Key Takeaways from the Session
- Traditional IAM was designed for deterministic, accountable human users. Agents are neither deterministic nor accountable.
- The three properties that make agents uniquely risky are: irreversibility of actions, execution speed that outruns human review, and privilege amplification combined with LLM non-determinism.
- LLM guardrails do not solve this problem because they are probabilistic and can always be tricked.
- AARM proposes treating tool execution as the primary security boundary and evaluating actions against both policy and accumulated session context before they run.
- Context referral, escalating only genuinely ambiguous cases to humans, is the mechanism that avoids authorization fatigue while preserving oversight.
- The market needs a new standard. A fragmented vendor landscape without a shared specification will produce 60-tool security stacks for AI agents, the same outcome the industry has already experienced with every previous threat category.
- The near-term risk is large-scale breaches driven by over-permissioned agents operating at volume. Regulatory fines and insurance premium increases are likely to follow.
Why AI Agent Access Is a Different Security Problem
The comparison to human IAM is easy to make on the surface. Agents need credentials. Agents need scopes. Agents perform actions. So you model them like a user, give them an API key, and set some RBAC rules. Done.
Except it is not done, because agents have three properties that make this approach fundamentally inadequate.
1. Irreversibility
Agents are no longer just generating text. They are calling tools through MCP servers, making API calls, writing to databases, sending emails, and processing payments. When a human clicks the wrong button, there is usually a confirmation dialog, a paper trail, or at least another human who notices. When an agent does it at 3am, the action is often complete before anyone knows it happened.
Herman gave a sharp example: imagine telling an agent to automate your life and handing it a credit card. The agent does not pause. It does not second-guess. It spends.
2. Speed
Agents operate at machine speed. Traditional policy-based tools work by generating an alert, routing it to a SOC analyst, and letting a human review it. That process is measured in minutes or hours. Agents execute in milliseconds. The threat has already occurred before the first tier-one analyst opens the ticket.
As Herman put it: imagine millions or even billions of agent actions per day. Some organizations are already deploying stacks of Mac Minis running open-source agent frameworks, automating everything they can. The volume alone breaks the human-review model.
3. Privilege Amplification and Non-Determinism
This is where it gets particularly uncomfortable for security teams. Today, when you assign a static credential to an agent, it inherits everything attached to that role. If the role is admin, the agent is trusted as admin, no questions asked.
But LLMs are non-deterministic. They can experience what Herman calls intent drift: the agent starts a task, reinterprets it partway through, and ends up somewhere the original instruction never intended. There is no existing tool that reliably catches this class of failure.
"LLMs might be confused. Intent drift is when I start with a particular task, interpret it differently, and end up with a different output. There is no tool today that can manage that pattern." - Herman Errico, Vanta
Johannes summarized the core issue: agents get long-lived credentials with broad permissions, act in milliseconds, have no liability or accountability, and in most deployments leave no accessible audit trail on the client side.
Why LLM Guardrails Are Not a Solution
One of the key moments in the session came when Johannes pushed back on the AARM concept with what he framed as a deliberately provocative question: if the problem is that LLMs are non-deterministic and their guardrails are probabilistic, how does adding another LLM-powered layer solve anything?
Anyone who has spent time trying to enforce behavioral constraints on an LLM knows that they break. Tell a model it cannot provide harmful instructions, and someone will rephrase the request as a 19th-century historical novel. The guardrail is a probabilistic filter on the model itself, not on what the model actually does.
Herman's answer:
An AARM system does not have to be LLM-only. Pattern recognition and rule-based logic can handle a large share of decisions without involving a language model at all. LLMs come in where intent interpretation is genuinely needed, not as the default.
"In the arms race to build the most efficient model, almost no one is thinking about how to make AI agent adoption actually safe at scale." - Herman Errico, Vanta
What AARM Actually Proposes: The Four Building Blocks
AARM stands for Autonomous Action Runtime Management. It is a cross-industry specification, not a product, that defines how a runtime security layer should evaluate AI agent actions before they are executed. The spec treats tool execution, not the LLM itself, as the primary security boundary.
The architecture has four components:
Intercept
Before any agent action executes, the AARM layer intercepts it. The spec describes several ways to implement this: as a protocol gateway, as an SDK, at the kernel level, or through a direct vendor integration. The specific implementation matters less than the principle: no action should execute without passing through the authorization layer.
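To make the principle concrete, here is a minimal sketch of interception in Python. Everything in it is hypothetical: the AARM spec does not prescribe function names or an implementation, only that no action executes without passing through the authorization layer. `authorize`, `intercepted`, and `ToolCallDenied` are invented for illustration, and the policy shown is a deliberately crude placeholder.

```python
# Illustrative interception layer: every tool call is forced through an
# authorize() check before it runs. All names here are hypothetical,
# not part of the AARM specification.
from functools import wraps

class ToolCallDenied(Exception):
    """Raised when the authorization layer blocks an action."""

def authorize(tool_name, args):
    # Placeholder policy: deny payment processing outright.
    return tool_name != "process_payment"

def intercepted(tool_name):
    """Decorator that routes a tool through the authorization layer."""
    def wrap(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            if not authorize(tool_name, kwargs):
                raise ToolCallDenied(tool_name)
            return fn(*args, **kwargs)
        return inner
    return wrap

@intercepted("send_email")
def send_email(to, body):
    return f"sent to {to}"

@intercepted("process_payment")
def process_payment(amount):
    return f"paid {amount}"
```

In a real deployment the same choke point would more likely sit in a protocol gateway or MCP proxy than in a decorator, but the invariant is identical: the tool is unreachable except through the check.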
Accumulate Context
This is where AARM diverges from traditional policy-based tools. A pure policy engine evaluates an action in isolation. Is this action permitted or not? That binary approach generates two failure modes: false positives that block legitimate work, and false negatives that allow harmful actions that technically match a permitted rule.
AARM requires accumulating the agent's chain of thought and session context alongside the action. The question shifts from "is this action permitted?" to "does this action make sense given what this agent is supposed to be doing right now?"
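A rough sketch of what that accumulation might look like, assuming a per-session record of the original task plus every action taken so far (the `SessionContext` class and its fields are invented for illustration, not defined by the spec):

```python
# Hypothetical session-context accumulator: each action is recorded
# alongside the task the agent was given, so a later check can ask
# "does this fit?" rather than only "is this allowed?".
from dataclasses import dataclass, field

@dataclass
class SessionContext:
    task: str                      # what the agent was asked to do
    actions: list = field(default_factory=list)

    def record(self, tool, args):
        self.actions.append({"tool": tool, "args": args})

    def summary(self):
        return {"task": self.task, "action_count": len(self.actions)}

ctx = SessionContext(task="summarize Q3 sales report")
ctx.record("read_file", {"path": "q3_sales.csv"})
ctx.record("read_file", {"path": "q2_sales.csv"})
```

The point of keeping the task and the action history together is that a downstream evaluator can compare the two, which a stateless policy engine cannot do.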
Evaluate Against Policy
With context in hand, the system evaluates the action against both static policy and a dynamic policy that reads intent. The authorization outcomes are tiered:
- Explicitly denied: the action is blocked outright.
- Context approved: the action is permitted based on accumulated session context.
- Context denied: the context makes the action suspicious, so it is blocked.
- Context referral: the system cannot determine a clear answer, so it escalates to a human.
That last category is Herman's favorite, and it is easy to see why. Context referral is how you avoid authorization fatigue without abandoning human oversight entirely. Instead of asking humans to approve every action, you only surface the genuinely ambiguous ones.
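The four-way outcome can be sketched as a single evaluation function. The rules below are invented placeholders (a deny list, a naive string-match "intent" check); the spec leaves the actual policy engine open, so treat this only as a shape for the tiered result:

```python
# Hypothetical tiered evaluation: static policy first, then a crude
# context check, with escalation as the fallback. Rule logic is
# illustrative only.
from enum import Enum

class Outcome(Enum):
    EXPLICITLY_DENIED = "explicitly_denied"
    CONTEXT_APPROVED = "context_approved"
    CONTEXT_DENIED = "context_denied"
    CONTEXT_REFERRAL = "context_referral"

DENY_LIST = {"delete_database"}

def evaluate(action, task):
    # Static policy: some actions are never allowed, regardless of context.
    if action["tool"] in DENY_LIST:
        return Outcome.EXPLICITLY_DENIED
    # Naive "intent" check: does the target relate to the stated task?
    target = action.get("target", "")
    if target and target in task:
        return Outcome.CONTEXT_APPROVED
    # Known-sensitive tool outside the task's scope: block.
    if action["tool"] == "send_payment":
        return Outcome.CONTEXT_DENIED
    # Otherwise genuinely ambiguous: escalate to a human.
    return Outcome.CONTEXT_REFERRAL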
Log with Tamper-Resistant Records
Every action, decision, and escalation gets logged in a tamper-resistant format. This is the audit trail that current agent deployments mostly lack. Without it, forensic investigation after an incident is guesswork.
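One common way to get tamper resistance, shown here as an assumption rather than anything the spec mandates, is a hash chain: each log entry embeds the hash of the previous one, so rewriting any record invalidates everything after it.

```python
# Illustrative hash-chained audit log. Altering any past entry breaks
# verification of the chain. Format is invented, not the spec's.
import hashlib
import json

class AuditLog:
    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64  # genesis value

    def append(self, record):
        payload = json.dumps({"record": record, "prev": self._prev_hash},
                             sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append({"record": record, "prev": self._prev_hash,
                             "hash": digest})
        self._prev_hash = digest

    def verify(self):
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps({"record": e["record"], "prev": prev},
                                 sort_keys=True)
            if e["prev"] != prev or \
               hashlib.sha256(payload.encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"action": "read_file", "decision": "context_approved"})
log.append({"action": "send_email", "decision": "context_referral"})
```

A production system would anchor the chain externally (append-only storage, a signed timestamp) so the verifier itself cannot be rewritten along with the log.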
The Authorization Fatigue Problem
There is an existing parallel in human IAM that makes this easy to understand.
When systems require approval for too many actions, users experience decision fatigue. The threshold for scrutiny drops, people start auto-approving, and the oversight mechanism becomes theater. Johannes observed this in his own behavior with AI coding agents: early on, he read every suggested action carefully. A few weeks in, he found himself approving without reading.
Herman's term for the agent version is authorization fatigue. Agents are already performing so many actions per day that approval prompts are becoming noise. Users find workarounds, ignore them, or set up auto-approval flows. Some attack patterns specifically exploit this by inserting prompt injections designed to get past human-in-the-loop checks.
The solution is not to remove human oversight. It is to make human oversight selective: only escalate what the system genuinely cannot resolve. That requires the context-accumulation layer described above. Without it, the AARM system cannot distinguish between a routine file read and an agent that has drifted off-task and is trying to access something it should not.
Why This Needs to Be a Cross-Industry Effort
Herman made a point here that anyone who has worked in enterprise security will find uncomfortably familiar.
The security tooling market has historically fragmented around every new threat category. Signature-based antimalware, behavioral detection, anomaly detection, SIEM, XDR: each solved a real problem and each became its own silo. The result is that the average security team's tech stack is not three or four tools. It is sixty.
Without a shared specification for what an AARM system must do, the same fragmentation will happen with AI agent authorization. Every vendor will define the problem slightly differently. Every implementation will have different coverage gaps. Security teams will end up buying five overlapping products and still not have full visibility.
Herman's argument for a collaborative spec is pragmatic: whoever gets there first with a working solution will define what the category looks like. By publishing AARM as an open specification and inviting builders, CISOs, and startups to contribute, the goal is to shape that definition toward user needs before competitive dynamics lock in a fragmented outcome.
"If we don't act as a peer group and define what this tooling should do, we're going to end up buying 60 tools to manage AI agents at runtime." - Herman Errico, Vanta
What to Expect: Breaches, Fines, and Insurance Premiums
The second half of the conversation turned to what comes next, and Herman did not pull his punches.
The current wave of AI agent deployments is still largely in proof-of-concept or early-adoption territory. The incidents so far have been small-scale. That is changing quickly. CEOs across industries are pushing for AI adoption metrics, and some organizations are tying employee performance to AI usage. When that pressure meets agents with over-broad permissions and no runtime authorization layer, the conditions for large-scale compromise are already in place.
Herman laid out a specific scenario: an agent starts acting in a way that compromises user data. The company loses customer trust overnight. For organizations in regulated markets, that kind of incident can translate directly into market share loss and regulatory action. We are not far from seeing an agent-related incident that has a material impact on a public company's valuation.
Two other signals are worth watching:
- Regulatory fines for organizations that cannot demonstrate AI agent security controls. The regulatory frameworks are still forming, but the direction of travel is clear.
- Cyber insurance premiums increasing for companies without AI agent observability and authorization tooling in place. Insurers price what they can measure. Right now they cannot measure AI agent risk well. When they can, the pricing will reflect it.
The Consultant Analogy: Why Static Credentials Miss the Point
Towards the end of the session, Johannes proposed an analogy: the consultant with a CEO badge.
Imagine a large company hires an external consultant. The CEO hands them a badge and a task. That badge lets the consultant walk into almost any room in the building. If a security guard questions them, they flash the CEO badge and the guard waves them through. The consultant has broad access because of who authorized them, not because of what they actually need for the specific task at hand.
That is exactly how most AI agent credentials work today. The agent gets an API key or OAuth credentials with broad permissions. When it calls a tool, the tool sees valid credentials and allows the request. There is no check on whether the specific action the agent is taking right now is within the scope of what it was actually asked to do.
The fix that AARM proposes is a shift from identity-based to job-based authorization: instead of "this agent has admin credentials," the question becomes "what is this agent supposed to be doing in this session, and does this action fit within that scope?"
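The contrast can be reduced to two checks. Both functions, and the session shape they assume, are hypothetical; the point is only that the identity check consults the credential while the job check consults the session's declared scope:

```python
# Identity-based vs job-based authorization, sketched. Names and
# scopes are invented for illustration.
def identity_check(credential, action):
    # The "CEO badge": any action passes because the credential is valid.
    return credential["valid"]

def job_check(session, action):
    # The action must fit what this session was actually asked to do.
    return action["tool"] in session["allowed_tools"]

cred = {"valid": True, "role": "admin"}
session = {"task": "summarize support tickets",
           "allowed_tools": {"read_ticket", "write_summary"}}
```

Under the identity check, a valid admin credential authorizes everything; under the job check, the same agent can read tickets but cannot, say, delete a user, because that was never part of the job.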
What Security and IT Teams Should Do Now
The AARM specification is still forming, and no single product fully implements it yet. That does not mean security teams should wait.
A few concrete starting points from the conversation:
Audit your current agent credentials. Most organizations have no clear inventory of which agents have been deployed, what credentials they are using, or what scopes those credentials include. Start there. Long-lived (borrowed) human credentials with admin access, attached to agents that nobody is actively monitoring, are a meaningful and immediate risk.
Treat tool execution as your security boundary, not the model. The tendency is to focus on LLM behavior: what the model says, whether it follows instructions, whether its outputs are appropriate. The higher-risk surface is what the model does through its tools. That is where irreversible actions happen.
Build for context accumulation now. Even without a full AARM implementation, teams building or buying agent infrastructure should be capturing session context and chain-of-thought data. Without that data, any future runtime authorization layer will be flying blind.
Watch the open specification. The AARM spec is actively evolving with input from builders and CISOs. Following it costs nothing and gives security teams an early signal on where the category is heading.
Join The Agent Access Management Waitlist

If the problems covered in this session sound familiar, Cakewalk's Agent Access Management is built to address them directly.
Cakewalk helps security teams govern and provision access for AI agents and human identities, implementing policy-first access at runtime and a complete audit trail for every agent action.
Join the waitlist to get early access and shape what gets built.
