
Silent Drift: How LLMs Are Quietly Breaking Organizational Access Control

LLMs can write complex Rego and Cedar code in seconds, but a single missing condition or hallucinated attribute can quietly dismantle your organization’s least-privilege security model.


Business efficiency demands maximum use of AI assistance, but where policy as code is concerned, AI can introduce serious policy flaws.

The shift to policy as code for organizational security, compliance, and operational rules is being followed by increased use of large language models (LLMs) to help produce the raw code. This makes sense. A primary purpose of AI within business is to improve human efficiency, and writing policy in languages like Rego or Cedar is not easy. AI is increasingly used to streamline the process.

But there is a problem. These generated policies often look correct, compile successfully, and still grant the wrong access. This shouldn’t be a complete surprise: AI-generated applications are already known to introduce security issues by choosing the simplest solution over the most secure one. But security flaws in an organizational policy that is itself designed to prevent security issues are especially problematic.

Vatsal Gupta, an independent researcher and senior security engineer at Apple, has been examining these issues and discussed them with SecurityWeek. “LLMs are being introduced into engineering workflows. Developers are using them to generate infrastructure code, security rules, and now even access control policies,” he says.

The appeal is obvious. “Instead of writing policy logic manually, teams can describe intent in plain language and let the model generate the enforcement logic.”

But it doesn’t always work that way. “LLM-generated policies are often syntactically valid but semantically incorrect,” continues Vatsal. “One missing condition, a misinterpreted attribute, or an incorrect action can completely redefine who gets access to what.”

These are not obvious failures. They don’t break builds or trigger alerts. But they quietly expand access boundaries. And Vatsal’s research has found various recurring failure patterns.

A common issue, he tells us, is missing contextual constraints. “A policy that is supposed to limit access based on region, department, or ownership may omit that condition entirely. The generated policy still looks clean and valid, but it now applies globally instead of within the intended scope.”
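As a minimal Rego sketch of this failure (the attribute names, such as `input.user.region`, are hypothetical, not from any specific system), one dropped line is all it takes:

```rego
package authz

default allow := false

# Intended rule: engineers may read resources in their own region.
allow if {
    input.user.role == "engineer"
    input.action == "read"
    input.user.region == input.resource.region  # the scoping condition an LLM may omit
}

# A generated variant that drops the last condition still compiles and
# evaluates cleanly, but grants read access globally rather than per-region.
```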

A second, he continues, is missing deny logic. “Many access control policies rely on a baseline deny posture with specific exceptions. LLMs often capture the exception but fail to encode the underlying restriction. The result is a policy that allows more than intended, even though it appears to implement the requirement.”
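A sketch of the same pattern in Rego (again with hypothetical attributes): the model captures the exception but not the baseline restriction it was meant to carve out of.

```rego
package authz

# Baseline restriction: contractors are denied outside staging.
deny if {
    input.user.type == "contractor"
    input.resource.env != "staging"
}

# Final decision: allowed only if permitted AND no deny rule fires.
allow if {
    permitted
    not deny
}

permitted if {
    input.action == "read"
}

# A generated policy that keeps `permitted` but omits the `deny` rule
# (or the `not deny` check) still appears to implement the requirement,
# yet allows contractors to read everywhere.
```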

Then there’s the standard recurring problem with LLMs — the potential to hallucinate. “Models sometimes introduce attributes that do not exist in the actual system schema. The policy compiles, but at runtime it behaves unpredictably because it relies on data that is not present or incorrectly mapped.”
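In Rego this failure mode can be silent rather than noisy: referencing a nonexistent attribute makes the rule body undefined, so a restriction built on a hallucinated field simply never fires. A sketch, where `clearance_level` stands in for an attribute the model invented:

```rego
package authz

default allow := false

allow if {
    input.user.role == "analyst"
    not restricted
}

# "clearance_level" does not exist in the real input schema. The policy
# compiles, but this body is always undefined, so `restricted` never
# evaluates to true and the restriction is silently dead: every analyst
# is allowed regardless of clearance.
restricted if {
    input.user.clearance_level < 3
}
```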

Temporal and contextual conditions are frequently dropped. “Policies that depend on time windows, approvals, or session context are simplified into static rules. What was meant to be controlled, time-bound access becomes always-on access.”
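A sketch of the temporal case, using OPA’s built-in `time.now_ns()` and hypothetical grant fields:

```rego
package authz

default allow := false

# Intended: on-call engineers get access only within an approved window.
allow if {
    input.user.role == "oncall"
    time.now_ns() >= input.grant.start_ns
    time.now_ns() < input.grant.expires_ns
}

# A generated version that simplifies the rule to the role check alone
# turns time-bound, break-glass access into standing access.
```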

And the last concern: “Even action misclassification can occur. A policy intended to restrict a sensitive action like deletion may be translated into a broader or different operation. The difference may be small in wording, but large in impact.”
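Sketched in Rego, the wording difference really is one line (action names here are illustrative):

```rego
package authz

default allow := false

# Intent: only admins may delete; everyone may read.
allow if {
    input.user.role == "admin"
    input.action == "delete"
}

allow if {
    input.action == "read"
}

# A misclassified variant that matches a broader action set in the admin
# rule, e.g. `input.action in {"read", "write", "delete"}`, is a one-line
# change that widens, rather than restricts, the sensitive operation.
```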

All these failings are natural outcomes of AI’s tendency to interpret and simplify language. The result can be a policy that looks good, feels good, and tastes good, but simply isn’t good. And detecting that it isn’t is difficult.

Over time, these small deviations accumulate. Policies are no longer static artifacts reviewed occasionally; they are generated, updated, and deployed continuously. “As more policies are generated, deployed, and reused, the risk compounds,” continues Vatsal. Organizations may believe they are enforcing least privilege while actually drifting toward over-permissioned environments.

“If the generation process is not reliable, the risk becomes systemic,” he adds. “Organizations may end up with thousands of subtly flawed policies. Each flaw may be individually small, but collectively they create a large and difficult-to-understand attack surface.”

The solution, he says, is not to abandon LLMs but to change our trust model, especially where policy is concerned. “Generated policies should not be treated as correct by default; validation layers between generation and enforcement should be introduced to ensure all required components are present, correct, and consistent with expected behavior; policies should be tested, not just compiled; and deny-by-default principles should be enforced explicitly.”
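One way to act on “tested, not just compiled” is OPA’s built-in unit testing, run with `opa test`. A minimal sketch against the hypothetical `authz` package from the examples above, where the tests assert what must be refused, not just what is allowed:

```rego
package authz_test

import data.authz

# A request outside the intended scope must be refused. This catches the
# dropped-scoping-condition failure mode, which compiling alone never will.
test_cross_region_read_denied if {
    not authz.allow with input as {
        "user": {"role": "engineer", "region": "eu"},
        "action": "read",
        "resource": {"region": "us"}
    }
}

# Deny-by-default: an empty or unrecognized request must evaluate to false.
test_unknown_request_denied if {
    not authz.allow with input as {}
}
```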

Most importantly, he adds, “Organizations need to treat authorization logic as a high-risk domain.” Just because a model can generate code does not mean that code is safe to deploy without scrutiny.

“As we move toward AI-assisted security engineering, the goal should not just be automation. It should be correctness, auditability, and trust, because in authorization, ‘almost correct’ isn’t good enough,” Vatsal told SecurityWeek.

Learn More at the AI Risk Summit

Related: Vibe Coding Tested: AI Agents Nail SQLi but Fail Miserably on Security Controls

Related: Vibe Coding: When Everyone’s a Developer, Who Secures the Code?

Related: Groucho’s Wit, Cloud Complexity, and the Case for Consistent Security Policy

Related: How to Eliminate the Technical Debt of Insecure AI-Assisted Software Development
