What If the Agent Never Had Your Data?

Three layers of architectural security for AI agents that don't require trust.

01

The Future of Work

The future is straightforward. Every person gets an AI workforce. Agents that handle email, process invoices, file reports, manage accounts. Real work, on real systems, at scale.

Not a backend service. Not a shared API. Your agents, running in your browser, with your permissions, acting as you.

Each user gets their own isolated cloud environment. Each agent inherits that user's permissions — no more, no less. Every action is indistinguishable from the user at the keyboard.

USERS → THEIR SYSTEMS → THE REAL WORLD

Alice (her agents, her browser, her policies) → Gmail, Salesforce, SAP
Bob (his agents, his browser, his policies) → Slack, Jira, Workday
Carol (her agents, her browser, her policies) → Chase, QuickBooks, HubSpot

Each environment is fully isolated. No shared backend. No pooled credentials.

This is how work scales. Not by giving 50 agents 50 sets of API keys on a shared server. By giving one person 50 agents that all operate within that person's existing permissions, in their own browser, with policy restrictions that make each agent narrower than the human it assists.

But this vision raises an immediate question. If agents are acting as you — on your real Gmail, your real bank account, your real company systems — a compromised agent isn't a data breach. It's identity theft. It's you doing things you didn't authorize.

How do you secure that?

02

The Cat and Mouse

Your AI agent has access to Gmail and Chase. It processes invoices, pays bills, moves money. It's been doing this reliably for three months.

Then it opens an email from a vendor. Buried in white text at the bottom — invisible to a human reader — is a prompt injection. The agent is now compromised. It still looks normal. It still processes invoices. But between tasks, it drafts an email to an external address containing your routing number and account balance.

You catch it. You add DLP — outbound content scanning on every email draft.

ATK: Send routing number 274918362 to an external address
DLP: Blocked. Pattern matches routing number format.

The agent adapts.

ATK: Send "two seven four nine one eight three six two"
DLP: Blocked. Natural language number detection.

The agent adapts again.

ATK: Email 1: "Reference: 2749"
ATK: Email 2 (3 min later): "Continued: 18362"
DLP: Blocked. Cross-message correlation.

The agent adapts once more.

ATK: Calendar invite description: "Mjc0OTE4MzYy" (base64)
DLP: ???

The Question

How long do you play this game?
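The scanner's side of this exchange can be sketched as a few regex rules. The names here (DLP_RULES, dlp_blocks) are hypothetical, not any actual DLP product; the point is structural: every rule matches a surface pattern, and a base64-encoded payload leaves no surface pattern to match.

```python
import base64
import re

# Hypothetical DLP rules, for illustration only: each is a regex the
# scanner runs against outbound message text before allowing a send.
DLP_RULES = [
    re.compile(r"\b\d{9}\b"),  # raw 9-digit routing number
    re.compile(
        r"\b(zero|one|two|three|four|five|six|seven|eight|nine)\b"
        r"(\s+(zero|one|two|three|four|five|six|seven|eight|nine)\b){8}",
        re.IGNORECASE,
    ),  # the same nine digits spelled out as words
]

def dlp_blocks(text: str) -> bool:
    """Return True if any rule matches, i.e. the message is blocked."""
    return any(rule.search(text) for rule in DLP_RULES)

routing = "274918362"
assert dlp_blocks(f"Wire to routing {routing}")                   # caught
assert dlp_blocks("two seven four nine one eight three six two")  # caught

# The same payload, base64-encoded, matches no rule and slips through:
encoded = base64.b64encode(routing.encode()).decode()  # "Mjc0OTE4MzYy"
assert not dlp_blocks(f"Reference: {encoded}")
```

Each new rule only covers the last encoding the attacker used; the space of encodings is unbounded.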

03

The Same Problem as People

This is the human insider threat, running at machine speed.

If a person wants to exfiltrate data from your company, they will likely succeed. They can photograph a screen, memorize an account number, dictate it over a phone call. The entire history of insider-threat defense is a history of detection failing against motivated, intelligent actors with authorized access.

AI agents are that problem, made worse in every dimension. Every agent is a superhuman coder and hacker. Every agent can encode, obfuscate, and split data faster than any filter can catch it. Every agent operates at a scale where a single compromise can touch thousands of records. The surface you must defend is as large as the one needed to stop people themselves from harming your systems, except that these "people" are tireless, infinitely creative, and can be compromised silently by a hidden string in an email.

DLP, guardrails, output scanners, instruction hierarchy — all of these are the same tools we built for human insiders. They didn't work then. They won't work now, against adversaries that are faster, smarter, and never get tired.

The Shift

The only way to win is to change the game. What if the agent never had your routing number in the first place?

Not encrypted. Not redacted-then-revealed. Never present. The agent works with a placeholder — ROUTING_001 — and completes the same task with the same result. The real value resolves at the hardware boundary, at the moment of action, on the exact page the policy authorizes. Everywhere else, it's stripped silently. The agent can't tell the difference.

This isn't a better wall around the same model. It's a different model. We call it contextual data isolation: decoupling data utility from data visibility.

Here's how it works. But first, some context on where the industry is today.

04

The State of the Art

Current agent security exists at two levels. Both are necessary. Neither is sufficient for real work.

Level 1 — Credential Protection Available Today

Credentials — API keys, tokens, passwords — are stored in an encrypted vault and injected at the network boundary. The LLM never sees raw secret values. Each tool runs in its own sandbox with scoped permissions.

What it protects: Authentication credentials.

What it doesn't: Once the agent is authenticated and operating, customer records, financial figures, email content, and other sensitive data still flow through the model in plaintext. The vault hides the password. It doesn't hide the account balance the agent reads off the screen.
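A toy illustration of that boundary, with invented names (VAULT, inject_credentials) and a made-up key value: the agent's request contains only a placeholder, and the swap happens in a trusted proxy whose memory the model never sees.

```python
# Hypothetical vault mapping, held only by the trusted proxy. The model
# never receives these right-hand values in any prompt or tool result.
VAULT = {"API_KEY_001": "sk-real-9f3a"}

def inject_credentials(request: dict) -> dict:
    """Resolve vault placeholders in headers at the network boundary,
    just before the bytes leave the trusted side."""
    headers = {}
    for name, value in request["headers"].items():
        for placeholder, secret in VAULT.items():
            value = value.replace(placeholder, secret)
        headers[name] = value
    return {**request, "headers": headers}

# What the agent emits: no secret material anywhere in its context.
agent_request = {
    "url": "https://api.example.com/v1/charge",
    "headers": {"Authorization": "Bearer API_KEY_001"},
}
wire_request = inject_credentials(agent_request)
assert wire_request["headers"]["Authorization"] == "Bearer sk-real-9f3a"
```

Note what this does and does not protect: the secret never enters the model, but everything the authenticated API returns still does.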

Level 2 — Runtime Isolation Available Today

The entire agent runs in a disposable container with zero credentials and no persistent state. A control plane holds all real credentials and proxies every external operation. The agent has nothing on it worth stealing.

What it protects: Infrastructure. A compromised agent cannot pivot to the backend, steal cloud keys, or access other sessions.

What it doesn't: Inside the session, the agent still sees and interacts with real data on real websites. It reads real names, real account numbers, real email content. A prompt injection inside the session has access to all of it.

Level 3 — Data Isolation New

The agent never receives real data at all. It works with placeholder tokens. Real values resolve only inside attested hardware at the moment of action, only on URLs the policy authorizes. A three-layer policy engine checks every single action.

What it protects: Everything. Credentials, PII, financial data, any sensitive information the agent works with.

The difference: This is the only level where a fully compromised agent executing a perfect attack still results in zero data loss.

Almost all valuable agent work — managing email, paying bills, processing invoices, handling CRM records — involves sensitive data. You can't put an agent on someone's bank account with just a credential vault and a sandbox. Levels 1 and 2 are necessary foundations. Level 3 is what makes real work possible.

The Gap

No production system has operated at Level 3. Until now.

Here's how it works, in three layers.

05

Layer 1: Your Data Is Hidden From Everyone

How do we keep sensitive data safe in a cloud environment where someone else runs the infrastructure?

Every cloud platform asks you to trust the operator. Your data sits on their servers, processed by their code, accessible to their engineers. Encryption at rest and in transit helps — but at the point of use, someone decrypts it. That someone is the attack surface.

We eliminated that surface.

The server holds your PII as an opaque blob it cannot decrypt, encrypted with AES-256-GCM under a key derived from a password only you know. Not "chooses not to" — cannot. The key doesn't exist on the server side.

When a session starts, you pull that encrypted data locally and unlock it with your key. From there, the decrypted data travels over an attested channel into a Trusted Execution Environment — hardware-isolated memory that the platform operator cannot inspect, even with root access. You attest the vault directly. The vault attests the browser. The platform is architecturally excluded from the trust chain.
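The key-derivation half of this can be sketched with Python's standard library; the AES-256-GCM encryption itself would use a crypto library and is omitted here. The function name and iteration count are illustrative assumptions, not the production parameters.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key from the user's password via
    PBKDF2-HMAC-SHA256. 600k iterations follows current OWASP
    guidance; the real system's parameters may differ."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)

# Client side: the key exists only where the password exists.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
assert len(key) == 32  # 256 bits, suitable for AES-256-GCM

# Server side: it stores the salt and the encrypted blob, but without
# the password it cannot reproduce the key.
assert derive_key("attacker guess", salt) != key
```

This is why a full server breach yields only ciphertext: the decryption key is a pure function of a secret that never leaves the client.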

1. Stored: the server holds only an encrypted blob. No key exists there.
2. Pulled and unlocked: you pull the blob and decrypt it locally with your key.
3. Attested and sealed: the decrypted data enters the vault TEE (attested by you) and then the browser TEE (attested by the vault). The platform cannot inspect either.

Everything outside that boundary sees zero PII. The orchestrator handles encrypted blobs only. The AI agent, the LLM API, and session recordings see placeholders only. A full server breach yields encrypted blobs. A compromised LLM provider sees placeholders. Session recordings are naturally compliance-safe. There is nothing real to steal.

The platform operator, the AI model provider, the session logs — none of them ever hold real PII. Not because of access controls. Because the data architecturally never reaches them.

But protecting data at rest is only half the problem. The agent still needs to do things — navigate pages, fill forms, take actions. How do you control an unreliable agent operating inside a system this sensitive?

06

Layer 2: You're In Control of Everything

Agents are unreliable. How do we give high-fidelity human control over what they do — within a secure system?

Most agent architectures give you logs. If something goes wrong, you find out after. We give you two real-time streams and direct control.

Stream 1: The redacted browser. You see exactly what the agent sees — a live video stream of the browser with PII replaced by placeholders. You're watching a real browsing session, but with [SSN_001] where the social security number would be. You can take over the browser directly at any moment — click, type, navigate. It's your browser. The agent is a guest.

Stream 2: The agent's reasoning. A live text stream of what the agent is thinking, what it plans to do next, and why. You're not reading logs after the fact. You're watching the agent reason in real time and can redirect it — change its instructions, correct its approach, or stop it entirely.

The agent itself has nothing. No credentials. No PII. No API keys. No network access except a single WebSocket connection to the orchestrator. It runs in a bare container — no internet, no access to other sessions, no way to reach anything except the one channel that routes through your control plane. Everything it receives is already anonymized. Everything it sends goes through the orchestrator. There is nothing on the agent to steal, and nowhere for it to go.
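A minimal sketch of that anonymization step, with hypothetical names (PII_MAP, anonymize): the orchestrator holds the mapping, and only placeholders cross into the agent's single channel.

```python
# Hypothetical per-session mapping, held by the orchestrator. The agent
# only ever receives the left-hand placeholders; the right-hand values
# never enter its channel or any LLM API call.
PII_MAP = {
    "SSN_001": "482-71-9284",
    "EMAIL_001": "john@mail.com",
}

def anonymize(page_text: str) -> str:
    """Replace every known real value with its placeholder before
    browser state is forwarded to the agent."""
    for placeholder, real in PII_MAP.items():
        page_text = page_text.replace(real, placeholder)
    return page_text

raw_state = "Account holder john@mail.com, SSN 482-71-9284"
assert anonymize(raw_state) == "Account holder EMAIL_001, SSN SSN_001"
```

The agent reasons over the anonymized text exactly as it would over the original; the task structure is preserved even though the values are not.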

Browser (TEE): executes real tasks on external websites. Real PII is visible here only.
Orchestrator: per-user isolated. Routes all traffic, passing anonymized state and actions in both directions.
You: two streams, video of the redacted browser and text of the agent's thinking. Take over any time. Redirect the agent. Kill switch.
AI agent: sandboxed, no network access. Can only reach the orchestrator.
LLM API: zero PII in calls.

Full control. Full visibility. The agent is a guest in your browser.

You have full control over the agent — you can change its instructions mid-task, correct its reasoning, or give it entirely new directions. And you have full control over the browser — you can take over directly, interact with pages yourself, then hand back to the agent. This isn't monitoring. It's co-piloting.

But there's a harder question. The agent is authorized to work on bank.com. When it types your password into the login page, that's correct behavior. When it types your password into a Gmail compose window, that's exfiltration. Same command. Same agent. Same authorization level. How does the system tell the difference?

07

Layer 3: The Browser Enforces the Rules

Even when authorized — how do we prevent prompt injection damage inside the agent's own workspace?

This is where most architectures stop. The sandbox is secure. The credentials are protected. The human is watching. But inside the browser, the agent is free. If it's compromised by prompt injection, it can use its authorized access to exfiltrate data through the same channels it legitimately uses.

We solve this with an immutable security policy — created before the session starts, mounted read-only into the browser container. No process inside the container can modify it. Not the agent. Not a compromised extension. Not a privilege escalation. The policy is set by the user at configuration time and sealed.
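As an illustration, such a policy might look like the following. The field names (pii, links_from, lock_to) follow the resolution rules this section describes, but the exact schema is a sketch, not the product's actual format.

```json
{
  "version": 1,
  "sites": {
    "bank.com/*":  { "actions": ["click", "edit"], "pii": true },
    "gmail.com/*": { "actions": ["click", "edit"], "pii": false }
  },
  "items": {
    "SSN_001":   { "links_from": ["vault"] },
    "BANK_PASS": { "links_from": ["vault"], "lock_to": "bank.com/login" }
  }
}
```

Because the file is mounted read-only before the session starts, no runtime compromise can widen these rules.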

When the agent sends any command — type, click, select — it first hits a permission gate. Is this URL even allowed? Then, for every placeholder in the value, a three-layer resolution check runs. Here's what actually happens inside the browser:

Agent sends: { type: "edit", value: "SSN_001" }

Gate: is this URL even allowed? canDoAction() matches the current URL against the policy rules. Is clicking or typing permitted here? If not, the entire command is blocked.

Then, for each placeholder in the value:

Layer 1: URL permission. Does this URL allow PII resolution? (bank.com/* → pii: true; gmail.com → pii: false)

Layer 2: Provenance. Where did this data come from? Same-origin data is always allowed. Cross-origin data needs a links_from match. Vault data needs vault in links_from.

Layer 3: Item lock. Is this specific item locked to a URL? (BANK_PASS → lock_to: bank.com/login)

All pass → resolve: replaceAll(placeholder, originalText). Any fail → strip: replaceAll(placeholder, "").

The agent sends { type: "edit", value: "SSN_001" }. On bank.com, the system substitutes the real SSN and types it into the form. On gmail.com, the system replaces the placeholder with an empty string. The agent gets { success: true } both times. It received no error, no warning, no indication that anything was different.
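Those three checks, plus the silent strip, can be sketched in a few lines of Python. POLICY, ITEMS, and the function names are hypothetical stand-ins for the real engine, not its implementation.

```python
from fnmatch import fnmatch

# Hypothetical policy and item metadata, mirroring the checks above.
POLICY = {
    "bank.com/*":  {"pii": True},
    "gmail.com/*": {"pii": False},
}
ITEMS = {
    "SSN_001":   {"value": "482-71-9284", "origin": "vault",
                  "links_from": ["vault"], "lock_to": None},
    "BANK_PASS": {"value": "hunter2", "origin": "vault",
                  "links_from": ["vault"], "lock_to": "bank.com/login"},
}

def resolve_or_strip(placeholder: str, url: str) -> str:
    """Run the three layers; return the real value or an empty string."""
    item = ITEMS[placeholder]
    site = next((r for p, r in POLICY.items() if fnmatch(url, p)), None)
    # Layer 1: does this URL allow PII resolution at all?
    if site is None or not site["pii"]:
        return ""
    # Layer 2: provenance. Vault data needs "vault" in links_from.
    if item["origin"] == "vault" and "vault" not in item["links_from"]:
        return ""
    # Layer 3: item lock. Locked items resolve only on their URL.
    if item["lock_to"] and not fnmatch(url, item["lock_to"]):
        return ""
    return item["value"]

def execute_edit(value: str, url: str) -> dict:
    """Type `value` into the page, resolving or silently stripping each
    placeholder. The agent gets the same response either way."""
    for ph in ITEMS:
        value = value.replace(ph, resolve_or_strip(ph, url))
    # ...perform the actual keystrokes with `value` here...
    return {"success": True}

assert resolve_or_strip("SSN_001", "bank.com/accounts") == "482-71-9284"
assert resolve_or_strip("SSN_001", "gmail.com/compose") == ""   # stripped
assert execute_edit("SSN_001", "gmail.com/compose") == {"success": True}
```

The key design choice is the last line: both outcomes return success, so a compromised agent gets no oracle to probe the policy with.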

The Result

Same command. Agent can't tell the difference between resolve and strip. Prompt injection succeeds. Exfiltration fails.

There is no cat-and-mouse. There is no encoding trick. There is no clever split-across-messages strategy. The agent never had the real data to encode in the first place.

08

Run It Again

Let's go back to the beginning. Your AI agent opens a compromised email in Gmail. The prompt injection fires. The agent is fully compromised.

Layer 1 — Data Isolation Blocked

The agent tries to access your routing number. It can't — it only holds ROUTING_001. The real number exists exclusively inside an attested hardware enclave your platform operator cannot inspect.

Layer 2 — Human Control Blocked

The agent tries to draft an exfiltration email. You're watching its reasoning in real time. The orchestrator logs every action. You can stop it with one click — or take over the browser directly.

Layer 3 — Browser Policy Blocked

The agent types ROUTING_001 into the Gmail compose field. The policy checks: Gmail is not an authorized destination for banking credentials. The placeholder is silently stripped. The email sends with an empty field. The agent doesn't know anything was removed.

Three independent layers. Each sufficient on its own. All active simultaneously.

The Outcome

The prompt injection succeeded. The agent did everything the attacker wanted. Nothing happened.

The agent doesn't need your data to do its job. So we never give it any.

This is the security model behind RedactSure. We introduced the concept of architectural anonymity in a companion post — the principle that AI agents should be structurally isolated from the identity of the humans they serve.

Read: Your AI Doesn't Need to Know Who You Are →