Introduction
I’ve been building a lot of autonomous AI agents lately. It’s incredibly fun, until you realize a terrifying fact about the current ecosystem: standard AI SDKs offer a false sense of security.
Frameworks like LangChain or AutoGen are phenomenal orchestrators. They let you define explicit tools (like a calculator or databaseSearch) and wrap them in neat execution contexts. But what happens if your agent gets prompt-injected and decides to bypass your tools entirely? What if a hallucinating LLM just figures out how to write JavaScript that calls require('node:fs').readFileSync('.env') directly?
Nothing stops it. It's not a bug in the SDK; it's a gap in Node.js itself.
I know the purist answer: "Just migrate to Deno or Bun, they have native --allow-read permissions!" And they are right. If you control your runtime from scratch, you should use them. But for the 90% of us stuck maintaining existing Node.js monorepos, a massive migration isn't an option. We need a pragmatic seatbelt.
So, I built one. Here is how I used AsyncLocalStorage and runtime monkey-patching to build an open-source flight recorder for AI agents.
The Architecture: APM for AI Agents
I realized the problem wasn't exactly new. Companies like Datadog and New Relic have been tracking deeply nested asynchronous executions for years using Application Performance Monitoring (APM). I just needed to apply that exact same architecture to an LLM execution loop.
I broke the problem down into two parts:
Context Isolation: How do I know which agent made the file system call?
Global Interception: How do I actually catch and block the raw Node.js system calls without breaking the rest of the application?
Context Isolation with AsyncLocalStorage
If you haven't used AsyncLocalStorage (ALS) from the node:async_hooks module, it is essentially thread-local storage for asynchronous operations.
When you start an agent run, you wrap it in an ALS context and pass it a "Policy Engine" and a "Receipt." Any function called downstream, no matter how many promises it chains through, can access that context.
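If ALS is new to you, here is a tiny standalone demo (independent of ReceiptBot) showing the property that makes all of this work: a store set by run() stays visible across awaits in the whole async call tree, while code outside the context sees nothing.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Each store identifies the "agent" that owns the current async tree
const als = new AsyncLocalStorage<{ agentId: string }>();

async function deeplyNested(): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 1));
  // Even after an await, the store set by the enclosing run() is visible
  return als.getStore()?.agentId ?? 'no-context';
}

async function demo(): Promise<[string, string]> {
  const inside = await als.run({ agentId: 'agent-42' }, deeplyNested);
  const outside = await deeplyNested(); // no run() wrapper, no store
  return [inside, outside]; // ['agent-42', 'no-context']
}
```

This is exactly why no explicit "agent ID" parameter has to be threaded through every function call.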
Here’s a simplified sketch of what ReceiptBot does internally (the library hides the AsyncLocalStorage store behind runWithInterceptors).
import { AsyncLocalStorage } from 'node:async_hooks';
import type { PolicyEngine, Receipt } from '@receiptbot/core';
// This holds the state for the current async execution tree
export const context = new AsyncLocalStorage<{ policy: PolicyEngine; receipt: Receipt }>();
// Simplified internal sketch (ReceiptBot’s public API exposes runWithInterceptors, not the ALS store)
export async function runWithInterceptors(policy: PolicyEngine, receipt: Receipt, agentFn: () => Promise<any>) {
  // (Global monkey-patches are applied here)
  return context.run({ policy, receipt }, async () => {
    return await agentFn();
  });
}
Now, even if a rogue dependency nested five layers deep tries to read a file, ReceiptBot can look up the current ALS store and know which policy applies.
Runtime Monkey-Patching Node Core
To stop the agent from reading secrets or making rogue network requests, I needed a global interceptor. Using module.createRequire, the tool monkey-patches Node's core modules (fs, http, child_process, net, tls) at runtime.
During initialization, it replaces the original functions with wrappers. Here is a simplified look at how the fs.readFileSync patch works:
import fs from 'node:fs';
import { PolicyViolationError } from '@receiptbot/core';
// `context` is the AsyncLocalStorage instance created earlier

const originalReadFileSync = fs.readFileSync;

fs.readFileSync = function (...args) {
  const ctx = context.getStore(); // Check if we are inside an agent run
  if (ctx) {
    // ReceiptBot records the attempt FIRST; policy evaluation happens inside addEvent()
    const event = ctx.receipt.addEvent({
      type: 'tool.fs',
      action: `fs.readFileSync("${String(args[0])}")`,
      payload: { op: 'readFile', path: String(args[0]) },
    });

    // If the policy engine flagged it, kill the execution
    if (event.status === 'BLOCKED_BY_POLICY') {
      throw new PolicyViolationError('tool.fs', event.action, event.policyTrigger ?? 'Policy violation');
    }
  }

  // Execute the original function if allowed
  return originalReadFileSync.apply(this, args);
};
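To see the pattern without the library, here is a minimal, self-contained version of the same idea: an ALS-scoped deny list wrapped around fs.readFileSync. The names here (guard, runGuarded, denyPaths) are my own illustration, not ReceiptBot's API.

```typescript
import fs from 'node:fs';
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical names for illustration; this is not ReceiptBot's actual API.
const guard = new AsyncLocalStorage<{ denyPaths: string[] }>();
const originalReadFileSync = fs.readFileSync;

// Replace readFileSync with a wrapper that consults the current async context
(fs as any).readFileSync = function (this: unknown, ...args: any[]) {
  const ctx = guard.getStore();
  const path = String(args[0]);
  if (ctx && ctx.denyPaths.some((deny) => path.endsWith(deny))) {
    throw new Error(`Blocked read of ${path}`);
  }
  // Outside a guarded context (or when allowed), behave exactly like the original
  return (originalReadFileSync as any).apply(this, args);
};

function runGuarded<T>(denyPaths: string[], fn: () => T): T {
  return guard.run({ denyPaths }, fn);
}
```

Inside runGuarded(['.env'], ...), a read of any path ending in .env throws; the same call outside the context passes straight through to the original function, so the rest of the application is unaffected.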
Hard-Stops, Cost Caps, and Redaction
Security isn't just about file access; it's about your API budget. A common failure mode for autonomous agents is getting stuck in a while(true) loop of hallucination, racking up a massive OpenAI API bill overnight.
While the network interceptor (http/fetch/net) is great for enforcing URL domain blocklists, calculating tokens natively at the network layer is messy. Instead, the Policy Engine allows you to enforce a hard budget cap:
const policy = new PolicyEngine()
  .denyPathGlobs(['**/.env'])
  .maxCost(1.00); // Hard stop at $1.00
When an LLM API call happens (either via a framework adapter or manually emitted as an llm.call event), it includes a costImpactUsd property. The Policy Engine validates the running total on every one of these events. The moment the next call would push the total over $1.00, it throws a PolicyViolationError and kills the execution path.
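The running-total check itself fits in a few lines. This sketch is my reconstruction of the behavior described above, not ReceiptBot's internal code:

```typescript
// Illustrative sketch of a hard budget cap; not ReceiptBot's internals.
class CostCap {
  private totalUsd = 0;

  constructor(private readonly maxUsd: number) {}

  // Called once per llm.call event with that event's costImpactUsd
  record(costImpactUsd: number): void {
    if (this.totalUsd + costImpactUsd > this.maxUsd) {
      throw new Error(
        `Budget exceeded: $${(this.totalUsd + costImpactUsd).toFixed(2)} > $${this.maxUsd.toFixed(2)}`
      );
    }
    this.totalUsd += costImpactUsd;
  }
}
```

Note that the check runs before the total is updated, so the call that would cross the cap is rejected rather than merely reported after the fact.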
Finally, before any logs are written to the JSON receipt, the tool runs a redaction pass. It uses regex patterns to catch AWS keys, Stripe tokens, and OpenAI keys, replacing them with labeled markers like [REDACTED_OPENAI_API_KEY] so your audit logs don't become a new security vulnerability.
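A redaction pass of this kind is straightforward to sketch. The patterns below are my own approximations of the common key formats, not the library's actual rule set:

```typescript
// Approximate key formats for illustration; not ReceiptBot's actual patterns.
const REDACTIONS: Array<[RegExp, string]> = [
  [/AKIA[0-9A-Z]{16}/g, '[REDACTED_AWS_ACCESS_KEY_ID]'],
  [/sk_live_[A-Za-z0-9]{16,}/g, '[REDACTED_STRIPE_SECRET_KEY]'],
  [/sk-[A-Za-z0-9_-]{20,}/g, '[REDACTED_OPENAI_API_KEY]'],
];

function redact(text: string): string {
  // Apply every pattern in turn; the labels themselves never re-match
  return REDACTIONS.reduce((acc, [pattern, label]) => acc.replace(pattern, label), text);
}
```

Running this over every receipt entry before it is serialized keeps raw secrets out of the audit trail entirely.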
The Result: ReceiptBot
I packaged this architecture into an open-source tool called ReceiptBot.
It requires zero external infrastructure. It just sits quietly in your Node codebase, intercepts rogue system calls, and spits out a highly detailed, redacted JSON "receipt" of exactly what the agent did.
🧾 ReceiptBot
A Flight Recorder and Seatbelt for Node.js AI Agents.
Monkey-patching isn't a hard OS sandbox — ReceiptBot is not trying to be one. It's your in-process flight recorder: a structured audit trail of every I/O operation, a cost governor that cuts off runaway LLM loops, and a secret scrubber that runs before any log is written. All of it drops into your existing Node.js project in one function call.
What is ReceiptBot?
ReceiptBot is a runtime governance library for Node.js that wraps your AI agent's async execution context with:
- A Policy Engine — rules you define that block dangerous operations before they happen
- A Flight Recorder — an immutable, structured audit trail (a "receipt") of every action taken
- A Global Interceptor — monkey-patches raw Node.js core modules so even rogue third-party library calls are caught
I want to be fully transparent: this is not a perfect OS-level sandbox like a V8 Isolate. There are always edge cases with monkey-patching in JavaScript. But it covers the most common, dangerous escape hatches (direct node:fs, http, child_process, fetch) within the same process.
If you are building with LangChain, AutoGen, or just raw LLM calls in Node.js, and you want a pragmatic "seatbelt" to keep your .env files safe and your budget capped, I’d love for you to check it out.
I would love any brutal architectural feedback you have!
This article was originally published by DEV Community and written by vysh.