Introduction
I’ve been building a lot of autonomous AI agents lately. It’s incredibly fun, until you realize a terrifying fact about the current ecosystem: standard AI SDKs offer a false sense of security.
Frameworks like LangChain or AutoGen are phenomenal orchestrators. They let you define explicit tools (like a calculator or databaseSearch) and wrap them in neat execution contexts. But what happens if your agent gets prompt-injected and decides to bypass your tools entirely? What if a hallucinating LLM just figures out how to write JavaScript that calls require('node:fs').readFileSync('.env') directly?
Nothing stops it. It's not a bug in the SDK; it's a gap in Node.js itself.
I know the purist answer: "Just migrate to Deno or Bun, they have native --allow-read permissions!" And they are right. If you control your runtime from scratch, you should use them. But for the 90% of us stuck maintaining existing Node.js monorepos, a massive migration isn't an option. We need a pragmatic seatbelt.
So, I built one. Here is how I used AsyncLocalStorage and runtime monkey-patching to build an open-source flight recorder for AI agents.
The Architecture: APM for AI Agents
I realized the problem wasn't exactly new. Companies like Datadog and New Relic have been tracking deeply nested asynchronous executions for years using Application Performance Monitoring (APM). I just needed to apply that exact same architecture to an LLM execution loop.
I broke the problem down into two parts:
Context Isolation: How do I know which agent made the file system call?
Global Interception: How do I actually catch and block the raw Node.js system calls without breaking the rest of the application?
Context Isolation with AsyncLocalStorage
If you haven't used AsyncLocalStorage (ALS) from the node:async_hooks module, it is essentially thread-local storage for asynchronous operations.
When you start an agent run, you wrap it in an ALS context and pass it a "Policy Engine" and a "Receipt." Any function called downstream, no matter how many promises it chains through, can access that context.
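If ALS is new to you, here is a tiny standalone demo (independent of ReceiptBot) showing the property that makes all of this work: a store set by run() stays visible across awaits in the whole async call tree, while code outside the context sees nothing.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Each store identifies the "agent" that owns the current async tree
const als = new AsyncLocalStorage<{ agentId: string }>();

async function deeplyNested(): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 1));
  // Even after an await, the store set by the enclosing run() is visible
  return als.getStore()?.agentId ?? 'no-context';
}

async function demo(): Promise<[string, string]> {
  const inside = await als.run({ agentId: 'agent-42' }, deeplyNested);
  const outside = await deeplyNested(); // no run() wrapper, no store
  return [inside, outside]; // ['agent-42', 'no-context']
}
```

This is exactly why no explicit "agent ID" parameter has to be threaded through every function call.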
Here’s a simplified sketch of what ReceiptBot does internally (the library hides the AsyncLocalStorage store behind runWithInterceptors).
import { AsyncLocalStorage } from 'node:async_hooks';
import type { PolicyEngine, Receipt } from '@receiptbot/core';
// This holds the state for the current async execution tree
export const context = new AsyncLocalStorage<{ policy: PolicyEngine; receipt: Receipt }>();
// Simplified internal sketch (ReceiptBot’s public API exposes runWithInterceptors, not the ALS store)
export async function runWithInterceptors(policy: PolicyEngine, receipt: Receipt, agentFn: () => Promise<any>) {
  // (Global monkey-patches are applied here)
  return context.run({ policy, receipt }, async () => {
    return await agentFn();
  });
}
Now, even if a rogue dependency nested five layers deep tries to read a file, ReceiptBot can look up the current ALS store and know which policy applies.
Runtime Monkey-Patching Node Core
To stop the agent from reading secrets or making rogue network requests, I needed a global interceptor. Using module.createRequire, the tool monkey-patches Node's core modules (fs, http, child_process, net, tls) at runtime.
During initialization, it replaces the original functions with wrappers. Here is a simplified look at how the fs.readFileSync patch works:
import fs from 'node:fs';
import { PolicyViolationError } from '@receiptbot/core';
// `context` is the AsyncLocalStorage instance created earlier

const originalReadFileSync = fs.readFileSync;

fs.readFileSync = function (...args) {
  const ctx = context.getStore(); // Check if we are inside an agent run
  if (ctx) {
    // ReceiptBot records the attempt FIRST; policy evaluation happens inside addEvent()
    const event = ctx.receipt.addEvent({
      type: 'tool.fs',
      action: `fs.readFileSync("${String(args[0])}")`,
      payload: { op: 'readFile', path: String(args[0]) },
    });

    // If the policy engine flagged it, kill the execution
    if (event.status === 'BLOCKED_BY_POLICY') {
      throw new PolicyViolationError('tool.fs', event.action, event.policyTrigger ?? 'Policy violation');
    }
  }

  // Execute the original function if allowed
  return originalReadFileSync.apply(this, args);
};
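To see the pattern without the library, here is a minimal, self-contained version of the same idea: an ALS-scoped deny list wrapped around fs.readFileSync. The names here (guard, runGuarded, denyPaths) are my own illustration, not ReceiptBot's API.

```typescript
import fs from 'node:fs';
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical names for illustration; this is not ReceiptBot's actual API.
const guard = new AsyncLocalStorage<{ denyPaths: string[] }>();
const originalReadFileSync = fs.readFileSync;

// Replace readFileSync with a wrapper that consults the current async context
(fs as any).readFileSync = function (this: unknown, ...args: any[]) {
  const ctx = guard.getStore();
  const path = String(args[0]);
  if (ctx && ctx.denyPaths.some((deny) => path.endsWith(deny))) {
    throw new Error(`Blocked read of ${path}`);
  }
  // Outside a guarded context (or when allowed), behave exactly like the original
  return (originalReadFileSync as any).apply(this, args);
};

function runGuarded<T>(denyPaths: string[], fn: () => T): T {
  return guard.run({ denyPaths }, fn);
}
```

Inside runGuarded(['.env'], ...), a read of any path ending in .env throws; the same call outside the context passes straight through to the original function, so the rest of the application is unaffected.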
Hard-Stops, Cost Caps, and Redaction
Security isn't just about file access; it's about your API budget. A common failure mode for autonomous agents is getting stuck in a while(true) loop of hallucination, racking up a massive OpenAI API bill overnight.
While the network interceptor (http/fetch/net) is great for enforcing URL domain blocklists, calculating tokens natively at the network layer is messy. Instead, the Policy Engine allows you to enforce a hard budget cap:
const policy = new PolicyEngine()
  .denyPathGlobs(['**/.env'])
  .maxCost(1.00); // Hard stop at $1.00
When an LLM API call happens (either via a framework adapter or manually emitted as an llm.call event), it includes a costImpactUsd property. The Policy Engine validates the running total on every one of these events. The moment the next call would push the total over $1.00, it throws a PolicyViolationError and kills the execution path.
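The running-total check itself fits in a few lines. This sketch is my reconstruction of the behavior described above, not ReceiptBot's internal code:

```typescript
// Illustrative sketch of a hard budget cap; not ReceiptBot's internals.
class CostCap {
  private totalUsd = 0;

  constructor(private readonly maxUsd: number) {}

  // Called once per llm.call event with that event's costImpactUsd
  record(costImpactUsd: number): void {
    if (this.totalUsd + costImpactUsd > this.maxUsd) {
      throw new Error(
        `Budget exceeded: $${(this.totalUsd + costImpactUsd).toFixed(2)} > $${this.maxUsd.toFixed(2)}`
      );
    }
    this.totalUsd += costImpactUsd;
  }
}
```

Note that the check runs before the total is updated, so the call that would cross the cap is rejected rather than merely reported after the fact.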
Finally, before any logs are written to the JSON receipt, the tool runs a redaction pass. It uses regex patterns to catch AWS keys, Stripe tokens, and OpenAI keys, replacing them with labeled markers like [REDACTED_OPENAI_API_KEY] so your audit logs don't become a new security vulnerability.
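A redaction pass of this kind is straightforward to sketch. The patterns below are my own approximations of the common key formats, not the library's actual rule set:

```typescript
// Approximate key formats for illustration; not ReceiptBot's actual patterns.
const REDACTIONS: Array<[RegExp, string]> = [
  [/AKIA[0-9A-Z]{16}/g, '[REDACTED_AWS_ACCESS_KEY_ID]'],
  [/sk_live_[A-Za-z0-9]{16,}/g, '[REDACTED_STRIPE_SECRET_KEY]'],
  [/sk-[A-Za-z0-9_-]{20,}/g, '[REDACTED_OPENAI_API_KEY]'],
];

function redact(text: string): string {
  // Apply every pattern in turn; the labels themselves never re-match
  return REDACTIONS.reduce((acc, [pattern, label]) => acc.replace(pattern, label), text);
}
```

Running this over every receipt entry before it is serialized keeps raw secrets out of the audit trail entirely.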
The Result: ReceiptBot
I packaged this architecture into an open-source tool called ReceiptBot.
It requires zero external infrastructure. It just sits quietly in your Node codebase, intercepts rogue system calls, and spits out a highly detailed, redacted JSON "receipt" of exactly what the agent did.
🧾 ReceiptBot
A Flight Recorder and Seatbelt for Node.js AI Agents.
Monkey-patching isn't a hard OS sandbox — ReceiptBot is not trying to be one. It's your in-process flight recorder: a structured audit trail of every I/O operation, a cost governor that cuts off runaway LLM loops, and a secret scrubber that runs before any log is written. All of it drops into your existing Node.js project in one function call.
What is ReceiptBot?
ReceiptBot is a runtime governance library for Node.js that wraps your AI agent's async execution context with:
- A Policy Engine — rules you define that block dangerous operations before they happen
- A Flight Recorder — an immutable, structured audit trail (a "receipt") of every action taken
- A Global Interceptor — monkey-patches raw Node.js core modules so even rogue third-party library calls are caught
I want to be fully transparent: this is not a perfect OS-level sandbox like a V8 Isolate. There are always edge cases with monkey-patching in JavaScript. But it covers the most common, dangerous escape hatches (direct node:fs, http, child_process, fetch) within the same process.
If you are building with LangChain, AutoGen, or just raw LLM calls in Node.js, and you want a pragmatic "seatbelt" to keep your .env files safe and your budget capped, I’d love for you to check it out.
I would love any brutal architectural feedback you have!
This article was originally published by DEV Community and written by vysh.