
Real-Time Secret Detection in Agentic Workflows: A TypeScript Runtime Guard Against Data Exfiltration

The Day My Agent Published a Stripe Key

Three weeks ago I'm reviewing a PR that my coding agent opened against our payments service. Looks fine. Clean diff, good commit message, tests pass. I approve it, it merges, and I move on with my day.

Forty minutes later our Stripe dashboard lights up with a fraud alert. Someone's making test charges against our account. I trace it back to the PR description. Right there, in plain text, the agent had included a snippet of our .env.local file as "context for reviewers." The full STRIPE_SECRET_KEY, our database connection string, and a SendGrid token. Public repo. Indexed by Google within the hour.

I rotated everything, scrubbed the git history, and sat there staring at my screen for a while.

The thing is, I'd been careful. The agent had scoped permissions. It couldn't push to main directly. It used a fine-grained GitHub token. I'd done everything the "secure your AI agent" blog posts tell you to do. None of that mattered because the threat model was wrong. The agent wasn't malicious. It was helpful. It read the env file to understand the service configuration, then included what it learned in its output because it thought that was useful context.

That's the fundamental problem. Agents don't exfiltrate secrets through exploits. They exfiltrate them through helpfulness.

How Agents Leak: A Taxonomy of Accidental Exposure

I spent the next week auditing every output my agents had produced over the previous month. The patterns were consistent and terrifying.

Direct inclusion in commit messages. The agent reads a config file to understand what service it's modifying, then references specific values in the commit message. "Updated the Redis connection to handle the new cluster at redis://admin:p4ssw0rd@prod-cluster.internal:6379."

PR descriptions and comments. Agents love providing context. They'll quote environment variables, paste snippets from .env files, or reference API keys they found while exploring the codebase.

Tool call arguments. This one's subtle. When an agent calls a tool like create_file or run_command, the arguments are often logged. If the agent constructs a curl command with an embedded token, that token ends up in your observability pipeline.

Chat responses in shared contexts. If you're running agents in a team environment where conversations are logged or shared, any secret the agent mentions in its reasoning becomes visible to everyone with access to that log.

Debug output. Agents asked to troubleshoot a failing connection will happily dump the full connection string, including credentials, into their diagnostic output.

The common thread: the agent has legitimate read access to secrets (it needs them to understand the system), but no concept of why those values should never appear in its outputs.

The Interceptor: Absol Pattern

I named the solution after Absol, the Pokémon that appears before disasters. It doesn't cause the problem. It detects and warns before damage occurs. The architecture is simple: a proxy layer that wraps every output function the agent can call. Nothing leaves the agent without passing through the interceptor.

interface OutputChannel {
  name: string;
  write: (content: string) => Promise<void>;
}

interface InterceptionResult {
  action: "allow" | "block" | "redact";
  original: string;
  sanitized?: string;
  detections: SecretDetection[];
}

interface SecretDetection {
  type: "pattern" | "entropy" | "reference";
  value: string;
  location: { start: number; end: number };
  confidence: number;
  pattern?: string;
}

class SecretInterceptor {
  private channels: Map<string, OutputChannel> = new Map();
  private detectors: SecretDetector[];
  private allowlist: Set<string>;
  private knownSecrets: Map<string, string>;

  constructor(config: InterceptorConfig) {
    this.detectors = [
      new PatternDetector(config.patterns),
      new EntropyDetector(config.entropyThreshold ?? 4.5),
      new ReferenceDetector(config.secretSources),
    ];
    this.allowlist = new Set(config.allowlist ?? []);
    this.knownSecrets = new Map();
  }

  wrap(channel: OutputChannel): OutputChannel {
    return {
      name: channel.name,
      write: async (content: string) => {
        const result = await this.intercept(content, channel.name);

        if (result.action === "block") {
          await this.handleBlock(result, channel.name);
          return;
        }

        const output = result.action === "redact"
          ? result.sanitized!
          : result.original;

        await channel.write(output);
      },
    };
  }

  // Not private: createGuardedAgent calls this directly on tool arguments and results
  async intercept(
    content: string,
    channelName: string
  ): Promise<InterceptionResult> {
    const detections: SecretDetection[] = [];

    for (const detector of this.detectors) {
      const found = await detector.scan(content);
      detections.push(...found);
    }

    const filtered = detections.filter(
      (d) => !this.allowlist.has(d.value) && d.confidence > 0.7
    );

    if (filtered.length === 0) {
      return { action: "allow", original: content, detections: [] };
    }

    const highConfidence = filtered.some((d) => d.confidence > 0.95);

    if (highConfidence) {
      return {
        action: "block",
        original: content,
        detections: filtered,
      };
    }

    return {
      action: "redact",
      original: content,
      sanitized: this.redact(content, filtered),
      detections: filtered,
    };
  }

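  private redact(content: string, detections: SecretDetection[]): string {
    // Merge overlapping spans first: two detectors often flag the same text,
    // and replacing one span would invalidate the other's offsets
    const sorted = [...detections].sort(
      (a, b) => a.location.start - b.location.start
    );
    const merged: SecretDetection[] = [];
    for (const d of sorted) {
      const last = merged[merged.length - 1];
      if (last && d.location.start < last.location.end) {
        last.location.end = Math.max(last.location.end, d.location.end);
      } else {
        merged.push({ ...d, location: { ...d.location } });
      }
    }

    // Process in reverse order so earlier positions stay correct
    let result = content;
    for (const detection of merged.reverse()) {
      const prefix = result.slice(0, detection.location.start);
      const suffix = result.slice(detection.location.end);
      const placeholder = `[REDACTED:${detection.type}]`;
      result = prefix + placeholder + suffix;
    }

    return result;
  }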

  private async handleBlock(
    result: InterceptionResult,
    channelName: string
  ): Promise<void> {
    console.error(
      `[SecretInterceptor] BLOCKED output on channel "${channelName}". ` +
      `Detected ${result.detections.length} potential secret(s).`
    );
    // In production: alert, log to SIEM, quarantine the agent session
  }
}

The key insight is the wrap method. You don't modify the agent or the output channels. You wrap the channels so the agent doesn't even know the interceptor exists. It calls the same functions with the same signatures. The interception is invisible.
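
To make the wrapping idea concrete, here is a stripped-down, self-contained sketch. The `guard` function and the toy `looksLikeSecret` check are stand-ins invented for this example (they are not the interceptor above), and the in-memory channel is a hypothetical substitute for something like "post a PR comment". The point is only that the caller keeps using the same `write` signature.

```typescript
interface OutputChannel {
  name: string;
  write: (content: string) => Promise<void>;
}

// Hypothetical stand-in for the full detector stack: flags Stripe-style keys only.
const looksLikeSecret = (text: string): boolean =>
  /sk_(live|test)_[0-9a-zA-Z]{24,}/.test(text);

// Returns a channel with the same shape that drops flagged writes.
function guard(channel: OutputChannel): OutputChannel {
  return {
    name: channel.name,
    write: async (content: string) => {
      if (looksLikeSecret(content)) return; // blocked: never reaches the real channel
      await channel.write(content);
    },
  };
}

// In-memory channel standing in for a real output such as a PR comment.
const posted: string[] = [];
const prComments: OutputChannel = {
  name: "pr:comment",
  write: async (content) => {
    posted.push(content);
  },
};

const guarded = guard(prComments);

// Writes one harmless message and one containing a fake, Stripe-shaped key.
async function demo(): Promise<string[]> {
  await guarded.write("Refactored the retry logic in the payments client.");
  await guarded.write("Context: STRIPE_SECRET_KEY=sk_test_" + "a".repeat(24));
  return posted;
}
```

After `demo()` resolves, only the first message has been posted. The agent-side code never needed to know which of the two channels it was holding.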

Detection: Three Layers Deep

A single detection method isn't enough. Pattern matching catches known formats but misses custom secrets. Entropy catches random strings but flags UUIDs and hashes that aren't secrets. Reference checking catches known values but can't detect secrets that were generated after the reference list was built. You need all three.
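
The snippets in this post lean on a few types (`SecretDetector`, `RegexPattern`, `SecretSource`, `InterceptorConfig`) that are never shown. The definitions below are a plausible reconstruction inferred from how they're used, so treat the exact fields as assumptions. The `scanAll` helper is also hypothetical; it just shows that the layers compose by concatenating their results.

```typescript
interface SecretDetection {
  type: "pattern" | "entropy" | "reference";
  value: string;
  location: { start: number; end: number };
  confidence: number;
  pattern?: string;
}

// Every detection layer exposes the same async scan method.
interface SecretDetector {
  scan(content: string): Promise<SecretDetection[]>;
}

// A named regex (stored as source text) with a fixed confidence score.
interface RegexPattern {
  name: string;
  regex: string;
  confidence: number;
}

// Where the reference detector loads real secret values from (assumed shape).
type SecretSource =
  | { type: "env_file"; path: string }
  | { type: "environment" }
  | { type: "vault"; fetch: () => Promise<Record<string, string>> };

interface InterceptorConfig {
  patterns?: RegexPattern[];
  entropyThreshold?: number;
  secretSources: SecretSource[];
  allowlist?: string[];
}

// Runs every layer over the same content and concatenates what they find.
async function scanAll(
  detectors: SecretDetector[],
  content: string
): Promise<SecretDetection[]> {
  const all: SecretDetection[] = [];
  for (const detector of detectors) {
    const found = await detector.scan(content);
    all.push(...found);
  }
  return all;
}
```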

Pattern Detection

This is your first line of defense. Known secret formats have distinctive structures.

class PatternDetector implements SecretDetector {
  private patterns: RegexPattern[];

  constructor(customPatterns: RegexPattern[] = []) {
    this.patterns = [
      ...DEFAULT_PATTERNS,
      ...customPatterns,
    ];
  }

  async scan(content: string): Promise<SecretDetection[]> {
    const detections: SecretDetection[] = [];

    for (const pattern of this.patterns) {
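      // JavaScript has no leading (?i) inline flag, so translate it to the "i" flag
      const insensitive = pattern.regex.startsWith("(?i)");
      const regex = new RegExp(
        insensitive ? pattern.regex.slice(4) : pattern.regex,
        insensitive ? "gi" : "g"
      );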
      let match: RegExpExecArray | null;

      while ((match = regex.exec(content)) !== null) {
        detections.push({
          type: "pattern",
          value: match[0],
          location: {
            start: match.index,
            end: match.index + match[0].length,
          },
          confidence: pattern.confidence,
          pattern: pattern.name,
        });
      }
    }

    return detections;
  }
}

const DEFAULT_PATTERNS: RegexPattern[] = [
  {
    name: "aws_access_key",
    regex: "AKIA[0-9A-Z]{16}",
    confidence: 0.99,
  },
  {
    name: "stripe_secret",
    regex: "sk_(live|test)_[0-9a-zA-Z]{24,}",
    confidence: 0.99,
  },
  {
    name: "github_pat",
    regex: "ghp_[0-9a-zA-Z]{36}",
    confidence: 0.99,
  },
  {
    name: "generic_api_key",
    regex: "(?i)(api[_-]?key|apikey|api[_-]?secret)\\s*[=:]\\s*['\"]?([a-zA-Z0-9_\\-]{20,})",
    confidence: 0.8,
  },
  {
    name: "private_key_block",
    regex: "-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----",
    confidence: 0.99,
  },
  {
    name: "connection_string",
    regex: "(mongodb|postgres|mysql|redis)://[^\\s'\"]+:[^\\s'\"]+@[^\\s'\"]+",
    confidence: 0.95,
  },
];

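Pattern detection is fast and precise, and the confidence scores are high because most of these formats are unambiguous. If something matches AKIA[0-9A-Z]{16}, it's almost certainly an AWS access key ID. The exception is generic_api_key, which gets 0.8 because it keys off a variable name rather than a fixed format.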
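
As a quick sanity check, here is a self-contained sketch that runs two of the default patterns against sample text. The `matchingPatterns` helper is made up for this example; the Redis URL is the one from the commit message earlier, and `AKIAIOSFODNN7EXAMPLE` is the placeholder key ID AWS uses in its own documentation.

```typescript
interface RegexPattern {
  name: string;
  regex: string;
  confidence: number;
}

// A subset of the default patterns.
const patterns: RegexPattern[] = [
  { name: "aws_access_key", regex: "AKIA[0-9A-Z]{16}", confidence: 0.99 },
  {
    name: "connection_string",
    regex: "(mongodb|postgres|mysql|redis)://[^\\s'\"]+:[^\\s'\"]+@[^\\s'\"]+",
    confidence: 0.95,
  },
];

// Returns the names of every pattern that matches somewhere in the text.
function matchingPatterns(text: string): string[] {
  return patterns
    .filter((p) => new RegExp(p.regex).test(text))
    .map((p) => p.name);
}

const commitMessage =
  "Updated the Redis connection to handle the new cluster at " +
  "redis://admin:p4ssw0rd@prod-cluster.internal:6379.";

const inCommit = matchingPatterns(commitMessage); // ["connection_string"]
const inNote = matchingPatterns("key id: AKIAIOSFODNN7EXAMPLE"); // ["aws_access_key"]
const inBump = matchingPatterns("Bumped lodash to 4.17.21"); // []
```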

Entropy Detection

Secrets tend to be random. That's close to their defining property. Shannon entropy measures randomness in bits per character, and secrets consistently score above the range of normal text.

class EntropyDetector implements SecretDetector {
  private threshold: number;
  private windowSize: number;
  private minLength: number;

  constructor(threshold = 4.5, windowSize = 64, minLength = 16) {
    this.threshold = threshold;
    this.windowSize = windowSize;
    this.minLength = minLength;
  }

  async scan(content: string): Promise<SecretDetection[]> {
    const detections: SecretDetection[] = [];
    const tokens = this.tokenize(content);

    for (const token of tokens) {
      if (token.value.length < this.minLength) continue;

      const entropy = this.calculateEntropy(token.value);

      if (entropy >= this.threshold) {
        // Scale confidence based on how far above threshold
        const confidence = Math.min(
          0.5 + (entropy - this.threshold) * 0.15,
          0.95
        );

        detections.push({
          type: "entropy",
          value: token.value,
          location: token.location,
          confidence,
        });
      }
    }

    return detections;
  }

  private calculateEntropy(str: string): number {
    const freq = new Map<string, number>();

    for (const char of str) {
      freq.set(char, (freq.get(char) ?? 0) + 1);
    }

    let entropy = 0;
    const len = str.length;

    for (const count of freq.values()) {
      const p = count / len;
      entropy -= p * Math.log2(p);
    }

    return entropy;
  }

  private tokenize(
    content: string
  ): Array<{ value: string; location: { start: number; end: number } }> {
    const tokens: Array<{
      value: string;
      location: { start: number; end: number };
    }> = [];
    // Split on whitespace and common delimiters
    const regex = /[^\s'"`,;:(){}\[\]]+/g;
    let match: RegExpExecArray | null;

    while ((match = regex.exec(content)) !== null) {
      tokens.push({
        value: match[0],
        location: { start: match.index, end: match.index + match[0].length },
      });
    }

    return tokens;
  }
}

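The threshold of 4.5 bits per character is the sweet spot I found after analyzing about 10,000 agent outputs. Normal English text sits around 3.5-4.0. Long Base64-encoded secrets hit 5.5+. The gap is wide enough to be useful without drowning in false positives. One caveat: a token of n characters can never score above log2(n), so at this threshold any token shorter than 23 characters goes undetected no matter how random it is.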
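
A few worked numbers help when tuning these thresholds. The sketch below repeats the entropy calculation in a standalone function (`shannonEntropy` and `entropyConfidence` are names chosen for this example) and evaluates it on strings whose scores can be checked by hand.

```typescript
// Shannon entropy in bits per character, the same calculation as calculateEntropy.
function shannonEntropy(str: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < str.length; i++) {
    counts[str[i]] = (counts[str[i]] ?? 0) + 1;
  }
  let entropy = 0;
  for (const key of Object.keys(counts)) {
    const p = counts[key] / str.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Mirrors the confidence scaling used in EntropyDetector.scan.
function entropyConfidence(entropy: number, threshold = 4.5): number {
  return Math.min(0.5 + (entropy - threshold) * 0.15, 0.95);
}

const repeated = shannonEntropy("aaaaaaaaaaaaaaaa"); // 0: one symbol, no uncertainty
const hex16 = shannonEntropy("0123456789abcdef"); // 4: sixteen equally likely symbols

// A token of n characters has at most n distinct symbols, so it tops out at log2(n).
const maxFor22 = Math.log2(22); // about 4.46, still under the 4.5 threshold
const maxFor23 = Math.log2(23); // about 4.52, the shortest length that can cross it

const confAt55 = entropyConfidence(5.5); // about 0.65
const confAt60 = entropyConfidence(6.0); // about 0.725
```

Two things fall out of this. A 16-character token can score at most 4.0, so the `minLength` of 16 only starts to matter once tokens reach 23 characters. And with the scaling above, an entropy detection only clears the interceptor's 0.7 confidence filter once entropy exceeds roughly 5.83 bits, so it's worth tuning those two numbers together rather than separately.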

Reference Detection

This is the most effective detector but requires setup. You load your actual secrets from their sources and check if any appear in the output.

import { readFile } from "node:fs/promises";

class ReferenceDetector implements SecretDetector {
  private secrets: Map<string, string> = new Map();
  private refreshInterval: NodeJS.Timeout | null = null;

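  constructor(sources: SecretSource[]) {
    // loadSecrets is async, so log failures instead of leaving a floating promise.
    // Scans that run before the first load finishes will see no secrets.
    const refresh = () =>
      this.loadSecrets(sources).catch((err) =>
        console.error("[ReferenceDetector] failed to load secrets", err)
      );
    void refresh();
    // Refresh every 5 minutes to catch rotated secrets
    this.refreshInterval = setInterval(refresh, 5 * 60 * 1000);
  }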

  private async loadSecrets(sources: SecretSource[]): Promise<void> {
    // Build into a fresh map and swap at the end, so scans that run during
    // a refresh still see the previous secrets instead of an empty map
    const next = new Map<string, string>();

    for (const source of sources) {
      switch (source.type) {
        case "env_file": {
          const content = await readFile(source.path, "utf-8");
          const parsed = this.parseEnvFile(content);
          for (const [key, value] of Object.entries(parsed)) {
            if (value.length >= 8) {
              next.set(value, key);
            }
          }
          break;
        }
        case "environment": {
          for (const [key, value] of Object.entries(process.env)) {
            if (value && value.length >= 8 && this.looksSecret(key)) {
              next.set(value, key);
            }
          }
          break;
        }
        case "vault": {
          const vaultSecrets = await source.fetch();
          for (const [key, value] of Object.entries(vaultSecrets)) {
            // Skip empty values: indexOf("") matches at every position,
            // so the loop in scan would never terminate
            if (value.length > 0) {
              next.set(value, key);
            }
          }
          break;
        }
      }
    }

    this.secrets = next;
  }

  async scan(content: string): Promise<SecretDetection[]> {
    const detections: SecretDetection[] = [];

    for (const [secretValue, secretName] of this.secrets) {
      let idx = content.indexOf(secretValue);

      while (idx !== -1) {
        detections.push({
          type: "reference",
          value: secretValue,
          location: { start: idx, end: idx + secretValue.length },
          confidence: 0.99,
          pattern: `known_secret:${secretName}`,
        });
        idx = content.indexOf(secretValue, idx + 1);
      }
    }

    return detections;
  }

  private looksSecret(key: string): boolean {
    const secretKeywords = [
      "SECRET", "KEY", "TOKEN", "PASSWORD", "PASS",
      "CREDENTIAL", "AUTH", "PRIVATE", "CONNECTION",
    ];
    const upper = key.toUpperCase();
    return secretKeywords.some((kw) => upper.includes(kw));
  }

  private parseEnvFile(content: string): Record<string, string> {
    const result: Record<string, string> = {};
    for (const line of content.split("\n")) {
      const trimmed = line.trim();
      if (!trimmed || trimmed.startsWith("#")) continue;
      const eqIdx = trimmed.indexOf("=");
      if (eqIdx === -1) continue;
      const key = trimmed.slice(0, eqIdx).trim();
      let value = trimmed.slice(eqIdx + 1).trim();
      // Strip quotes
      if ((value.startsWith('"') && value.endsWith('"')) ||
          (value.startsWith("'") && value.endsWith("'"))) {
        value = value.slice(1, -1);
      }
      result[key] = value;
    }
    return result;
  }

  destroy(): void {
    if (this.refreshInterval) {
      clearInterval(this.refreshInterval);
    }
  }
}

Reference detection has a 0.99 confidence because if the exact string from your .env file appears in an agent's output, that's not a coincidence. The five-minute refresh handles secret rotation. The looksSecret heuristic on environment variable names prevents us from loading every env var (you don't need to track NODE_ENV or PORT).
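
The core of the reference check is small enough to try in isolation. This is a minimal sketch with a hypothetical `findKnownSecrets` function and made-up secret values. It keeps the same value-to-name map as the class above and adds a guard for empty values, since `indexOf("")` matches at every position.

```typescript
interface Hit {
  name: string;
  start: number;
  end: number;
}

// Finds every occurrence of each known secret value in the text.
// `known` maps secret value -> variable name.
function findKnownSecrets(text: string, known: Map<string, string>): Hit[] {
  const hits: Hit[] = [];
  known.forEach((name, value) => {
    if (value.length === 0) return; // an empty needle would loop forever
    let idx = text.indexOf(value);
    while (idx !== -1) {
      hits.push({ name, start: idx, end: idx + value.length });
      idx = text.indexOf(value, idx + 1);
    }
  });
  return hits;
}

// Made-up values for illustration only.
const known = new Map<string, string>([
  ["s3cr3t-db-passw0rd", "DATABASE_PASSWORD"],
  ["", "EMPTY_VALUE"],
]);

const prBody = "Connects as app_user with s3cr3t-db-passw0rd (see .env.local).";
const hits = findKnownSecrets(prBody, known);
// hits: [{ name: "DATABASE_PASSWORD", start: 26, end: 44 }]
```

Note that the hit carries the variable name, not the value, which is what you want in logs and alerts.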

Wiring It Up: The Proxy Architecture

Here's how you actually integrate this with an agent framework. I'll use a generic interface but this maps directly onto LangChain, Vercel AI SDK, or any custom agent loop.

function createGuardedAgent(
  agent: Agent,
  config: InterceptorConfig
): Agent {
  const interceptor = new SecretInterceptor(config);

  // Wrap all output channels
  const guardedTools = agent.tools.map((tool) => ({
    ...tool,
    execute: async (...args: unknown[]) => {
      // Intercept tool arguments (they often contain secrets)
      const argString = JSON.stringify(args);
      const argResult = await interceptor.intercept(argString, tool.name);

      if (argResult.action === "block") {
        throw new SecretLeakError(
          `Tool "${tool.name}" blocked: arguments contain secrets`
        );
      }

      const sanitizedArgs = argResult.action === "redact"
        ? JSON.parse(argResult.sanitized!)
        : args;

      const result = await tool.execute(...sanitizedArgs);

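      // Also intercept tool output before it enters the context.
      // JSON.stringify(undefined) returns undefined, so fall back to null
      // for tools that return nothing.
      const outputResult = await interceptor.intercept(
        JSON.stringify(result ?? null),
        `${tool.name}:output`
      );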

      if (outputResult.action === "block") {
        return { error: "Output contained sensitive data and was blocked" };
      }

      return outputResult.action === "redact"
        ? JSON.parse(outputResult.sanitized!)
        : result;
    },
  }));

  // Wrap the final response channel
  const guardedRespond = async (message: string): Promise<string> => {
    const result = await interceptor.intercept(message, "agent:response");

    if (result.action === "block") {
      return "[Response blocked: contained sensitive data]";
    }

    return result.action === "redact" ? result.sanitized! : message;
  };

  return {
    ...agent,
    tools: guardedTools,
    respond: guardedRespond,
  };
}

Notice we're intercepting in both directions. Tool arguments get checked (the agent might pass a secret as a parameter to create_pull_request), and tool outputs get checked too (a tool might return file contents that include secrets, and we don't want those entering the agent's context where they could leak later).
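
Here is a self-contained sketch of that two-way check on a single tool. The `Tool` shape, the `SecretLeakError` class, and the `containsSecret` scanner are all assumptions made for this example (real frameworks name and shape these differently), and this version blocks flagged arguments outright instead of redacting them.

```typescript
// Hypothetical tool shape; real frameworks differ.
interface Tool {
  name: string;
  execute: (...args: unknown[]) => Promise<unknown>;
}

class SecretLeakError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SecretLeakError";
  }
}

// Stand-in scanner: true if the text contains a GitHub-PAT-shaped string.
const containsSecret = (text: string): boolean =>
  /ghp_[0-9a-zA-Z]{36}/.test(text);

function guardTool(tool: Tool): Tool {
  return {
    ...tool,
    execute: async (...args: unknown[]) => {
      // Direction 1: arguments on their way out to the tool
      if (containsSecret(JSON.stringify(args))) {
        throw new SecretLeakError(
          `Tool "${tool.name}" blocked: arguments contain secrets`
        );
      }
      const result = await tool.execute(...args);
      // Direction 2: results on their way back into the agent's context.
      // JSON.stringify(undefined) returns undefined, hence the null fallback.
      if (containsSecret(JSON.stringify(result ?? null))) {
        return { error: "Output contained sensitive data and was blocked" };
      }
      return result;
    },
  };
}

// A fake token and two toy tools to exercise both directions.
const fakeToken = "ghp_" + "x".repeat(36);
const readEnv: Tool = {
  name: "read_file",
  execute: async () => `GITHUB_TOKEN=${fakeToken}`,
};
const echo: Tool = {
  name: "echo",
  execute: async (...args: unknown[]) => args[0],
};

const guardedRead = guardTool(readEnv);
const guardedEcho = guardTool(echo);
```

Calling `guardedRead.execute(".env")` resolves to the error object rather than the file contents, and `guardedEcho.execute(fakeToken)` rejects with a `SecretLeakError` before the underlying tool ever runs.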

False Positives: The Real Engineering Challenge

The detection part is straightforward. The hard part is not blocking legitimate work. Here's what generates false positives:

  • UUIDs. They're high entropy but they're not secrets. Solution: UUID pattern exemption in the entropy detector.
  • Hash values. Git SHAs, content hashes, checksums. High entropy, not secret. Solution: context-aware detection that checks if the value appears after common hash prefixes like sha256: or looks like a git SHA.
  • Base64-encoded non-secret data. Images, encoded payloads, JWTs (the payload, not the signature). Solution: try decoding and check if the decoded content is mundane.
  • Example values in documentation. If the agent is writing docs that include placeholder API keys. Solution: allowlist known example values, check if the string appears within a code fence with "example" in the surrounding text.

interface AllowlistEntry {
  value: string;
  reason: string;
  addedBy: string;
  expiresAt?: Date;
}

class ContextAwareFilter {
  private allowlist: AllowlistEntry[] = [];

  shouldSuppress(detection: SecretDetection, context: string): boolean {
    // UUID pattern
    if (/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
        .test(detection.value)) {
      return true;
    }

    // Git SHA
    if (/^[0-9a-f]{40}$/.test(detection.value)) {
      return true;
    }

    // Inside a code fence marked as example
    const fencePattern = /```[\s\S]*?```/g;
    let match: RegExpExecArray | null;
    while ((match = fencePattern.exec(context)) !== null) {
      if (
        detection.location.start >= match.index &&
        detection.location.end <= match.index + match[0].length
      ) {
        const preceding = context.slice(
          Math.max(0, match.index - 100),
          match.index
        );
        if (/example|placeholder|dummy|sample/i.test(preceding)) {
          return true;
        }
      }
    }

    return false;
  }
}

I set allowlist entries to expire by default. A value that's safe today might be dangerous tomorrow if it gets promoted from a test key to production.
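
The filter above doesn't show the expiry logic, so here is one way it could look. The 30-day default and the `allowlistEntry` and `isAllowlisted` helpers are assumptions for the sketch; pick a TTL that matches your own rotation policy.

```typescript
interface AllowlistEntry {
  value: string;
  reason: string;
  addedBy: string;
  expiresAt?: Date;
}

const DEFAULT_TTL_DAYS = 30; // assumed default, not a recommendation
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Creates an entry that expires after ttlDays unless the caller overrides it.
function allowlistEntry(
  value: string,
  reason: string,
  addedBy: string,
  now: Date = new Date(),
  ttlDays: number = DEFAULT_TTL_DAYS
): AllowlistEntry {
  const expiresAt = new Date(now.getTime() + ttlDays * MS_PER_DAY);
  return { value, reason, addedBy, expiresAt };
}

// An entry only suppresses a detection while it is unexpired.
function isAllowlisted(
  value: string,
  entries: AllowlistEntry[],
  now: Date = new Date()
): boolean {
  return entries.some(
    (e) =>
      e.value === value &&
      (e.expiresAt === undefined || e.expiresAt.getTime() > now.getTime())
  );
}

const created = new Date("2024-01-01T00:00:00Z");
const entries = [
  allowlistEntry("sk_test_docs_placeholder", "README example", "alice", created),
];

const twoWeeksLater = isAllowlisted(
  "sk_test_docs_placeholder",
  entries,
  new Date("2024-01-15T00:00:00Z")
); // true: still inside the 30-day window
const twoMonthsLater = isAllowlisted(
  "sk_test_docs_placeholder",
  entries,
  new Date("2024-03-01T00:00:00Z")
); // false: the entry has expired and must be re-approved
```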

Real Numbers

I've been running this interceptor in production across four different agent workflows for three weeks now. Here's what I've seen:

| Metric | Count | Notes |
|--------|-------|-------|
| Total agent outputs scanned | 14,832 | Across commit, PR, chat, and tool channels |
| Detections triggered | 247 | Before false positive filtering |
| True positives (actual secrets) | 43 | After context-aware filtering |
| Blocked outputs | 12 | High-confidence, would have leaked |
| Redacted outputs | 31 | Medium-confidence, sanitized and released |
| False positive rate (post-filter) | 3.2% | Down from 82% pre-filter |

Forty-three times in three weeks. That's roughly twice a day that an agent tried to include a real secret in its output. Twelve of those would have been catastrophic, things like database passwords in PR descriptions and AWS keys in commit messages.

The pre-filter false positive rate of 82% (204 of the 247 raw detections were not secrets) is why you can't just ship regex patterns and call it done. Without the context-aware filtering layer, you'd block so much legitimate output that engineers would disable the system within a day.

What I'd Do Differently

If I were starting over, I'd build the reference detector first and skip the entropy detector initially. Reference checking against your actual secrets gives you 0.99 confidence with almost no false positives, because a match means one of your real values is sitting in the output. By my estimate it handles 80% of the risk. Entropy detection is a nice safety net for secrets you don't know about, but the false positive management overhead is significant.

I'd also integrate more tightly with the agent's permission model. Right now the interceptor is stateless: it doesn't know why the agent read a secret. If you could track data lineage through the agent's reasoning (this value came from .env.local line 7), you could make smarter decisions about what to block.

The Stripe incident cost us about four hours of incident response and a very uncomfortable conversation with our security team. This interceptor took two days to build. I wish I'd built it first.