Essays

The Hallucination Defense

Why logs make 'The AI Did It' the perfect excuse

Niki A. Niyikiza published on
8 min, 1468 words

“The AI hallucinated. I never asked it to do that.”

That’s the defense. And here’s the problem: it’s often hard to refute with confidence.

A financial analyst uses an AI agent to “summarize quarterly reports.” Three months later, forensics discovers the M&A target list in a competitor’s inbox. The agent accessed the files. The agent sent the email. But the prompt history? Deleted. The original instruction? The analyst’s word against the logs.

Without a durable cryptographic proof binding the human to a scoped delegation, “the AI did it” becomes a convenient defense. The agent can’t testify. It can’t remember. It can’t defend itself.

Read More

Semantic Attacks: Exploiting What Agents See

The Era of Reality Injection.

Niki A. Niyikiza published on
12 min, 2371 words

In Map/Territory, I covered the agent→tool boundary: what happens when an agent’s string gets interpreted by a system. Path traversal, SSRF, command injection. The execution layer.

This post covers the opposite direction: world→agent.

World → [perception] → Agent → [authorization] → Tool → System
         ^                      ^
         This post              Map/Territory
Read More

Claude Code CVE-2025-66032: Why Allowlists Aren't Enough

Validating strings will never secure command execution

Niki A. Niyikiza published on
10 min, 1881 words

Recently, RyotaK at GMO Flatt Security published 8 ways to execute arbitrary commands in Claude Code without user approval. Anthropic patched it fast by switching to an allowlist.

That stops the bleeding, but it doesn’t cure the disease.

The error was in the layer, not the list. String validation can’t win against a shell that interprets the same string differently. Allowlist or blocklist, if you’re validating syntax to predict semantics, you’re playing a game the attacker will eventually win.

Read More

The Map is not the Territory: The Agent-Tool Trust Boundary

Or Why You Can't Regex Your Way to Agent Security

Niki A. Niyikiza published on
15 min, 2971 words

The longer I work on Tenuo, the more I realize there’s a specific blind spot in the current AI agent landscape that almost no one is talking about, even as the theoretical foundations solidify.

There is exceptional momentum in security research right now. Simon Willison has extensively documented and popularized the prompt injection threat model. Google’s CaMeL paper proposes adapting models to strict capability sets. Microsoft’s FIDES is tackling information flow control.

The theory is solidifying. Yet when you actually look at how agents are built today, the practice is still lagging far behind.

We spend a lot of time analyzing the model alignment or the high-level policy. We don’t spend enough time looking at the connector. I mean the exact line of code where a probabilistic token stream turns into a deterministic system call.

This is where the abstractions leak. Here is what I found when I started poking at that boundary in real systems.

TL;DR: LLM tool calls pass strings (the Map) that get interpreted by systems (the Territory). Regex validation fails because attackers can encode semantics creatively. You need semantic validation (Layer 1.5) and execution-time guards (Layer 2). Skip to solutions →

Read More

Flowing Authority: Introducing Tenuo

Capability-based authorization for AI agents

Niki A. Niyikiza published on
8 min, 1470 words

What if authority followed the task, instead of the identity?

I’ve been scratching my head over that question for a while. Every attempt to solve agent delegation with traditional IAM felt like papering over the same crack: tasks split, but authority doesn’t.

Agents decompose tasks.
IAM consolidates authority.
The friction is structural.

I’ve been building Tenuo to experiment with the idea. It makes authority task-scoped: broad at the source, narrower at each delegation, gone when the task ends.

Rust core. Python bindings. ~27μs verification.

Read More