THREE PHYSICISTS
AND A MARTIAL ARTIST.
We did not all come from the same place.
Three of us came from physics research labs and production ML infrastructure. We built systems that had to work under real load, with real data, with real consequences for failure. One competed internationally in martial arts, with a decade of experience at the infrastructure that moves real money at scale.
That combination turned out to matter. The physicists approached agent security as an experiment: build a model, find where it breaks, update the model. The martial artist had already seen what breaks in production.
We built agents. Then we broke them.
From 2023 onward we were building AI agents for production workloads. Not demos. Agents with filesystem access, live API keys, database connections, shell execution. The things enterprises are actually deploying today.
We immediately started attacking them. We reproduced every attack class now appearing in CVEs: prompt injection that redirected outbound traffic, tool poisoning that exfiltrated SSH keys, jailbreaks that bypassed deny rules. We were early. We had working exploits before most of these attacks had names.
The experience crystallized one observation: every defense we tested was probabilistic. Prompt-based guardrails, LLM classifiers, regex filters. All bypassable with enough persistence or a clever injection. Security as a cat-and-mouse game at the application layer, where the attacker always gets another turn.
Lilith Zero and the SF moment.
We built and open-sourced Lilith Zero: application-layer enforcement with taint propagation, policy hooks for Claude Code and GitHub Copilot, and OpenClaw integration. Real enterprises started using it. It closes real attack vectors.
In 2025, our CEO Janos Mozer was accepted into Entrepreneur First, giving us access to the SF startup ecosystem and direct conversations with enterprise security teams at scale.
Those conversations confirmed what our research had already shown: no solution fully solved the problem. Every tool on the market was a probabilistic filter, a proxy, an LLM-based heuristic. Bypassable. Noisy. Architecturally insufficient.
Why the kernel is the only answer.
Application-layer security has one structural flaw: it runs in the same trust domain as the thing it is protecting. An attacker who controls the agent process controls the defense. Any sufficiently capable agent can, by design, read files, make network calls, and execute commands. The only layer that can constrain this without being in the same trust domain is the OS kernel.
We chose BPF-LSM for four reasons. First, durability: the Linux syscall interface has not changed meaningfully in decades. High-level AI frameworks change every six months. Security infrastructure needs to last. Second, observability: every syscall an agent makes is visible, logged, and attributable before it executes. Third, fail-closed correctness: BPF programs are statically verified before attachment; the heartbeat mechanism ensures agents are blocked if the daemon dies. Fourth, zero agent awareness: the kernel intercepts at the syscall boundary before the application layer can react.
For policy authoring we chose Cedar, whose semantics are formally proven in Lean 4 and whose policy analysis runs on a CVC5 SMT solver. Operators write policy intent in natural language; it compiles to Cedar AST via LLM, undergoes formal verification, and is signed into a tamper-evident capsule. The policy is mathematically proven correct before it enforces anything.
Durable. Observable. Formally verified. Scalable. Impossible to bypass from userspace.