Pi — A Principle-by-Principle Deep Dive

Four tools. ~750 tokens of system prompt. 25+ providers. No MCP, no sub-agents, no permission popups — and somehow it outperforms harnesses 10× its complexity. Here's how, principle by principle, file by file.

core toolsread · write · edit · bash

~750

prompt tokensvs. ~5K in Claude Code

25+

LLM providersunified, lazy-loaded

design principlesextracted in this site

30-second pitch

Pi is a coding-agent harness built by Mario Zechner (badlogic) that defines itself by what it refuses to ship: no MCP, no plan mode, no sub-agents, no permission popups, no built-in to-dos. What's left is a tiny, transparent kernel with hooks for everything you might want to add. Same models as Claude Code. Dramatically different product.

By the end of this site you'll know: why "the harness is the product, not the model" · what's actually inside the 96K-line monorepo · the 42 design decisions that make pi tick · the ~10 code patterns worth stealing for your own framework. ~50 minutes if you read everything · 10 minutes if you skim.

if I don't need it, it won't be built. Mario Zechner — the one-line philosophy that runs every file in pi

whyCoding in the post-agent era

Three years ago, the bottleneck in software was "can I write this code?"

Today, with frontier models, it's "can the agent see what I see, do what I want, and tell me the truth about what it did?"

The shift

An LLM coding agent is no longer fancy autocomplete.

It's a teammate that reads files, runs shell commands, and rewrites your codebase autonomously — sometimes for hours, often while you're not watching.

That shift creates a new product category: the harness.

The harness is the wrapper around the model. It decides which tools it has, what's in its system prompt, how its decisions get persisted, and what you can and can't see.

Why this matters

Pick any frontier model — Claude, GPT, Gemini. They're all good.

What differs is the harness.

Claude Code, Cursor, Aider, Codex CLI, Amp, Cline, opencode — same models, dramatically different products.

The harness is the product. Most users never realize this.

!?The core problem with today's harnesses

Every popular harness today is opaque, opinionated, and unstable in ways that hurt serious users.

Mario Zechner — the author of pi — used Claude Code and Cursor seriously for years. He accumulated five specific complaints:

Opaque context engineering. Claude Code injects content into your prompt that's never surfaced in the UI. You cannot see exactly what was sent. Debugging "why did it do that?" is guesswork.
Hidden machinery. Sub-agents you can't audit. Plan modes that run sealed sessions. MCP servers that dump 10K+ tokens of tool descriptions into context before your prompt even starts.
Constant churn. System prompts and tools change between releases without notice. Workflows you built last month break this month.
Security theater. Permission popups for every bash command. Everyone disables them or learns to spam Yes. Real safety would be at the container or VM level, but no harness offers that as the default.
You can't fork the design. Want a different prompt, a different tool, a different compaction strategy? File an issue. Wait. Probably get told "no."

The deeper problem behind all five

The harness pretends to be a finished product. It's actually a workflow you've been forced to inherit.

The features it ships are someone else's bets about how you'll work.

The features it omits are someone else's bets about what you don't need.

Neither is your decision.

vsBefore / after — the same workflow on two harnesses

Same task — "fix the failing test in src/foo.ts" — running on a typical batteries-included harness vs. a pi-style minimal harness. Read both columns; the differences are what this site is about.

Before · typical harness

You type a prompt. The harness silently injects a multi-thousand-token system prompt, prepends 10–20K of MCP tool descriptions, maybe spawns a sub-agent for "planning."
The model runs. A status bar moves. You cannot see the literal JSON sent to the API.
It calls bash. A popup asks for permission. You click Yes for the 47th time today and stop reading the commands.
It calls a hidden "plan-mode" sub-agent. You see a plan summary at the end. You cannot inspect the prompts the sub-agent received, the tools it had, or the reasoning it produced.
It "succeeds." You can't tell if anything quietly went wrong because there's no event log you control.
Want a different system prompt? You can't. It's compiled into the binary and changes between versions.
Want to see what the agent did last week? "Recent" lets you scroll a list of titles in a UI. The underlying state is binary or in a DB you can't grep.
Want to time-travel and try a different approach from an earlier point? Restart the chat. Re-type context.
The harness updates. Your workflow breaks. The maintainer changes the system prompt to "improve" something. You discover this by accident, three days later.

After · pi-style harness

You type a prompt. The system prompt is ~750 tokens, on disk, in a file you can edit (SYSTEM.md) or replace per-project.
The model runs. You see every event as a delta. Every byte sent and received is in a JSONL file. grep works.
It calls bash. It just runs. If you wanted safety you'd have launched pi inside a container. The harness does not pretend to be your last line of defense.
It needs to plan. It writes PLAN.md. You can read it, edit it, commit it. There is no hidden sub-agent. There is no sealed reasoning trace.
It finishes. Every prompt, response, tool call, and result is a line in ~/.pi/agent/sessions/<...>.jsonl. You own the audit trail.
Want a different system prompt? Drop a markdown file. Want a different tool? Write a 50-line extension. Want a different compaction strategy? Same.
Want to see what the agent did last week? Open the JSONL. Or run /tree in the TUI to navigate the conversation graph visually.
Want to time-travel? /tree picks any prior turn. Continue from there. The original branch stays in the file forever.
The harness updates when you run pi update. Not before. The version you installed is the version you have.

=What pi proposes

Five moves, applied with discipline:

1 Build the minimum kernel.
2 Make every choice visible.
3 Make every choice overridable.
4 Expose hooks instead of features.
5 Persist everything in a format you can grep.

Then trust the user to assemble the workflow that fits them — instead of trying to predict what 100,000 people want.

The post-agent era is about you designing the workflow — not living inside someone else's.

The rest of this site shows — principle by principle, file by file — how that idea becomes code.

the ah-ha

The product is the harness, not the model. Once you internalize this, you realize most "AI tool" comparisons are wrong. Claude Code vs Cursor isn't about Claude vs GPT — it's about two different opinionated wrappers calling the same APIs. Pi's bet is that the right wrapper is the one that gets out of your way and shows you everything.

📖About this site

This is an interactive companion to a deep technical study of the Pi coding agent framework built by Mario Zechner (badlogic). It pairs the philosophical "why" — distilled from Mario's launch post — with the implementation "how" — pulled directly from the source.

Every claim links to the exact file in the repo, like this: packages/coding-agent/src/core/system-prompt.ts. Click any badge to read the code.

Goal of this site — not just to understand pi, but to extract principles you can apply when building your own thoughtful agent framework. We'll cover what pi does, why it does it, and which choices generalize beyond its specific domain.

How to read this

Section 1 (Foundations) — read this first if you're new to LLM agents. It's the glossary that makes everything after make sense.
Section 2 (Problem) — Mario's specific frustrations that motivated pi. The "enemy" the project was built against.
Section 3 (42 Principles) — the meat. Organized A–F by abstraction level.
Section 4 (Architecture) — the bird's-eye view of the four packages and how they fit.
Section 5 (Diagrams) — 9 detailed architecture diagrams (system map, request lifecycle, agent loop state machine, streaming pipeline, tool execution, session tree, TUI rendering, startup, data model).
Sections 6–9 — cross-cutting patterns, traction analysis, takeaways, copyable code patterns.
Code-link badges — every claim is anchored to a real file. Click any path/to/file.ts badge to jump to the source on GitHub.

1. Foundational Concepts

6 min read·13 concepts

TL;DR: Vocabulary check. The terms you need before pi's design choices make sense. Skim freely — jump back when something later confuses you.

The vocabulary you need before pi's design choices will make sense. Skim this; jump back when something later confuses you.

1.1What is an LLM "coding agent"?

An LLM model on its own can only produce text. A coding agent is a wrapper (often called a harness) that does three things in a loop:

Sends the conversation + a list of available tools to the model.
If the model decides to call a tool (e.g. "read the file at src/foo.ts"), the harness executes the tool, captures the result, and appends it to the conversation.
Calls the model again with the updated conversation. Repeats until the model stops calling tools.

Claude Code, Cursor, Aider, Codex CLI, Amp, opencode, and pi are all coding-agent harnesses. They differ in which tools they expose, how they format the conversation, and what they hide from the user.

1.2The agent loop

The core control structure. Pseudocode:

while True:
    response = llm.complete(messages, tools=available_tools)
    messages.append(response)
    if response.has_tool_calls():
        for call in response.tool_calls:
            result = execute(call.name, call.args)
            messages.append({role: "toolResult", content: result})
        continue          # loop again with the tool results
    break                 # model produced a final answer

In pi this lives in packages/agent/src/agent-loop.ts. Real implementations add: parallel tool execution, abort/cancel, error handling, streaming events, steering (interruptions), context compaction, etc. Every coding agent is ultimately this loop with sophistication around the edges.

1.3Tools (a.k.a. function calling)

A tool is a function the model can call. It has a name, a description, a JSON-schema for its arguments, and a real implementation. The provider's API supports passing tool definitions and parsing tool calls out of model output.

Pi defines four core tools — read, write, edit, bash — implemented in packages/coding-agent/src/core/tools. The argument schemas use TypeBox, which gives you JSON Schema + TypeScript types from one definition.

tool_call

The model's output asking to invoke a function. Contains id, name, arguments (parsed JSON).

tool_result

Your reply with the function's output. Tied to the original call by id. Can be text, JSON, or include images.

parallel tool calls

A single model response can request multiple tool calls. Harnesses can run them sequentially or in parallel. Pi defaults to parallel.

tool_choice

Provider-specific option to force the model to call a specific tool, any tool, or none.

1.4System prompt

The first message in the conversation, sent with every request, that sets the model's persona and instructions. Claude Code's system prompt is long and changes between releases. Pi's is under ~750 tokens and shown in plain sight at packages/coding-agent/src/core/system-prompt.ts.

1.5Context window, tokens, compaction

token

The unit a model sees. Roughly 4 characters of English text. Pricing and context limits are measured in tokens.

context window

How many tokens the model can attend to at once. Claude Sonnet 4: ~200K. GPT-5: varies. Hit the limit → the provider rejects the request.

context engineering

The discipline of deciding what to put in the prompt, in what order, with what framing. Most "agent quality" differences come from this.

compaction

Summarizing old turns so the conversation fits in the window. Pi: triggered automatically near the limit, or via /compact. See .../compaction.ts.

prompt cache

Providers cache the prefix of a prompt and charge less for cached tokens. Anthropic: 5 min default, 1 h "extended." Critical for long agentic sessions.

thinking / reasoning

Some models emit private reasoning before the final answer. Anthropic calls it "extended thinking"; OpenAI calls it "reasoning"; Google calls it "thoughts." Pi normalizes to thinking blocks.

1.6Streaming & SSE

LLM APIs can return their response as a stream of tiny events (Server-Sent Events) instead of one big JSON blob. Each event is a delta: "here's another word," "here's another tool-call-argument chunk," "here's a usage update." Streaming lets the UI update live. It also makes abort meaningful — you can stop mid-stream and keep the partial result.

Pi's streaming event protocol is defined in packages/ai/src/types.ts as the AssistantMessageEvent union. Each event carries both the delta (the new chunk) and the partial (the accumulated full message so far).

1.7TUI: stream-based vs full-screen

Terminal UIs have two paradigms:

Full-screen (Amp, opencode): treat the terminal as a pixel buffer with character cells. You lose native scrollback and have to reimplement search, copy, scroll. Apps like vim and htop are full-screen.
Stream-based (Claude Code, Codex, pi): append to scrollback like a normal CLI, occasionally jump the cursor back to redraw. You keep all the terminal's built-in features. Pi's TUI is here, in packages/tui/src/tui.ts.

1.8MCP (Model Context Protocol)

Anthropic's MCP is a standard for exposing tools to LLMs via a separate process (an "MCP server"). The harness connects to the server, lists its tools, and exposes them to the model. Pi explicitly refuses to ship MCP support — see Principle 7. Mario's full argument: "What if you don't need MCP?"

1.9Claude Code features pi rejects (for contrast)

Plan mode

An ephemeral read-only "planning" session whose output is a markdown plan. Pi's alternative: write a PLAN.md file directly with full observability.

Sub-agents

Spawning child agents with their own context. Pi: gather context in a separate session, save an artifact, use it in a fresh session.

To-do tool

Built-in to-do list state. Pi argues this confuses models; use a TODO.md file with checkboxes.

Background bash

Long-running processes managed by the harness. Pi: use tmux — it already does this.

Permission popups

"Allow this bash command?" Pi argues that if the agent can write and execute code, popups are theater. Run pi in a container if you need real safety.

1.10YOLO mode

"You only live once." Slang for "no permission prompts, just do it." Pi defaults to YOLO. The alternative — popups for every command — has been documented to lead to "yes-spam" trained into users, which is worse than honestly unsafe. The real safety boundary belongs at the container or VM level, not in-process.

1.11Provider terminology

provider

A company that hosts models (Anthropic, OpenAI, Google, Groq, xAI, Together, etc.). Pi supports 25+. See packages/ai/README.md.

API family

The wire format. Pi recognizes ~9: anthropic-messages, openai-completions, openai-responses, google-generative-ai, google-vertex, azure-openai-responses, openai-codex-responses, mistral-conversations, bedrock-converse-stream.

model

A specific (provider, model-id) pair. e.g. ("anthropic", "claude-sonnet-4-5"). Has costs, context window, capabilities (vision, reasoning).

OpenAI-compatible

Many providers (Ollama, vLLM, LM Studio, xAI, Cerebras, Together) speak the OpenAI Completions wire format with small variations. Pi handles these via the openai-completions API + per-provider compat flags.

1.12JSONL session format

One JSON object per line. Append-only. Easy to grep, easy to diff.

Pi stores every session as JSONL. Each entry has an id and parentId, forming a tree rather than a list.

The payoff: branching, time-travel, and compaction — for free. See .../session-manager.ts.

1.13TypeBox

A library that lets you define JSON Schema with TypeScript type inference.

One source of truth: the schema validates runtime input, AND the TypeScript compiler knows the resulting shape.

Pi uses TypeBox for every tool's parameter schema. Alternative: Zod. Pi rejected it because TypeBox schemas are pure JSON Schema, so they serialize over the wire cleanly.

You're ready. Everything that follows builds on these concepts. If a later section references something here, click the section in the sidebar.

2. The Problem Space Pi Is Reacting To

2 min read·5 complaints

TL;DR: Five specific complaints with today's harnesses — opacity, churn, hidden machinery, fake safety, and inability to fork the design.

Mario Zechner spent three years cycling through ChatGPT → Copilot → Cursor → Claude Code.

He accumulated a specific set of grievances that pi is the answer to:

Opaque context engineering. Existing harnesses inject content into the prompt that's never surfaced in the UI. You can't see exactly what was sent, so debugging "why did the model do that?" is guesswork.
Constant churn. Claude Code's system prompt and tools change between releases. Behavior shifts under your feet. Workflows built on a moving target rot.
Feature creep with hidden costs. Plan mode, sub-agents, to-dos, MCP — each consuming context (often 5–10% of the window) and adding behavioral surface area you never opted into.
YOLO theater. Every harness has permission popups, and every serious user disables them. If the agent can already write code and execute it, popups don't add real safety — just friction.
Bad provider abstractions. Vercel AI SDK has leaky abstractions and accumulated baggage. Mario wanted full control of the request/response shape per provider, not a lowest-common-denominator API.

"if I don't need it, it won't be built."

— Mario Zechner

That sentence is pi's entire philosophy. Everything else flows from it.

3. The 42 Principles

18 min read·42 principles·6 levels

TL;DR: 42 design decisions across 6 abstraction levels — project culture, agent design, architecture, the loop, the TUI, the coding agent. The meat of the site.

Organized A → F by abstraction level: project culture → agent design → architecture → loop semantics → terminal UI → coding-agent specifics.

All 42 in one glance — scan the whole shape before drilling in

A · Project culture

P1Auto-close issues from new contributors; lgtm grants rights. P2Extensibility replaces features. No sub-agents, MCP, plan mode. P3Dependencies as reviewed code — exact pinning, lockfile gate. P4The project's own AGENTS.md is a prompt-engineering masterclass.

B · Agent design

P5~750-token prompt. Models already know what an agent is. P6Four tools: read · write · edit · bash. That's it. P7Refuse MCP. CLI tools with READMEs = progressive disclosure. P8Refuse plan mode. Write a PLAN.md instead. P9Refuse sub-agents for context. Use separate sessions + artifacts. P10Refuse background bash. Use tmux. P11Refuse built-in to-dos. Use TODO.md. P12YOLO by default. Real safety lives at the container level.

C · Architecture

P13Four packages, lockstep-versioned, independently useful. P14Build on raw provider SDKs, not a meta-SDK like Vercel AI. P15Lazy-load providers via memoized import() wrapped in a stream. P16Every streaming event = delta + accumulated snapshot. P17Streams are async-iterable AND awaitable via .result(). P18Cross-provider compatibility via stateful message transform. P19AbortSignal threads through everything; partial results on abort. P20Tool validation throws into the stream, not the program.

D · The agent loop

P21Copy-on-write state via getter/setter properties. P22Subscribers as barriers — await ordering = free sync gates. P23Steering + follow-up = two queues, two polling sites. P24Termination by unanimous vote across the tool batch. P25Tool errors are tool results, not exceptions. P26Parallel by default; sequential when any tool requires it. P27transformContext → convertToLlm — two-phase shaping.

E · The TUI

P28Stream-based UI over full-screen — keep native scrollback. P29Three-strategy diff, all wrapped in synchronized output. P30Components return string arrays. 3-method interface. No vDOM. P31IME cursor positioning via zero-width APC markers. P32Bracketed paste — big pastes fold into [paste #N +M lines]. P33Kitty keyboard protocol with xterm fallback. P34Inline images via Kitty graphics + iTerm2 protocols.

F · The coding agent

P35Sessions as JSONL trees. parentId = branching for free. P36Compaction is just another entry pointing at firstKeptEntryId. P37AGENTS.md merged root-down to cwd, deepest-last wins. P38Skills are content, not tools. Appended to the system prompt. P39Prompt templates with shell-style $1 · $@ · ${@:N:L} expansion. P40Extensions are TypeScript modules with a fat ExtensionAPI. P41Four runtime modes (interactive · print · JSON · RPC) from one binary. P42SDK as first-class entry. CLI is a thin wrapper.

Section A
Project-level principles

P1Build for one user first, then expose seams.

Pi is opinionated and dictatorial. Mario has explicitly said he'll be a "dictator" with contributions — "I will also do my best to give you reasons why." New contributors' issues and PRs are auto-closed by default.

See it in action: .github/workflows/issue-gate.yml, .github/workflows/pr-gate.yml, .github/workflows/approve-contributor.yml. Maintainers triage daily; only lgtm/lgtmi comments grant submission rights. This isn't hostile — it's protection against the death-by-1000-features that every successful OSS project suffers.

P2Extensibility replaces features.

Every "I want X" doesn't become a core feature. It becomes a place where an extension could go. Pi ships with no sub-agents, no plan mode, no MCP, no background bash, no permission prompts, no built-in to-dos — but each can be added via:

docs/extensions.md — TypeScript modules
docs/skills.md — markdown skill packages
docs/prompt-templates.md — slash-command expansions
docs/themes.md — color schemes
docs/packages.md — install via pi install npm:... or git:...

The README brags about what extensions can do: "Sub-agents and plan mode. Custom compaction… Permission gates… Games while waiting (yes, Doom runs)." See it in examples/extensions.

P3Treat dependencies as reviewed code.

The supply-chain hardening is unusually rigorous:

Direct external deps pinned to exact versions: .npmrc sets save-exact=true and min-release-age=2
Pre-commit blocks accidental lockfile commits unless PI_ALLOW_LOCKFILE_CHANGE=1
Lifecycle-script allowlist in scripts/generate-coding-agent-shrinkwrap.mjs
Isolated install smoke tests before publishing: scripts/local-release.mjs
Scheduled npm audit GitHub workflow

P4Make the rules explicit so agents and humans can follow them.

Pi's own AGENTS.md is itself a study in rule-writing for LLMs. Sample rules:

"No any types unless absolutely necessary"
"Single-line helper functions with a single call site are forbidden; inline them instead"
"Use only erasable TypeScript syntax compatible with Node strip-only mode"
"Always include fixes #<number> in the commit message when there is a related issue"
"ONLY commit files YOU changed in THIS session. Multiple agents may work on different files in the same worktree simultaneously." — they assume agents will work in parallel and have written safety rules for that explicitly.

Section B
Agent-design principles

P5Sub-1000-token system prompts work, because the model already knows what a coding agent is.

The actual base prompt is in packages/coding-agent/src/core/system-prompt.ts. Approximate text:

You are an expert coding assistant operating inside pi, a coding
agent harness. You help users by reading files, executing commands,
editing code, and writing new files.

Available tools:
  - read: ...
  - write: ...
  - edit: ...
  - bash: ...

In addition to the tools above, you may have access to other
custom tools depending on the project.

Guidelines:
  ... [a few short lines, conditional on which tools are loaded]

Pi documentation (read only when the user asks about pi itself):
  - Main documentation: ...

Current date: ...
Current working directory: ...

That's ~650–750 tokens. Mario's argument: frontier models have been "RL-trained up the wazoo" on coding-agent traces. They already know what read/write/edit/bash mean. Test this in your own project before assuming it's true — but pi's Terminal-Bench 2.0 numbers support it.

P6Four tools cover ~all coding work.

That's it. Optional read-only variants (grep, find, ls) for restricted modes. Anything more is either a wrapper around bash or belongs in an extension.

read — tools/read.ts. path + optional offset/limit. Auto-detects images (jpg/png/gif/webp), returns them as image blocks. Truncates at 100 lines / 64KB with a [Showing lines N–M of TOTAL. Use offset=X to continue.] hint.
write — tools/write.ts. path + content. Auto-mkdir -p. Full overwrite, no append mode.
edit — tools/edit.ts. path + array of {oldText, newText} edits. Atomic (all-or-nothing). oldText must be unique. Strips/restores BOM. Detects CRLF vs LF. Returns unified diff in details.
bash — tools/bash.ts. command + optional timeout. Streaming output via OutputAccumulator. Last 100 lines / 64KB kept; full output saved to /tmp/pi-bash-*.txt if truncated (no data loss). Kills the whole process tree on timeout/abort.

Common pattern: abort-signal-safe, errors thrown as exceptions (the agent loop catches them and turns them into isError: true tool results so the model can self-correct).

P7Refuse MCP. Use CLI tools with READMEs.

Mario's MCP critique: dumping all tool descriptions into context on every session is wasteful. Playwright MCP = 21 tools × ~13.7K tokens; Chrome DevTools MCP = 26 tools × ~18K tokens. That's 7–9% of context for tools you may never use.

Alternative: ship a CLI tool with a README. The agent reads the README only when it actually needs the tool — progressive disclosure. It then invokes via bash. Composable (pipe, chain), trivially extensible, token-cheap.

Full argument: "What if you don't need MCP?" Peter Steinberger's mcporter wraps existing MCP servers as CLI tools so you can use them this way.

P8Refuse plan mode. Use a file.

Pi has no Plan Mode. Recommended pattern: write PLAN.md directly. Shareable across sessions, versionable with code, fully observable. Claude Code's Plan Mode runs a "read-only" sub-agent whose work you can't see except via its final markdown output.

P9Refuse sub-agents for context gathering.

Better: gather context in a separate session, save the artifact, start a fresh session. Full observability and steerability.

using a sub-agent mid-session for context gathering is a sign you didn't plan ahead. Mario Zechner, on why pi has no sub-agent tool

Only legitimate sub-agent use case he endorses: code review — spawn pi --print from bash with a review prompt. Parallel sub-agent feature implementation: "an anti-pattern in my book and doesn't work, unless you don't care if your codebase devolves into a pile of garbage."

P10Refuse background bash. Use tmux.

Background process management adds complexity. Pi outsources it. tmux already does it, has a CLI the agent drives via bash. You can attach the same session and co-debug. Pi's own AGENTS.md even teaches the workflow:

tmux new-session -d -s pi-test -x 80 -y 24
tmux send-keys -t pi-test "your prompt" Enter
sleep 3 && tmux capture-pane -t pi-test -p

There's simply no need for background bash. Claude Code can use tmux too, you know. Mario Zechner, on outsourcing complexity to tools that already exist

P11Refuse built-in to-dos.

"They confuse models." Use TODO.md with checkboxes. Same pattern as plan mode: file > internal state. Simple, visible, under your control.

P12YOLO by default; sandbox by container.

The real answer for safety is external — container, VM, or a different tool. Pi's --tools read,grep,find,ls flag exists for restricted read-only mode but Mario explicitly says "You won't be happy with that though."

Everybody is running in YOLO mode anyways to get any productive work done, so why not make it the default? Mario Zechner, on permission popups

why this works

The 8 refusals (P5–P12) are not the absence of features — they ARE the feature. Each refusal saves context tokens, removes a class of failure, and lets the user reach for a better external tool (a file, tmux, a container). Pi's bet: negative space, well-defended, is itself a product decision.

Section C
Architecture principles

P13Separate the layers ruthlessly.

Four packages, lockstep-versioned:

Package	LOC (src)	Responsibility
pi-ai	~30K	Unified LLM API across 25+ providers
pi-agent-core	~8K	Stateful agent loop with tool execution
pi-tui	~11K	Terminal UI with differential rendering
pi-coding-agent	~47K	The actual coding CLI

Each independently publishable. The coding agent is the most opinionated configuration of the underlying stack.

P14Build on raw provider SDKs, not on a meta-SDK.

pi-ai depends on @anthropic-ai/sdk, openai, @google/genai, @aws-sdk/client-bedrock-runtime, @mistralai/mistralai directly. Mario's argument against Vercel AI SDK: leaky abstractions, accumulated baggage, poor tool-call support for self-hosted models. Cost: ~30K LOC. Payoff: exact request/response control per provider.

P15Lazy-load providers.

Even with 25+ providers, you don't pay for SDKs you don't use. See packages/ai/src/providers/register-builtins.ts:

function createLazyStream(loadModule) {
  return (model, context, options) => {
    const outer = new AssistantMessageEventStream();
    loadModule()
      .then(module => forwardStream(outer, module.stream(model, context, options)))
      .catch(error => {
        outer.push({ type: "error", reason: "error", error: msg });
        outer.end(msg);
      });
    return outer; // Returns immediately, load happens async
  };
}

The stream returns immediately; the import happens in the background. Import failure → error event in the stream. Callers never see thrown exceptions.

P16Every streaming event is a delta + a snapshot.

Every AssistantMessageEvent in packages/ai/src/types.ts includes partial: AssistantMessage — the accumulated state. Consumers can render incrementally (using delta) or re-render from scratch (using partial).

P17Streams are async-iterable AND promise-able.

class EventStream<T, R = T> implements AsyncIterable<T> {
  constructor(isComplete, extractResult)
  // for await (event of stream)   →  events
  // await stream.result()         →  final result
}

The same object serves both consumers. See packages/ai/src/utils/event-stream.ts.

P18Cross-provider compatibility via stateful transformation.

packages/ai/src/providers/transform-messages.ts is the unsung hero. When you switch Claude → GPT-5 mid-conversation:

User and toolResult messages pass through unchanged.
Assistant messages from the same provider stay as-is.
Assistant messages from a different provider have their thinking blocks converted to <thinking>...</thinking> text. Signed thinking blobs are dropped if cross-provider.
Tool call IDs are remapped (Anthropic 1–64 chars vs OpenAI 450+ chars vs Google integers).
Orphaned tool calls get synthetic toolResult messages with isError: true.
Errored/aborted assistant messages dropped entirely — replaying them violates API invariants.

P19Abort is first-class.

Every StreamOptions accepts signal?: AbortSignal. It threads through provider SDK calls. On abort:

Provider streams stop yielding events.
Pi captures the partial output.
Emits { type: "error", reason: "aborted", error: outputWithPartialContent }.
The final AssistantMessage has stopReason: "aborted" but includes all content received before abort.

Aborted messages can be re-added to context with "Please continue". Demoed in packages/ai/README.md.

P20Tool validation throws into the event stream, not the program.

TypeBox schemas compiled once and cached in WeakMap. Streaming JSON has a 3-tier fallback (see utils/json-parse.ts). Validation errors become tool results with isError: true — the model self-corrects on the next turn.

the worldview

Errors are values, not exceptions. Provider load fails → event-stream error. Validation fails → tool result with isError: true. Tool runtime throws → tool result. LLM errors → stopReason: "error". The whole stack is un-throwable from the consumer's perspective. You never wrap a pi call in try/catch "just in case" — there is no catch. This isn't a pattern; it's a worldview that shows up in 5+ places.

Section D
Agent-loop principles

P21Copy-on-write state with getter/setter exposure.

In packages/agent/src/agent.ts:

get tools() { return tools; },
set tools(nextTools) { tools = nextTools.slice(); },  // copy on assign
get messages() { return messages; },
set messages(nextMessages) { messages = nextMessages.slice(); },

External code can mutate the array it gets back (agent.state.messages.push(...) works), but assigning a new array always copies. Protection against accidental external mutation, ergonomic API.

P22Subscribers as barriers.

for (const listener of this.listeners) {
  await listener(event, signal);
}

JavaScript's await ordering gives you a barrier mechanism for free. If a listener awaits during a message_end event, the next tool_execution_start event waits. Elegant — no framework-level barrier syntax.

P23Steering and follow-up are two queues with two polling sites.

Steering = "interrupt me while you're working" (Enter while agent is running). Follow-up = "do this after you naturally stop" (Alt+Enter). Each queue has two modes: "one-at-a-time" (default, drain one per turn) and "all" (drain whole queue). The same PendingMessageQueue.drain() implementation handles both. See packages/agent/src/agent-loop.ts.

P24Termination by unanimous vote.

A tool can return { terminate: true }. But the loop only stops if every tool in the batch terminates. Mixed batches continue normally. This is how you implement "notify_done" tools without breaking parallel execution.

P25Tool errors are tool results, not exceptions.

try {
  const result = await tool.execute(...);
  return { result, isError: false };
} catch (error) {
  return {
    result: createErrorToolResult(error.message),
    isError: true,
  };
}

Throws in execute() are caught, turned into tool results, fed back to the model. A single tool failure never crashes the agent.

P26Parallel by default, sequential when needed.

toolExecution: "parallel" (default) preflights tool calls sequentially, then executes allowed ones concurrently. Tool results re-ordered into assistant source order for the transcript. Per-tool override: executionMode: "sequential" forces the entire batch sequential — conservative-promotion-wins prevents race bugs.

P27`transformContext` then `convertToLlm` — two-phase context shaping.

AgentMessage[]
  → transformContext  (optional: prune/inject)   → AgentMessage[]
  → convertToLlm      (required: filter custom)   → Message[]
  → LLM

Extensions extend CustomAgentMessages via declaration merging, then drop or transform them in convertToLlm. Context engineering at two different abstraction levels without coupling them.

Section E
TUI principles

P28Stream-based UI over full-screen.

Mario: "Coding agents have this nice property that they're basically a chat interface…everything is nicely linear, which lends itself well to working with the 'native' terminal emulator." You scroll, copy, search with your terminal's built-in features. You don't reimplement them.

P29Three-strategy differential rendering.

In packages/tui/src/tui.ts:

First render — output all lines, no clear.
Full clear & redraw — width changed, content shrank, or viewport scrolled past visible area.
Normal update — find first differing line, jump cursor there, clear-to-end, render the tail.

All wrapped in synchronized output escapes: CSI ?2026h ... CSI ?2026l. The terminal buffers between begin/end, then flushes atomically. No partial-state flicker.

Critical edge case in tui.ts:1137-1141: if the first changed line is above the viewport (user scrolled back), full clear — you can't update lines you can't see.

P30Components return arrays of strings, not buffers.

interface Component {
  render(width: number): string[];      // each line ≤ width or TUI errors
  handleInput?(data: string): void;
  invalidate?(): void;
}

Three methods. No virtual DOM, no reactive layer, no JSX. Caching by convention. Render is pure: given width, produce lines.

P31IME cursor positioning via APC zero-width markers.

For CJK input methods to show their candidate window at the right place, the hardware cursor has to be positioned at the logical input cursor. Pi solves this with an Application Program Command escape sequence:

const CURSOR_MARKER = "\x1b_pi:c\x07";

The Editor emits this marker at the cursor position. The TUI scans rendered lines for it, computes the column (visibleWidth(line.slice(0, markerIndex))), strips the marker, and positions the hardware cursor there. Zero-width means it doesn't affect width calculations.

P32Bracketed paste + `[paste #N +M lines]` markers.

\x1b[?2004h enables bracketed paste mode — the terminal tells you when input is a paste vs typed. Pastes >10 lines fold into a marker in the editor display; the actual content is held separately. Keeps the editor responsive on huge pastes (think log dumps) without losing the data.

P33Kitty keyboard protocol with graceful fallback.

Kitty's protocol (CSI ?u) sends unambiguous key codes including modifiers — solving the eternal problem that Ctrl+I and Tab are the same byte. Pi queries on startup, falls back to xterm modifyOtherKeys (CSI >4;2m) after 50ms.

P34Kitty/iTerm2 inline images.

The Image component supports the Kitty graphics protocol (Kitty/Ghostty/WezTerm) and iTerm2 inline images, with text-placeholder fallback. Inline screenshot rendering in the chat without an external viewer.

the trick that makes it flicker-free

Synchronized output (CSI ?2026h ... ?2026l) is the single biggest reason pi's TUI feels smooth. The terminal buffers everything between the markers, then paints in one frame. No partial-state flicker. Combine that with the three-strategy differential diff (first-render, full-clear, normal-update) and you get a chat-style UI that updates 60× per second without redrawing the whole screen. Ghostty + iTerm2 + Kitty all support it; modern terminal UIs should just do this.

Section F
Coding-agent principles

P35Sessions as JSONL trees, not linear logs.

Every session is a JSONL file at ~/.pi/agent/sessions/<cwd-hash>/<session-id>.jsonl. Each entry has id + parentId. This is a tree, not a list — branching is in-place.

See packages/coding-agent/src/core/session-manager.ts and docs/session-format.md. /tree lets you navigate to any prior point. /fork creates a new session with parentSession pointer. /clone duplicates the active branch.

The genius: rebuilding context from a leaf is just walking parentId back to root. No special branching logic — graph traversal.

P36Compaction as another tree entry.

{"type":"session","version":3,"id":"...","cwd":"..."}
{"type":"message","id":"a1","parentId":null,"message":{...}}
{"type":"message","id":"a2","parentId":"a1","message":{...}}
{"type":"compaction","id":"a3","parentId":"a2",
  "firstKeptEntryId":"a1","summary":"...","tokensBefore":45000}
{"type":"message","id":"a4","parentId":"a3","message":{...}}

When rebuilding context for an LLM call, the compaction entry says "everything before me, replace with this summary." Full history stays in the file (so /tree can revisit pre-compaction state) but the LLM sees a compressed view. Default keep-recent: 20K tokens; reserve: 16K. See .../compaction/compaction.ts.

P37Context files merged root-down to cwd.

AGENTS.md/CLAUDE.md discovery walks from ~/.pi/agent/ (global) → ancestors → cwd, collecting in reverse so the deepest file is appended last (highest priority). Injected as XML-wrapped sections:

<project_context>
<project_instructions path="/path/to/AGENTS.md">
... content ...
</project_instructions>
</project_context>

See packages/coding-agent/src/core/resource-loader.ts.

P38Skills are content, not tools.

Skills are markdown files with frontmatter, discovered in ~/.pi/agent/skills/<name>/SKILL.md. They're appended to the system prompt as text (only if the read tool is loaded). Not registered as tools. The model decides whether to load and follow a skill. Much lighter-weight than tools. See .../core/skills.ts.

P39Prompt templates with shell-style substitution.

Typing /review focus on auth expands a template. Supports $1, $2, $@, ${@:N}, ${@:N:L}. Non-recursive. Stored as plain markdown. Sharable via pi packages. See .../core/prompt-templates.ts.

P40Extensions are TypeScript modules with a fat API.

export default async function (pi: ExtensionAPI) {
  pi.registerTool({ name: "deploy", ... });
  pi.registerCommand("stats", { ... });
  pi.on("tool_call", async (event, ctx) => { ... });
}

Default export can be async — supports remote model list fetches, etc. Extensions get a fat ExtensionAPI: register tools, commands, providers, event hooks, UI state, full TUI component access. Examples in examples/extensions/.

Security: Pi packages run with full system access. No sandbox. Same trust model as VS Code extensions or shell aliases. The README is explicit: "Review source code before installing third-party packages."

P41Three runtime modes from one binary.

Mode	Use case	Code
`pi`	Interactive TUI	modes/interactive/
`pi --print`	Text streaming, no TUI	modes/print-mode.ts
`pi --mode json`	JSONL event stream	docs/json.md
`pi --mode rpc`	JSON-RPC 2.0 for IDE integration	docs/rpc.md

The RPC mode warning is hard-earned: "Clients must split records on \n only. Do not use generic line readers like Node readline, which also split on Unicode separators inside JSON payloads."

P42SDK as first-class entry.

The same code that powers the CLI is exported as a library:

import {
  AuthStorage, createAgentSession,
  ModelRegistry, SessionManager
} from "@earendil-works/pi-coding-agent";

const authStorage = AuthStorage.create();
const modelRegistry = ModelRegistry.create(authStorage);
const { session } = await createAgentSession({
  sessionManager: SessionManager.inMemory(),
  authStorage,
  modelRegistry,
});
await session.prompt("What files are in the current directory?");

Real-world embed: openclaw/openclaw. The CLI is a thin wrapper over the library. See docs/sdk.md.

the structural payoff

JSONL with parentId pointers gives you branching, time-travel, and compaction for free. It's the same data structure as a git commit graph. /tree = checkout. /fork = branch. Compaction = a non-destructive squash that adds an entry instead of rewriting history. You can grep your entire conversation history, copy a session to a coworker by sending one file, or replay your debugging session for a bug report. The data structure is the feature.

4. Architecture — How the pieces fit

2 min read·4 packages · ~96K LOC

TL;DR: Four packages, lockstep-versioned. The coding agent is the most opinionated configuration of an otherwise reusable stack.

┌─────────────────────────────────────────────────────────────┐
│  pi-coding-agent                                            │
│  - CLI / Interactive / Print / JSON / RPC modes             │
│  - System prompt builder (sub-1000 tokens)                  │
│  - 4 tools: read, write, edit, bash (+ grep, find, ls)      │
│  - Sessions as JSONL trees + compaction                     │
│  - Extensions, skills, prompts, themes loaders              │
│  - Package manager (npm/git installs)                       │
└──────────────┬──────────────────────────────┬───────────────┘
               │ uses                          │ uses
               ▼                               ▼
┌──────────────────────────┐    ┌──────────────────────────┐
│  pi-agent-core           │    │  pi-tui                  │
│  - Agent class           │    │  - TUI render loop       │
│  - agentLoop             │    │  - Differential render   │
│  - beforeToolCall /      │    │  - Components (Editor,   │
│    afterToolCall hooks   │    │    Markdown, ...)        │
│  - Custom message types  │    │  - Overlays, IME cursor  │
│  - streamProxy for       │    │  - Kitty protocol +      │
│    browser/headless      │    │    fallback              │
└─────────────┬────────────┘    └──────────────────────────┘
              │ uses
              ▼
┌──────────────────────────────────────────────────────────────┐
│  pi-ai                                                       │
│  - getModel / stream / complete                              │
│  - 25+ providers via lazy-loaded registry                    │
│  - Unified Context / Message / AssistantMessage types        │
│  - AssistantMessageEventStream                               │
│  - TypeBox tool schemas + cached validators                  │
│  - Cross-provider message transformation                      │
│  - AbortSignal throughout, partial results on abort           │
│  - Token + cost tracking per message                          │
└──────────────────────────────────────────────────────────────┘

What happens on a single user message

User hits Enter in the Editor component.
pi-coding-agent's interactive mode calls agent.prompt(text, images).
Agent enqueues the message and calls runAgentLoop.
The loop emits agent_start, turn_start, message_start/end for the user prompt.
transformContext(messages) runs (extensions can prune here).
convertToLlm(messages) filters custom message types.
streamFn(model, context, options) — by default pi-ai's streamSimple.
pi-ai looks up model.api in the lazy registry, imports the provider, calls the SDK, returns an AssistantMessageEventStream.
Events stream in (text_delta, toolcall_delta, thinking_delta). Agent emits message_update with each one. TUI components update live.
On stream end: if tool calls, validate args via TypeBox, run beforeToolCall, execute tools, emit tool_execution_*, run afterToolCall, push toolResult messages.
Check steering queue (if user typed mid-execution). Check follow-up queue. Eventually emit agent_end.
SessionManager appends each new message as a JSONL entry.
TUI re-renders only the changed lines via synchronized output.

5. Detailed Architecture Diagrams

6 min skim·9 diagrams

TL;DR: Nine annotated diagrams — system map, request lifecycle, agent-loop state machine, streaming pipeline, tool execution, session tree, TUI rendering, startup, data model. Read these instead of the code.

Section 4 was the bird's-eye view. This section unpacks pi at the level of files, functions, and event flows. Nine diagrams, each annotated. Drag horizontally on narrow screens.

Color key: pi-coding-agent pi-agent-core pi-tui pi-ai external

5.1System map: packages, submodules, external boundaries

Where every major file lives. Each subgraph is a package; nodes list the source files or directories. External systems (LLM providers, the terminal, the filesystem) sit on the periphery.

System Map

flowchart · LR

flowchart LR
    classDef ai fill:#1a0f1c,stroke:#ff7a90,color:#e6e6e8
    classDef ac fill:#0f1820,stroke:#6ec1e4,color:#e6e6e8
    classDef tui fill:#11201a,stroke:#b8e986,color:#e6e6e8
    classDef ca fill:#1f1611,stroke:#ffb454,color:#e6e6e8
    classDef ext fill:#0c0c0e,stroke:#8a8a96,color:#a9a9b3,stroke-dasharray: 4 3

    subgraph EXT["external systems"]
      direction TB
      LLM["LLM servers\n(Anthropic, OpenAI, Google,\nVertex, Bedrock, Mistral,\nGroq, xAI, Cerebras, ...)"]:::ext
      TERM["terminal\nstdin/stdout · ANSI ·\nKitty protocol · CSI 2026"]:::ext
      FS["filesystem\n~/.pi/agent/\n  sessions/ · auth/ ·\n  extensions/ · skills/ ·\n  prompts/ · themes/ ·\n  models.json\n.pi/ (project-local)\nAGENTS.md / CLAUDE.md"]:::ext
    end

    subgraph CA["pi-coding-agent · 47K LOC"]
      direction TB
      CA_CLI["cli.ts · cli/args.ts\nflag parsing, mode resolve"]:::ca
      CA_MAIN["main.ts\nstartup orchestration"]:::ca
      CA_MODES["modes/\ninteractive · print · json · rpc"]:::ca
      CA_CORE["core/\nsystem-prompt · session-manager ·\ncompaction · auth-storage ·\nresource-loader · model-registry ·\nprompt-templates · skills · footer"]:::ca
      CA_TOOLS["core/tools/\nread · write · edit · bash\n(+ grep · find · ls)"]:::ca
      CA_EXT["core/extensions/\ndynamic loader + ExtensionAPI"]:::ca
    end

    subgraph AC["pi-agent-core · 8K LOC"]
      direction TB
      AC_AGENT["agent.ts\nAgent class · MutableAgentState\nsteering + followUp queues"]:::ac
      AC_LOOP["agent-loop.ts\nrunAgentLoop · executeToolCalls\nstreamAssistantResponse"]:::ac
      AC_PROXY["proxy.ts\nstreamProxy for browser/headless"]:::ac
      AC_TYPES["types.ts\nAgentEvent · AgentTool ·\nAgentMessage · CustomAgentMessages"]:::ac
    end

    subgraph TUI["pi-tui · 11K LOC"]
      direction TB
      TUI_TUI["tui.ts\nrender loop · diff · overlays\nIME cursor (APC marker)"]:::tui
      TUI_COMP["components/\neditor · markdown · input ·\nselect-list · settings-list ·\nimage · loader · box · ..."]:::tui
      TUI_TERM["terminal.ts\nProcessTerminal / VirtualTerminal"]:::tui
      TUI_INF["stdin-buffer · keys ·\nkeybindings · autocomplete ·\nkill-ring · undo-stack"]:::tui
      TUI_UTIL["utils.ts\nvisibleWidth · truncateToWidth ·\nwrapTextWithAnsi · sliceByColumn"]:::tui
    end

    subgraph AI["pi-ai · 30K LOC"]
      direction TB
      AI_PUB["index · stream · complete\n(public surface)"]:::ai
      AI_TYPES["types.ts\nContext · Message ·\nAssistantMessage · events ·\nModel"]:::ai
      AI_REG["api-registry +\nproviders/register-builtins\n(memoized lazy import)"]:::ai
      AI_PROV["providers/\nanthropic · openai-completions\nopenai-responses · codex\ngoogle · vertex · bedrock\nmistral · azure · ..."]:::ai
      AI_TRANS["providers/transform-messages\ncross-provider replay rewrites"]:::ai
      AI_MODELS["models + models.generated\n(provider, modelId) → Model"]:::ai
      AI_UTIL["utils/\nevent-stream · validation ·\njson-parse · oauth"]:::ai
    end

    CA_MAIN --> CA_MODES
    CA_MAIN --> CA_CORE
    CA_MAIN --> CA_EXT
    CA_MAIN --> CA_TOOLS
    CA_MODES --> AC_AGENT
    CA_MODES --> TUI_TUI
    CA_EXT --> AC_AGENT
    AC_AGENT --> AC_LOOP
    AC_LOOP --> CA_TOOLS
    AC_LOOP --> AI_PUB
    AI_PUB --> AI_REG
    AI_REG --> AI_PROV
    AI_PROV --> AI_TRANS
    AI_PROV --> LLM
    TUI_TUI --> TUI_TERM
    TUI_TUI --> TUI_COMP
    TUI_COMP --> TUI_INF
    TUI_COMP --> TUI_UTIL
    TUI_TERM --> TERM
    CA_CORE --> FS

Read it as: pi-coding-agent drives — it parses flags, picks a mode, assembles an Agent backed by pi-agent-core, drawing tools from core/tools/ and UI from pi-tui. pi-ai is the bottom layer that speaks to every provider; transform-messages.ts is the bridge that lets you swap providers mid-conversation.

5.2End-to-end request lifecycle

What happens between a user keystroke and a fully-rendered assistant turn, including the streaming inner loop, tool execution, and session persistence. 30+ numbered steps.

Request Lifecycle

sequence

sequenceDiagram
    autonumber
    participant U as User
    participant T as Terminal + StdinBuffer
    participant Ed as Editor
    participant App as InteractiveMode
    participant Ag as Agent
    participant L as agentLoop
    participant AI as pi-ai
    participant P as Provider
    participant LLM as LLM server
    participant S as SessionManager

    U->>T: keystroke
    T->>Ed: demux + handleInput
    Note right of Ed: update buffer, autocomplete,\nbracketed paste, kill-ring/undo
    U->>T: Enter
    Ed->>App: onSubmit(text, images)
    App->>Ag: agent.prompt(text, images)
    Ag->>L: runAgentLoop(userMsg, context, config)
    L->>Ag: emit agent_start
    L->>Ag: emit turn_start
    L->>Ag: emit message_start / end (user)
    Ag->>S: append user msg to JSONL

    rect rgba(255,180,84,0.05)
    Note over L: TURN N
    L->>L: drain pending steering messages
    L->>L: transformContext + convertToLlm
    L->>AI: streamFn(model, context, options)
    AI->>AI: resolveApiProvider(model.api)
    AI->>P: lazy import → stream(...)
    P->>P: buildParams (transformMessages,\ntool-ID norm, cache_control)
    P->>LLM: POST messages stream:true + AbortSignal
    loop SSE streaming
        LLM-->>P: SSE event
        P->>P: parse, build content blocks,\npartial JSON for tool args
        P-->>AI: standardized event
        AI-->>L: AssistantMessageEvent
        L-->>Ag: message_update + delta
        Ag-->>App: subscribers await (barrier)
        App->>App: TUI requestRender → diff →\nsynchronized output → terminal
    end
    LLM-->>P: message_stop
    P-->>AI: done + final AssistantMessage
    L-->>Ag: emit message_end (assistant)
    Ag->>S: append assistant to JSONL
    end

    alt assistant has tool calls
        rect rgba(110,193,228,0.05)
        L->>L: executeToolCalls
        loop each tool call (parallel by default)
            L->>L: validateToolCall (TypeBox)
            L->>L: beforeToolCall hook
            L-->>Ag: emit tool_execution_start
            L->>L: tool.execute(id, args, signal, onUpdate)
            L-->>Ag: tool_execution_update (if onUpdate)
            L->>L: afterToolCall hook
            L-->>Ag: emit tool_execution_end
        end
        L->>L: build ToolResultMessage[]
        Ag->>S: append toolResults to JSONL
        L-->>Ag: emit turn_end (with toolResults)
        end
    else no tool calls
        L-->>Ag: emit turn_end
    end

    L->>L: poll steeringQueue.drain()
    L->>L: if empty, poll followUpQueue.drain()
    Note over L: both empty AND no more tool calls → exit
    L-->>Ag: emit agent_end
    Ag-->>App: prompt() promise resolves
    App->>App: footer recompute (tokens, cost, ctx%)

Notice: three nested loops — SSE streaming inside the LLM call (orange band), tool execution after the assistant message (blue band), and the outer turn/steering loop that wraps everything. Every event is also persisted to the JSONL session file as it happens.

5.3Agent loop state machine

The same flow, rendered as a state machine to expose all the transitions and termination conditions in one view. This is the precise control structure inside agent-loop.ts.

Agent Loop State Machine

state · TB

stateDiagram-v2
    direction TB
    [*] --> agent_start
    agent_start --> turn_start : emit agent_start
    turn_start --> inject_pending : drain steering + pending
    inject_pending --> stream_llm : emit message_start/end\nfor each pending
    stream_llm --> stream_llm : delta events\n(text · thinking · toolcall)
    stream_llm --> on_done : done event
    on_done --> exec_tools : stopReason = toolUse
    on_done --> turn_end : stopReason = stop / length
    on_done --> agent_end_err : stopReason = error / aborted

    state exec_tools {
        direction TB
        [*] --> validate
        validate --> preflight : args ok
        validate --> error_result : TypeBox fail
        preflight --> execute : ok
        preflight --> error_result : blocked
        execute --> after_hook : success
        execute --> error_result : threw
        error_result --> after_hook
        after_hook --> [*]
    }
    exec_tools --> append_results : all tools done
    append_results --> turn_end : emit toolResult msgs

    turn_end --> poll_steering : emit turn_end
    poll_steering --> inject_pending : has steering
    poll_steering --> turn_start : has more tool calls\n(continue same turn)
    poll_steering --> poll_followup : both empty
    poll_followup --> inject_pending : has follow-up
    poll_followup --> agent_end : empty
    agent_end --> [*]
    agent_end_err --> [*]

The two queues show up clearly: poll_steering right after each turn (interrupts), and poll_followup only when steering is empty (continuations). The inner exec_tools state never throws — every failure path leads to error_result → after_hook, which becomes a tool result with isError: true.

5.4LLM streaming pipeline (close-up of one provider call)

Zoom in on what happens inside pi-ai for a single LLM call. From unified Context → provider-specific JSON → SSE → standardized events → consumer.

Streaming Pipeline

flowchart · TB

flowchart TB
    classDef ai fill:#1a0f1c,stroke:#ff7a90,color:#e6e6e8
    classDef ext fill:#0c0c0e,stroke:#8a8a96,color:#a9a9b3,stroke-dasharray: 4 3
    classDef out fill:#1f1611,stroke:#ffb454,color:#e6e6e8
    classDef branch fill:#0a0a0c,stroke:#6ec1e4,color:#e6e6e8

    A["Context\n{ systemPrompt, messages, tools }"]:::ai
    A --> T["transformMessages(messages, model)\n• thinking → text on cross-model replay\n• tool ID remap\n• drop errored / aborted\n• inject synthetic toolResults"]:::ai
    T --> B["provider.buildParams(...)\n• provider-specific JSON shape\n• cache_control markers (Anthropic)\n• tool schema serialization"]:::ai
    B --> C["+ AbortSignal\n+ request options (timeout, retries)"]:::ai
    C --> SDK["provider SDK\ne.g. anthropic.messages.create({stream:true})"]:::ai
    SDK --> H["HTTP POST + Server-Sent Events"]:::ext
    H --> LLM["LLM server"]:::ext
    LLM -.->|SSE chunks| H
    H -.->|raw events| SDK
    SDK -.->|RawProviderEvent| ITER["iterateXxxEvents(response, signal)\n(per-provider iterator)"]:::ai
    ITER --> CB{"event
type?"}:::branch
    CB -->|message_start| US["read input / output / cache tokens\n→ output.usage"]:::ai
    CB -->|content_block_start| BS["create Text / Thinking / ToolCall block\nat the next content position"]:::ai
    CB -->|text_delta| TD["append block.text\npush text_delta event\nwith delta + partial + contentIndex"]:::ai
    CB -->|input_json_delta| JD["append to partialJson\nparseStreamingJson (3-tier fallback)\npush toolcall_delta event"]:::ai
    CB -->|thinking_delta| THD["append block.thinking\npush thinking_delta event"]:::ai
    CB -->|message_stop| MS["finalize usage\ncalculateCost(model, usage)\npush 'done' event"]:::ai

    US --> ES
    BS --> ES
    TD --> ES
    JD --> ES
    THD --> ES
    MS --> ES

    ES["AssistantMessageEventStream\nasync-iterable + .result()"]:::out
    ES --> CONS["consumer\n(agentLoop / app)"]:::out

    SDK -.->|signal.aborted| AB["catch in iterator"]:::ai
    AB --> ABF["output.stopReason = 'aborted'\npush 'error' event\nwith reason = 'aborted'"]:::ai
    ABF --> ABE["return partial AssistantMessage\n(content + usage so far)"]:::out

Every standardized event carries partial: AssistantMessage — the accumulated state. So even if you only listen for text_delta, you can always read the current full message. On abort, the catch path produces an 'error' event with reason='aborted' AND a complete AssistantMessage with the partial content. Aborted runs are continuable.

5.5Tool execution detail

What happens between "the LLM emitted a tool call" and "the next turn starts with the result." Includes validation, hooks, parallel execution, streaming updates, and error handling.

Tool Execution

sequence

sequenceDiagram
    autonumber
    participant L as agentLoop
    participant V as validate (TypeBox)
    participant BH as beforeToolCall
    participant T as tool.execute
    participant AH as afterToolCall
    participant Sub as subscribers (TUI · session)

    Note over L: assistant message has tool calls
    L->>L: gather tool calls in source order
    L->>L: pick executionMode\n(parallel vs sequential)

    par for each tool call
        L->>V: validateToolCall(tools, call)
        V-->>L: validated args OR throws
        alt validation failed
            L->>L: createErrorToolResult(msg)
            L-->>L: skip preflight + execute
        end
        L->>BH: { assistantMessage, toolCall,\nargs, context, signal }
        BH-->>L: { block?, reason? }
        alt block
            L->>L: createErrorToolResult(reason)
            L-->>L: skip execute
        end
        L->>Sub: emit tool_execution_start
        L->>T: execute(id, args, signal, onUpdate)
        activate T
        loop optional streaming
            T->>Sub: onUpdate(partialResult)
            Sub-->>T: tool_execution_update emitted
        end
        T-->>L: { content, details, terminate? }\nOR throws
        deactivate T
        alt threw
            L->>L: createErrorToolResult(err.message)
        end
        L->>AH: { toolCall, args, result,\nisError, context }
        AH-->>L: optional override\n{ content, details, isError, terminate }
        L->>Sub: emit tool_execution_end\n{ id, name, result, isError }
    end

    L->>L: build ToolResultMessage[]\n(content + isError + timestamp)
    L->>L: terminate = every result.terminate
    L->>Sub: append toolResult msgs\n(message_start/end events)
    L->>Sub: emit turn_end\n{ message, toolResults }

The "errors are values" pattern shows up in three places: invalid args, blocked-by-hook, and thrown-from-execute all funnel into createErrorToolResult, which becomes a normal tool result with isError: true. The agent loop never throws. The model self-corrects on the next turn.

5.6Session JSONL tree (data structure)

How conversations are persisted. Each entry has an id and parentId, forming a tree. Branches happen in-place via /tree. Compaction is just another entry pointing at firstKeptEntryId.

Session JSONL Tree

ascii · data

~/.pi/agent/sessions/<cwd-hash>/<session-id>.jsonl

┌────────────────────────────────────────────────────────────────┐
│ {"type":"session", "version":3, "id":"<uuid>",                 │
│  "cwd":"/home/me/proj", "parentSession":null,                  │
│  "timestamp":"2026-05-21T..."}                                 │
└────────────────────────────────────────────────────────────────┘
                              │  parentId pointer
                              ▼
┌────────────────────────────────────────────────────────────────┐
│ id="a1"  parentId=null      role=user                          │
│ "fix the failing test in src/foo.ts"                           │
└────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────────┐
│ id="a2"  parentId="a1"      role=assistant                     │
│ content: [thinking, text, toolCall(read), toolCall(grep)]      │
│ usage:   { input:1234, output:567, cost:0.0042 }               │
└────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────────┐
│ id="a3"  parentId="a2"      role=toolResult                    │
│ toolCallId="...", content:[{type:"text", text:"file contents"}]│
└────────────────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┴────────────────────┐
            ▼                                      ▼
┌──────────────────────────────┐    ┌──────────────────────────────┐
│ id="a4" parentId="a3"        │    │ id="b1" parentId="a3"        │  ← branch
│ role=assistant               │    │ role=assistant               │  (you used /tree
│ "Found root cause line 42"   │    │ "Let me refactor first"      │   to re-anchor)
└──────────────────────────────┘    └──────────────────────────────┘
            │                                      │
            ▼                                      ▼
          ....                                   ....
            │
            ▼
┌────────────────────────────────────────────────────────────────┐
│ id="c1"  parentId="..."  type=compaction                       │
│ firstKeptEntryId="a3"  tokensBefore=45000  tokensAfter=4200    │
│ summary: "## Accomplishments\n- ...\n## Files Modified\n..."   │
│ filesModified: ["src/foo.ts", ...]                             │
│ filesRead:     [...]                                           │
└────────────────────────────────────────────────────────────────┘
                              │
                              ▼   on next replay, LLM sees:
                                  [synthetic summary, post-compaction msgs]

═══════════════════════════════════════════════════════════════════════════════

  RULES
  ─────
  • Append-only — old entries are NEVER rewritten
  • Active conversation = walk parentId from current leaf back to root
  • /tree picks any prior entry to re-anchor; next prompts get that as parentId
  • /fork copies the active path into a NEW file with parentSession=<src-id>
  • /clone duplicates the active branch into a new file in the same dir
  • Compaction is a NEW entry — pre-compaction history remains on disk and is
    reachable via /tree
  • Schema migrations (v1→v2→v3) handled in session-manager.ts on load

The magic is in the parentId field. A single file represents the entire conversation graph. "Continue the chat" = append a new entry pointing at the current leaf. "Go back and try again" = pick an earlier id, prompt, and the new entry points at it. Compaction is non-destructive — you can always rewind.
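
To make the parentId walk concrete, here is a minimal sketch of rebuilding the active conversation from a JSONL body. The `SessionEntry` shape, the helper names, and the sample text are assumptions for illustration; real pi entries carry more fields (role, content, usage, and so on).

```typescript
// Hypothetical minimal entry shape; real session entries have more fields.
interface SessionEntry {
  id: string;
  parentId: string | null;
  text: string;
}

// Parse a JSONL body: one JSON object per non-empty line.
function parseJsonl(body: string): SessionEntry[] {
  return body
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as SessionEntry);
}

// Walk parentId from the chosen leaf back to the root, then reverse
// so the result reads in conversation order.
function activePath(entries: SessionEntry[], leafId: string): SessionEntry[] {
  const byId = new Map<string, SessionEntry>();
  for (const e of entries) byId.set(e.id, e);
  const path: SessionEntry[] = [];
  let cursor: string | null = leafId;
  while (cursor !== null) {
    const entry: SessionEntry | undefined = byId.get(cursor);
    if (!entry) throw new Error(`unknown entry id: ${cursor}`);
    path.push(entry);
    cursor = entry.parentId;
  }
  return path.reverse();
}

// Same shape as the diagram: a4 and b1 both branch from a3.
const sessionBody = [
  '{"id":"a1","parentId":null,"text":"fix the bug"}',
  '{"id":"a2","parentId":"a1","text":"reading files"}',
  '{"id":"a3","parentId":"a2","text":"file contents"}',
  '{"id":"a4","parentId":"a3","text":"Found root cause line 42"}',
  '{"id":"b1","parentId":"a3","text":"Let me refactor first"}',
].join("\n");

const sessionEntries = parseJsonl(sessionBody);
const mainBranch = activePath(sessionEntries, "a4").map((e) => e.id);
const altBranch = activePath(sessionEntries, "b1").map((e) => e.id);
```

Both branches share a1 → a2 → a3 and differ only in the leaf. Nothing in the file was rewritten to get either view, which is the whole point of append-only.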

5.7 TUI rendering pipeline

What happens inside tui.ts between a requestRender() call and characters reaching the terminal. The three-strategy diff is the heart of it.

TUI Rendering Pipeline

flowchart TB
    classDef tui fill:#11201a,stroke:#b8e986,color:#e6e6e8
    classDef out fill:#1f1611,stroke:#ffb454,color:#e6e6e8
    classDef dec fill:#0c0c0e,stroke:#6ec1e4,color:#e6e6e8

    REQ["requestRender(force?)\ncalled by any component\nor external state change"]:::tui
    REQ --> DEB["debounce ≈ 16 ms\nMIN_RENDER_INTERVAL_MS"]:::tui
    DEB --> DO["doRender()"]:::tui
    DO --> WALK["walk Component tree\nrender(width): string[]\nfor each child"]:::tui
    WALK --> COMP["composite overlays by focusOrder\ncompositeLineAt(base, overlay, ...)"]:::tui
    COMP --> CUR["extractCursorPosition\nscan lines for CURSOR_MARKER (APC)\nstrip + record row/col"]:::tui
    CUR --> RST["append ANSI SGR reset +\nOSC 8 reset at end of each line"]:::tui
    RST --> DEC{"render\nstrategy"}:::dec

    DEC -->|first render| FIRST["output all lines\n(no clear)"]:::out
    DEC -->|width changed| FULL["clear scrollback + screen\nCSI 2J + 3J + H\noutput all lines"]:::out
    DEC -->|height changed| FULL
    DEC -->|"content shrank · no overlays"| FULL
    DEC -->|"firstChanged < viewportTop"| FULL
    DEC -->|normal update| DIFF["find firstChanged / lastChanged\nexpand for Kitty image IDs\ncursor → firstChanged\nclear-to-end · output tail"]:::out

    FIRST --> WRAP
    FULL --> WRAP
    DIFF --> WRAP

    WRAP["wrap in synchronized output:\nCSI ?2026h ... CSI ?2026l"]:::out
    WRAP --> WRITE["terminal.write(buffer)"]:::out
    WRITE --> HW["position hardware cursor\nat extracted row/col\nshow / hide cursor"]:::out
    HW --> STATE["update previousLines,\npreviousWidth, previousHeight,\npreviousViewportTop,\nhardwareCursorRow,\nmaxLinesRendered"]:::tui

The atomic-update trick: wrapping everything in CSI ?2026h ... ?2026l tells modern terminals (Ghostty, iTerm2, Kitty) to buffer all writes between the markers and flush them in one paint. No partial-state flicker. The IME trick: components emit a zero-width APC marker at the logical cursor; the TUI strips it and positions the hardware cursor there. CJK candidate windows then appear in the right place.
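
The strategy decision in the flowchart can be sketched as a small pure function. Every name here (`FrameState`, `chooseStrategy`, `firstChangedIndex`) is hypothetical; only the order of the checks follows the diagram, and the real tui.ts tracks more state than this.

```typescript
// Hypothetical names; only the decision order mirrors the flowchart.
type RenderStrategy = "first" | "full" | "diff";

interface FrameState {
  lines: string[];
  width: number;
  height: number;
}

// Index of the first line that differs between two frames.
function firstChangedIndex(prev: string[], next: string[]): number {
  const shared = Math.min(prev.length, next.length);
  for (let i = 0; i < shared; i++) {
    if (prev[i] !== next[i]) return i;
  }
  // No shared line differs: any change starts where the shorter frame ends.
  return shared;
}

function chooseStrategy(
  prev: FrameState | null,
  next: FrameState,
  viewportTop: number,
  hasOverlays: boolean,
): RenderStrategy {
  if (prev === null) return "first"; // nothing on screen yet
  if (prev.width !== next.width || prev.height !== next.height) return "full";
  if (next.lines.length < prev.lines.length && !hasOverlays) return "full";
  // A change above the visible viewport cannot be patched in place.
  if (firstChangedIndex(prev.lines, next.lines) < viewportTop) return "full";
  return "diff";
}

const frame: FrameState = { lines: ["a", "b", "c"], width: 80, height: 24 };
const onFirst = chooseStrategy(null, frame, 0, false);
const onResize = chooseStrategy(frame, { ...frame, width: 100 }, 0, false);
const onEdit = chooseStrategy(frame, { ...frame, lines: ["a", "b", "X"] }, 0, false);
const onShrink = chooseStrategy(frame, { ...frame, lines: ["a", "b"] }, 0, false);
const aboveViewport = chooseStrategy(frame, { ...frame, lines: ["Z", "b", "c"] }, 1, false);
```

Keeping the decision pure (inputs in, strategy out) makes the expensive full redraw easy to test without a terminal attached.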

5.8 Startup & resource loading lifecycle

The sequence of work between the pi binary launching and the interactive editor being ready to accept input. ~20 numbered steps.

Startup Lifecycle

sequenceDiagram
    autonumber
    participant Bin as pi binary
    participant Args as cli/args
    participant Main as main.ts
    participant Auth as AuthStorage
    participant Set as SettingsManager
    participant Res as ResourceLoader
    participant MR as ModelRegistry
    participant SM as SessionManager
    participant Ctx as ContextFiles
    participant SP as system-prompt
    participant Ext as Extensions
    participant Ag as Agent
    participant TUI as TUI

    Bin->>Args: parse argv
    Args-->>Main: { mode, model, tools, session, ... }
    Main->>Main: resolve mode\n(interactive / print / json / rpc)

    Main->>Auth: load credentials\n(~/.pi/agent/auth/ + .pi/auth/)
    Main->>Set: load global + project\nsettings.json (deep merge)
    Main->>MR: register built-in models\n+ ~/.pi/agent/models.json (custom)

    Main->>Res: discover extensions\n(~/.pi/agent/extensions/,\n.pi/extensions/, pi packages)
    Main->>Res: discover skills\n(~/.pi/agent/skills/, .pi/skills/)
    Main->>Res: discover prompt templates
    Main->>Res: discover themes

    Main->>Ctx: walk AGENTS.md / CLAUDE.md\n(global → ancestors → cwd,\ndeepest last for priority)

    Main->>SM: load session\n(--continue / --resume /\n--session / new)
    SM-->>Main: AgentMessage[] from JSONL tree\n(walk parentId leaf → root)

    loop each extension
        Main->>Ext: import(path)
        Ext-->>Main: default(pi: ExtensionAPI)
        Note over Ext: registerTool / registerCommand /\nregisterProvider / on(event) /\ninstallTUIComponent
        Main->>Main: await (factory may be async)
    end

    Main->>SP: build system prompt\n(tools + guidelines + context\nfiles + skills + date + cwd)
    SP-->>Main: prompt text (~700–1000 tokens)

    Main->>Ag: new Agent({ initialState,\nconvertToLlm, transformContext,\ntools, hooks, ... })

    alt interactive mode
        Main->>TUI: start ProcessTerminal\n(raw mode, bracketed paste,\nKitty protocol query)
        Main->>TUI: addChild(Editor, Footer, ...)
        Main->>TUI: setFocus(Editor)
        Main->>Ag: agent.subscribe(uiListener)
        TUI-->>Bin: ready, prompt loop
    else print / json / rpc
        Main->>Ag: programmatic prompt flow
    end

Order matters. Settings are loaded before extensions (so extensions can read them). Extensions are loaded before the system prompt (so the tools they register show up in the prompt's tools list). Context files are merged deepest-last so cwd-local rules win. Async extension factories are awaited — that's how an extension can fetch remote model lists before pi displays them.
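
Here is a rough sketch of why the ordering works, assuming a much smaller API than pi's real ExtensionAPI. `ToolSpec`, `ExtensionApiSketch`, `loadExtensions`, and `buildToolsSection` are invented for illustration. The point is only that each factory is awaited before the prompt text is built.

```typescript
// Hypothetical, trimmed-down stand-ins for pi's extension API.
interface ToolSpec {
  name: string;
  description: string;
}

interface ExtensionApiSketch {
  registerTool(tool: ToolSpec): void;
}

type ExtensionFactory = (pi: ExtensionApiSketch) => void | Promise<void>;

// Run every factory to completion, in order, before returning the tool list.
async function loadExtensions(
  factories: ExtensionFactory[],
  builtIn: ToolSpec[],
): Promise<ToolSpec[]> {
  const tools = [...builtIn];
  const api: ExtensionApiSketch = {
    registerTool: (tool) => {
      tools.push(tool);
    },
  };
  for (const factory of factories) {
    // Awaiting lets an async factory finish its setup (a remote fetch, say).
    await factory(api);
  }
  return tools;
}

// Built only after loading, so extension tools appear in the prompt.
function buildToolsSection(tools: ToolSpec[]): string {
  return tools.map((t) => `- ${t.name}: ${t.description}`).join("\n");
}

// An extension that registers its tool only after some async work.
const slowExtension: ExtensionFactory = async (pi) => {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  pi.registerTool({ name: "deploy", description: "hypothetical extension tool" });
};
```

If the loop did not await, `buildToolsSection` could run before `slowExtension` registered anything, and the model would never hear about the tool.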

5.9 Data model: Context / Message / Event types

The relationships among the core types in packages/ai/src/types.ts. Everything you put on the wire and everything you get back maps to this structure.

Data Model

classDiagram
    direction LR
    class Context {
        +systemPrompt?
        +messages: Message[]
        +tools?: Tool[]
    }
    class Message {
        <<union>>
        UserMessage · AssistantMessage · ToolResultMessage
    }
    class UserMessage {
        +role: user
        +content
        +timestamp?
    }
    class AssistantMessage {
        +role: assistant
        +content: AssistantContentBlock[]
        +api · provider · model
        +responseModel?
        +usage: Usage
        +stopReason
        +errorMessage?
        +responseId?
    }
    class ToolResultMessage {
        +role: toolResult
        +toolCallId
        +toolName
        +content
        +isError
    }
    class AssistantContentBlock {
        <<union>>
        TextContent · ThinkingContent · ToolCall
    }
    class TextContent {
        +type: text
        +text
    }
    class ThinkingContent {
        +type: thinking
        +thinking
        +thinkingSignature?
        +redacted?
    }
    class ToolCall {
        +type: toolCall
        +id · name · arguments
        +partialJson?
        +thoughtSignature?
    }
    class Tool {
        +name · description
        +parameters: TSchema (TypeBox)
    }
    class AgentTool {
        +label
        +executionMode?
        +execute(id, args, signal, onUpdate)
        +prepareArguments?
    }
    class Model {
        +id · api · provider
        +baseUrl?
        +contextWindow · maxTokens
        +input · reasoning
        +cost: ModelCost
        +thinkingLevelMap?
        +compat?
    }
    class Usage {
        +input · output
        +cacheRead · cacheWrite
        +totalTokens
        +cost: CostBreakdown
    }
    class AssistantMessageEvent {
        <<union>>
        start · text_* · thinking_* · toolcall_* · done · error
    }

    Context "1" *-- "*" Message
    Context "1" *-- "*" Tool
    Message <|-- UserMessage
    Message <|-- AssistantMessage
    Message <|-- ToolResultMessage
    AssistantMessage "1" *-- "*" AssistantContentBlock
    AssistantContentBlock <|-- TextContent
    AssistantContentBlock <|-- ThinkingContent
    AssistantContentBlock <|-- ToolCall
    AssistantMessage "1" *-- "1" Usage
    Tool <|-- AgentTool
    AgentTool ..> ToolCall : invoked via
    ToolCall ..> ToolResultMessage : answered by

This is the entire surface of pi-ai. Everything else (providers, transforms, streaming) maps to and from these types. AssistantMessage.content is polymorphic by design — different providers produce different sequences of Text / Thinking / ToolCall blocks, but pi normalizes the events so consumers don't have to care.
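
In TypeScript terms, the diagram boils down to a couple of discriminated unions. The sketch below is an abbreviated mirror, not the real types.ts: field lists are trimmed, content is simplified to strings, and `pendingToolCalls` is a made-up helper showing how a consumer narrows on `role` and `type`.

```typescript
// Abbreviated mirror of the diagram; the real types carry more fields.
type AssistantBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "toolCall"; id: string; name: string; arguments: Record<string, unknown> };

type ChatMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: AssistantBlock[] }
  | { role: "toolResult"; toolCallId: string; toolName: string; content: string; isError: boolean };

// Hypothetical helper: ids of tool calls that have no toolResult yet.
function pendingToolCalls(messages: ChatMessage[]): string[] {
  const answered = new Set<string>();
  const called: string[] = [];
  for (const m of messages) {
    if (m.role === "toolResult") answered.add(m.toolCallId);
    if (m.role === "assistant") {
      for (const block of m.content) {
        // Narrowing on `type` gives typed access to `id` with no casts.
        if (block.type === "toolCall") called.push(block.id);
      }
    }
  }
  return called.filter((id) => !answered.has(id));
}

const transcript: ChatMessage[] = [
  { role: "user", content: "why does the build fail?" },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "check the config first" },
      { type: "toolCall", id: "t1", name: "read", arguments: { path: "package.json" } },
      { type: "toolCall", id: "t2", name: "bash", arguments: { command: "npm test" } },
    ],
  },
  { role: "toolResult", toolCallId: "t1", toolName: "read", content: "{}", isError: false },
];

const stillPending = pendingToolCalls(transcript);
```

The unions do the work: the compiler rejects access to `toolCallId` on a user message, so consumer code stays free of runtime shape checks.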

Code refs for each diagram:

5.1 → top-level layout in packages/
5.2, 5.3, 5.5 → packages/agent/src/agent-loop.ts · packages/agent/src/agent.ts
5.4 → providers/anthropic.ts · utils/event-stream.ts · utils/json-parse.ts
5.6 → session-manager.ts · docs/session-format.md
5.7 → packages/tui/src/tui.ts
5.8 → main.ts · resource-loader.ts
5.9 → packages/ai/src/types.ts

6. Cross-cutting Design Patterns

2 min read·7 patterns

TL;DR: Seven design patterns that recur across pi's codebase — errors-as-values, late binding, lazy loading, two-phase context shaping, partial state, snapshot+delta, append-only persistence.

Patterns I noticed in 3+ places across the codebase:

CC1 Errors as values, not exceptions

Provider load failures → event-stream errors. Tool validation failures → tool results with isError: true. Tool runtime errors → tool results. LLM errors → AssistantMessage with stopReason: "error". From the consumer's perspective the whole stack never throws: you handle errors by checking stopReason and isError, not with try/catch.

CC2 Late binding via registries

Providers are registered via string keys (apiProviderRegistry). Models are looked up by (provider, modelId). Tools are registered by name. Extensions register more tools, commands, and providers. Skills, prompts, themes: all looked up by name at runtime. No hard-coded glue between layers.

CC3 Lazy by default

Provider SDKs lazy-imported. Validator instances cached in WeakMap. Markdown rendering cached per (text, width). Compaction only when needed. Telemetry skips on PI_OFFLINE=1.
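
The WeakMap caching idea is easy to lift. Below is a minimal sketch, assuming a toy schema shape and a toy "compile" step. `SchemaSketch`, `getValidator`, and `compileCount` are illustrative names, not pi's actual validator code.

```typescript
// Toy schema and validator types, invented for this sketch.
type SchemaSketch = { required: string[] };
type ToolValidator = (value: Record<string, unknown>) => boolean;

// Counts how often the (pretend) expensive compile step runs.
let compileCount = 0;

// Keyed by the schema object itself: when a schema is garbage-collected,
// its cached validator goes with it, so the cache cannot leak.
const validatorCache = new WeakMap<SchemaSketch, ToolValidator>();

function getValidator(schema: SchemaSketch): ToolValidator {
  let validator = validatorCache.get(schema);
  if (!validator) {
    compileCount++;
    validator = (value) => schema.required.every((key) => key in value);
    validatorCache.set(schema, validator);
  }
  return validator;
}

const readSchema: SchemaSketch = { required: ["path"] };
const firstLookup = getValidator(readSchema); // compiles
const secondLookup = getValidator(readSchema); // served from the cache
```

Nothing is compiled until a tool is actually called, and nothing is compiled twice for the same schema object.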

CC4 Two phases of context shaping

transformContext → convertToLlm. AGENTS.md files collected, then injected. Skills loaded, then formatted. The separation of "gathering" and "formatting" recurs.

CC5 Partial state is always available

Aborted requests return partial content + partial usage. During streaming, a tool call's arguments are a best-effort parse of the partial JSON received so far. Compaction stores the full history alongside the summary. You can always go back.

CC6 Snapshot + delta

Every streaming event includes both. Lets consumers pick their abstraction level.
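
A small sketch of what "both" means in practice. The event name `text_delta` and the fields `delta` and `partial` are assumptions made for illustration; check packages/ai/src/types.ts for the real event shapes.

```typescript
// Hypothetical event shape: the new chunk plus the accumulated text so far.
interface TextDeltaSketch {
  type: "text_delta";
  delta: string;
  partial: string;
}

// Turn raw chunks into events that carry both the delta and the snapshot.
function emitText(chunks: string[]): TextDeltaSketch[] {
  const events: TextDeltaSketch[] = [];
  let partial = "";
  for (const delta of chunks) {
    partial += delta;
    events.push({ type: "text_delta", delta, partial });
  }
  return events;
}

const textEvents = emitText(["Hel", "lo", ", pi"]);

// Consumer A (a terminal writer, say) appends deltas as they arrive.
let typedOut = "";
for (const ev of textEvents) typedOut += ev.delta;

// Consumer B (a status line, say) ignores history and reads the latest snapshot.
const lastEvent = textEvents[textEvents.length - 1];
const latestSnapshot = lastEvent ? lastEvent.partial : "";
```

Consumer B can drop events or join late and still be correct, because every event is self-sufficient. Consumer A pays nothing for that convenience.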

CC7 Append-only as the persistence model

JSONL session files. Tree branching via parentId without rewriting. Compaction adds an entry and deletes nothing. Old versions of messages are still on disk; use /tree to revisit them.

7. Why Pi Gained Traction

3 min read·10 reasons

TL;DR: Ten reasons pi caught on — the marketing playbook for opinionated OSS. The blog post is the trojan horse; the contribution gate is the moat.

Partly speculation, but here's what I see in the artifacts:

It speaks directly to a frustration thousands of devs share. Anyone who has used Claude Code or Cursor seriously has felt the opacity. Mario named the feeling and offered a working alternative.
The author was credible. Mario is well-known in the game-dev / LLM-tools community (badlogicgames). The first 1000 stars come from existing trust.
The blog post is the trojan horse. Not a feature list — an argument. Frames every design choice as a thoughtful response to a specific failure mode. Read it.
It walks the talk. Posts the exact system prompt. Posts benchmark gists. Publishes own sessions on Hugging Face: badlogicgames/pi-mono.
Extensibility = ecosystem moat. Pi packages are just npm packages with keywords: ["pi-package"]. Browse the catalog. Barrier to authoring is comically low.
OSS session sharing as a content engine. The README explicitly asks users to publish sessions to HF via pi-share-hf. Each published session is a tutorial, training data, and social proof.
Lockstep versioning across 4 packages. No "which version of which package" hell.
Contribution gate keeps quality high. Auto-closing new-contributor issues protects maintainer attention.
Multiple modes from one binary. CLI + library + JSON stream + RPC = reaches multiple audiences.
Distribution simplicity. curl -fsSL https://pi.dev/install.sh | sh or npm i -g. Friction-to-first-prompt is maybe 60 seconds.

8. Lessons for Building Your Own

4 min read·20 lessons

TL;DR: Twenty principles that generalize beyond pi to any framework you'd build. If you only read one section, read this one.

Not pi — in pi's spirit. The principles that generalize:

Pick a specific frustration you've actually felt. Don't build a general-purpose framework. Build an answer to one well-articulated complaint. Pi answers "I can't see what my agent is doing." Your tool needs a one-sentence enemy.
Write the blog post first. If you can't write a convincing essay arguing for your design choices, you don't have a coherent product yet.
Embrace minimalism as a feature. Default to "no." Every feature has compounding maintenance cost.
Make extensibility the answer to feature requests. "Can pi do X?" → "Yes, via extensions." People are happier shipping a 50-line extension than waiting for you to merge a feature.
Separate layers so they're independently reusable. Pi-ai works without pi-coding-agent. If your packages aren't usable standalone, you have coupling you don't need.
Build the abstraction that lets you not abstract. Pi-ai unifies just enough for cross-provider handoff and unified streaming — but exposes provider-specific options for when you need them.
Streams everywhere. Even errors are events in a stream. Mental model is uniform.
AbortSignal is your friend. Plumb it through every async call. Return partial results on abort.
Errors as values. Inside the system, propagate failures as typed values. Consumers should never wrap your APIs in try/catch out of fear.
Persist everything as append-only. JSONL with parent pointers gives you branching, time-travel, replay, compaction for free.
Write the AGENTS.md for your own project. The discipline of telling an LLM how to work in your codebase forces clarity for humans too.
Use TypeBox (or Zod) for tool schemas. Get static types from a JSON Schema source-of-truth.
Sub-1000-token system prompts. Test the assumption that your model already knows the domain.
Default off, opt-in for capability. Telemetry, version checks, optional features — opt-in by env var or flag.
Don't ship sub-agents. Or make them transparent. Hidden sub-agents are anti-debuggable.
Lockstep version your own packages.
Auto-close low-quality issues. Not hostile — protective.
Publish your own usage. Sessions, blog posts, demos. Walking the talk is the cheapest, highest-converting marketing.
Read READMEs in full before assuming. Pi over-documents. The difference matters.
Decide your worldview, write it down, defend it. Pi has a philosophy section in every README. Even if you disagree, you know exactly where you stand.

the question to carry forward

If you take only one thing from this entire site, take this: the question isn't "what features do I add?" — it's "what feature would I refuse to add even if 1000 users requested it, and why?" The answer to that question is your product's soul. Pi has a soul. Most agent harnesses don't. Decide your worldview, then build the negative space around it.

9. Code Patterns to Steal

3 min read·10 snippets

TL;DR: Ten copyable snippets — each one a concrete pattern that captures one of pi's best ideas. Drop these into your own code.

Concrete enough to lift into your own codebase:

1. EventStream — both async-iterable AND awaitable

class EventStream<T, R = T> implements AsyncIterable<T> {
  constructor(
    isComplete: (event: T) => boolean,
    extractResult: (event: T) => R
  ) {}
  async result(): Promise<R> { /* drain to completion, return last R */ }
  async *[Symbol.asyncIterator](): AsyncIterator<T> { /* yield events */ }
}

packages/ai/src/utils/event-stream.ts
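
The skeleton above leaves the bodies out. One way to fill them in is sketched below. This is an assumption-laden version, not the code in event-stream.ts: it supports a single consumer, adds a `push` method that the skeleton does not show, and resolves `result()` as soon as the completing event is pushed.

```typescript
class EventStreamSketch<T, R = T> implements AsyncIterable<T> {
  private readonly isComplete: (event: T) => boolean;
  private readonly extractResult: (event: T) => R;
  private readonly queue: T[] = [];
  private wake: (() => void) | null = null;
  private ended = false;
  private resolveResult: (value: R) => void = () => {};
  private readonly finalResult: Promise<R>;

  constructor(isComplete: (event: T) => boolean, extractResult: (event: T) => R) {
    this.isComplete = isComplete;
    this.extractResult = extractResult;
    this.finalResult = new Promise<R>((resolve) => {
      this.resolveResult = resolve;
    });
  }

  // Producer side (assumed API): enqueue an event and wake a waiting consumer.
  push(event: T): void {
    if (this.ended) return;
    this.queue.push(event);
    if (this.isComplete(event)) {
      this.ended = true;
      this.resolveResult(this.extractResult(event));
    }
    const wake = this.wake;
    this.wake = null;
    if (wake) wake();
  }

  // Awaitable view: resolves once the completing event has been pushed.
  result(): Promise<R> {
    return this.finalResult;
  }

  // Iterable view: yields queued events, sleeping while the queue is empty.
  async *[Symbol.asyncIterator](): AsyncIterator<T> {
    while (true) {
      while (this.queue.length > 0) yield this.queue.shift() as T;
      if (this.ended) return;
      await new Promise<void>((resolve) => {
        this.wake = resolve;
      });
    }
  }
}

type DemoEvent = { type: "text"; text: string } | { type: "done"; full: string };

const demoStream = new EventStreamSketch<DemoEvent, string>(
  (e) => e.type === "done",
  (e) => (e.type === "done" ? e.full : ""),
);
demoStream.push({ type: "text", text: "Hel" });
demoStream.push({ type: "text", text: "lo" });
demoStream.push({ type: "done", full: "Hello" });
```

A UI can `for await` over `demoStream` to render events as they land, while a script that only wants the answer can `await demoStream.result()` and skip the events entirely.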

2. Lazy provider loading with errors-as-events

function createLazyStream(loadModule) {
  return (model, context, options) => {
    const outer = new AssistantMessageEventStream();
    loadModule()
      .then(m => forwardStream(outer, m.stream(model, context, options)))
      .catch(error => {
        outer.push({ type: "error", reason: "error", error: msg(error) });
        outer.end(msg(error));
      });
    return outer; // return immediately
  };
}

3. Copy-on-write state exposure

get tools() { return tools; },
set tools(next) { tools = next.slice(); },
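
The two accessor lines are easier to appreciate in context. Here is a minimal sketch around them, assuming a closure-held array. `ToolRef` and `createAgentState` are illustrative names, and the extra copy on construction is an addition of this sketch.

```typescript
interface ToolRef {
  name: string;
}

function createAgentState(initial: ToolRef[]) {
  // Copy on the way in, so the caller's array is never shared.
  let tools: ToolRef[] = initial.slice();
  return {
    get tools(): readonly ToolRef[] {
      return tools;
    },
    // Every assignment installs a fresh array; arrays handed out earlier
    // are never mutated, so readers holding one see a stable snapshot.
    set tools(next: readonly ToolRef[]) {
      tools = next.slice();
    },
  };
}

const callerList: ToolRef[] = [{ name: "read" }];
const agentState = createAgentState(callerList);

// Mutating the caller's array does not leak into the state.
callerList.push({ name: "bash" });

// A reader grabs the current list, then a writer replaces it.
const snapshotBefore = agentState.tools;
agentState.tools = [...agentState.tools, { name: "edit" }];
const namesAfter = agentState.tools.map((t) => t.name).join(",");
```

A loop iterating `snapshotBefore` mid-turn keeps seeing one tool even after the assignment, which avoids "the tool list changed under me" bugs without any locking.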

4. Streaming JSON 3-tier fallback

function parseStreamingJson(s) {
  try { return parseJsonWithRepair(s); } catch {}
  try { return partialParse(s) ?? {}; } catch {}
  try { return partialParse(repairJson(s)) ?? {}; } catch {}
  return {};
}

packages/ai/src/utils/json-parse.ts

5. Tool errors → tool results

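try {
  return { result: await tool.execute(...), isError: false };
} catch (e) {
  // A thrown value is not guaranteed to be an Error, so normalize it first.
  const text = e instanceof Error ? e.message : String(e);
  return {
    result: { content: [{ type: "text", text }] },
    isError: true,
  };
}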

6. APC zero-width cursor marker for IME

const CURSOR_MARKER = "\x1b_pi:c\x07";
// Component emits at the cursor position;
// TUI strips it and positions the hardware cursor.

7. Synchronized output wrapping

const SYNC_BEGIN = "\x1b[?2026h";
const SYNC_END   = "\x1b[?2026l";
write(SYNC_BEGIN + allUpdates + SYNC_END);

8. Two-queue model: interrupt + continuation

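// The agent keeps two instances: one for steering (interrupt)
// and one for follow-up (continuation).
class PendingMessageQueue {
  // mode "all" drains every queued message; any other mode drains one per call.
  constructor(mode) {
    this.mode = mode;
    this.messages = [];
  }
  drain() {
    if (this.mode === "all") {
      const all = this.messages.slice();
      this.messages = [];
      return all;
    }
    const first = this.messages[0];
    if (!first) return [];
    this.messages = this.messages.slice(1);
    return [first];
  }
}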

9. JSONL with parent pointers = branching for free

{"type":"message","id":"a1","parentId":null,...}
{"type":"message","id":"a2","parentId":"a1",...}
{"type":"message","id":"a3","parentId":"a1",...}   // branch from a1

// Rebuild "current conversation" = walk parentId from leaf to root.

10. AGENTS.md walk-up discovery

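import { existsSync, readFileSync } from "node:fs";
import { homedir } from "node:os";
import { dirname, join } from "node:path";

// Assumed helper (not shown in the original): read whichever of the given paths exist.
const readIfExists = (paths) =>
  paths.filter((p) => existsSync(p)).map((p) => ({ path: p, content: readFileSync(p, "utf8") }));

function discoverContextFiles(cwd) {
  // Global file first (lowest priority). Node does not expand "~", so resolve it.
  const files = readIfExists([join(homedir(), ".pi/agent/AGENTS.md")]);
  const ancestors = [];
  // Stop once dirname() no longer changes, i.e. at the filesystem root on any platform.
  for (let dir = cwd; dirname(dir) !== dir; dir = dirname(dir)) ancestors.push(dir);
  // reverse so cwd is last (highest priority)
  for (const a of ancestors.reverse()) {
    files.push(...readIfExists([join(a, "AGENTS.md"), join(a, "CLAUDE.md")]));
  }
  return files;
}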

10. Closing Thought

1 min read

TL;DR: The one question to ask when designing your own product: "what feature would I refuse to ship even if 1000 users requested it?"

The single most important takeaway: pi is what happens when someone with strong taste refuses to compromise on the things they care about, and ruthlessly cuts everything else.

The codebase is technically impressive — the differential renderer, the cross-provider transformation, the lazy registry — but the real skill is in what's missing. There's no plan mode. There's no built-in to-do list. There's no MCP. There's no sub-agent tool. There's no permission popup. Each of those absences was a decision, defended in writing, with a reason you can argue with.

When you build your own thing, the question isn't "what features do I add?" It's "what feature would I refuse to add even if 1000 users requested it, and why?" The answer to that question is your product's soul.

Pi has a soul. Most agent harnesses don't.

What now?

Use the chat panel to ask follow-ups. The chat has the full synthesis as context and knows the file layout of the repo. Try:

"Walk me through what happens between the user pressing Enter and the assistant's first token appearing."
"Compare pi's session model to Claude Code's. Which would you steal for a new project?"
"Sketch a 4-tool agent I can build in 200 lines based on these principles."
"Why TypeBox over Zod for tool schemas?"
