A local operator-system CLI for AI agent memory, handoffs, and guardrails across 18 writer harnesses. Not a runner, not a daemon, not a hosted service. This is how every piece works, from the decide() guardrail to the cross-model run engine.
Brigade’s own one-liner: "Your agents run loops. Brigade keeps the receipts." It is AI agent memory, handoffs, and local guardrails for Codex, Claude Code, OpenCode, and over a dozen other harnesses. It runs on the machine you control: local by default, loud about the exceptions. Holding both halves of this panel in your head is the whole mental model.
Brigade was hand-rolled one incident at a time around an always-on OpenClaw agent plus daily Codex and Claude Code sessions. Two failures shaped the design more than anything planned. First, a nightly "dreaming" job promoted raw session fragments straight into memory and bloated MEMORY.md past the bootstrap budget; every session then started with truncated memory and nobody noticed for weeks. Blind auto-promotion died that day. Second, 195 handoff notes sat unread across 35 repos because the ingester had a hardcoded three-repo allowlist and nothing warned about the gap. Silence is the failure mode. Every part of Brigade that lints, warns, or writes a receipt exists because something once failed in silence.
The memory loop is the core; everything else orbits it. Writer harnesses leave handoff notes as they work. Brigade lints, guards, and classifies each one, then files the safe, targeted notes into durable memory on its own. A memory owner (OpenClaw, Hermes, or just you) only steps in for the ambiguous few.
The same five steps as a diagram: writer harnesses hand off, Brigade lints and classifies, safe notes reach durable memory through the owner, the risky few branch to review, and memory feeds the next session.
Brigade borrows the vocabulary of a professional kitchen brigade (a brigade de cuisine). The terms are load-bearing, not decoration: internalize them and the rest of the system reads cleanly. Definitions are verbatim from docs/technical-guide.md.
The 41KB incident produced the central design decision: memory has two layers. Knowledge cards under memory/cards/ hold the detail; MEMORY.md stays a slim, one-line-per-card index that loads every session and is guarded against the 12KB budget. Brigade never edits canonical memory itself; the owner does the writing.
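The two-layer split can be illustrated with a minimal sketch. It assumes "12KB" means 12 * 1024 bytes of UTF-8 and that each index line is a link plus a short summary; the function names and the line format are illustrative, not Brigade's actual code.

```python
INDEX_BUDGET_BYTES = 12 * 1024  # assumption: "12KB" read as 12 * 1024 bytes


def index_line(card_name: str, summary: str) -> str:
    """One line per card: the index points at the detail instead of holding it."""
    return f"- [{card_name}](memory/cards/{card_name}): {summary.strip()}"


def check_index_budget(index_text: str, budget: int = INDEX_BUDGET_BYTES) -> tuple[bool, int]:
    """Return (within_budget, size_in_bytes) for the index text."""
    size = len(index_text.encode("utf-8"))
    return size <= budget, size


# Hypothetical card name and summary, for illustration only.
lines = [index_line("deploy-checklist.md", "steps before a release")]
ok, size = check_index_budget("\n".join(lines) + "\n")
```

Measuring encoded bytes rather than characters matters here because the budget is a size limit, and non-ASCII summaries take more than one byte per character.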
Brigade is partitioned into nine builtin stations (the _BUILTIN tuple in registry.py). Each is a frozen Station dataclass with a name, a summary, kitchen aliases, an optional doctor() health check, and the managed tools attached to it. The station is the unit of the system.
This is the single most important piece of Brigade. decide() (ingest.py) is the pure function that turns a parsed handoff into one of four outcomes: promoted, routed, inboxed, or skipped. Every branch that cannot prove a note is safe routes it to the review inbox. Conservative by default — that pause is the point.
The same guardrail in code. Notice that every failure path constructs an Outcome("inboxed", ...) instead of writing — the safe default is always review.
src/brigade/ingest.py:209
The card-promotion branch of decide(). The document-routing branch below it follows the same shape.

```python
def decide(sections, target, promote_cards, route_documents) -> Outcome:
    action = sections.get("recommended memory action", "").strip().lower()
    stray = [s for s in sections if s not in KNOWN_SECTIONS]
    if stray:
        return Outcome("inboxed",
                       reason=f"unknown sections present (parser may have split content): {stray}")
    if action in ("create-card", "update-card") and promote_cards:
        card = sections.get("target card", "").strip()
        content = sections.get("suggested card content", "")
        if not SAFE_CARD_NAME_RE.match(card):
            return Outcome("inboxed", reason=f"target card name unsafe: {card!r}")
        if not content.lstrip().startswith("---"):
            return Outcome("inboxed", reason="card content missing YAML frontmatter")
        if scan_untrusted(content).flagged:
            return Outcome("inboxed", reason="injection signal in card content ...")
        return Outcome("promoted", dest=target / "memory" / "cards" / card)
```

- `stray = [s for s in sections if s not in KNOWN_SECTIONS]`: An unrecognized ## heading means the parser likely split the note wrong. Refuse to guess; inbox it.
- `and promote_cards`: Auto-promotion is opt-in. Off by default, you choose to enable it per ingest.
- `SAFE_CARD_NAME_RE.match(card)`: A strict filename allowlist (`^[A-Za-z0-9._-]+\.md$`) blocks path traversal into other dirs.
- `scan_untrusted(content).flagged`: Prompt-injection scan runs before anything is allowed to become durable memory.

brigade run "<task>" is a bounded cross-model orchestration. The aboyeur (the kitchen expediter) plans the task into staged JSON assignments; workers run in parallel within a stage through their own CLIs; later stages receive earlier results; the orchestrator synthesizes. It is intentionally bounded: two orchestrator calls plus the planned worker calls. No infinite loop.
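To make "staged JSON assignments" and the call bound concrete, here is a small sketch. The JSON shape is an assumption (the field names stage, worker, and task mirror attributes used in the aboyeur excerpt, but the actual wire format is not shown in this guide), and both helper functions are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical plan, as the orchestrator's planning call might return it.
plan_json = """
[
  {"stage": 1, "worker": "codex",  "task": "survey the module"},
  {"stage": 1, "worker": "claude", "task": "list the risks"},
  {"stage": 2, "worker": "codex",  "task": "draft the change using stage 1 findings"}
]
"""


def group_by_stage(assignments: list[dict]) -> dict[int, list[dict]]:
    """Bucket assignments by stage number, with stages in ascending order."""
    stages: dict[int, list[dict]] = defaultdict(list)
    for a in assignments:
        stages[int(a["stage"])].append(a)
    return dict(sorted(stages.items()))


def total_model_calls(assignments: list[dict]) -> int:
    """Two orchestrator calls (plan, synthesize) plus one call per planned assignment."""
    return 2 + len(assignments)


assignments = json.loads(plan_json)
stages = group_by_stage(assignments)
```

Because the worker count is fixed once the plan is parsed, the total number of model calls is known before any worker starts, which is what makes the run bounded.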
Diagram: orchestrator, then workers (cli=codex, cli=claude, cli=...), then orchestrator.

How "serial across stages, parallel within a stage, later stages see earlier results" is actually implemented.

src/brigade/aboyeur.py:456

```python
all_results: list[WorkerResult] = []
stages = sorted({a.stage for a in assignments})
for stage in stages:
    stage_assignments = [a for a in assignments if a.stage == stage]
    prior_results = list(all_results)
    max_workers = min(roster.max_workers, len(stage_assignments))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_index = {
            executor.submit(run_one, a, prior_results): i
            for i, a in enumerate(stage_assignments)
        }
        for future in as_completed(future_to_index):
            i = future_to_index[future]
            try:
                stage_results_by_index[i] = future.result()
            except Exception as exc:
                a = stage_assignments[i]
                stage_results_by_index[i] = WorkerResult(
                    worker=a.worker, task=a.task, text="", ok=False, detail=str(exc)[:200])
    all_results.extend(stage_results_by_index[i] for i in range(len(stage_assignments)))
```

- `for stage in stages:` Stages are serial; ordering is preserved so dependent work runs after its inputs.
- `prior_results = list(all_results)`: Each stage receives a snapshot of every earlier worker’s output as context.
- `min(roster.max_workers, len(stage_assignments))`: The pool never over-provisions: cap is the smaller of the roster limit and the stage size.
- `except Exception ... ok=False`: A worker that crashes becomes a failed result, not an exception that kills the run.

Brigade reaches models with no SDKs and no API keys: it builds subprocess argv for the user’s own authenticated CLIs. Because not every CLI can truly enforce read-only, Brigade is honest about it. hard = a native sandbox the model cannot escape; soft = read-only is only a prompt instruction; none = read_only is not applied at all. A writable --sandbox override downgrades even a hard CLI to prompt-only, with a loud advisory.
| CLI | hard | soft | none |
|---|---|---|---|
| codex | | | |
| antigravity | | | |
| pi / cursor / aider | | | |
| continue / qwen / kimi | | | |
| goose / copilot / adal | | | |
| openhands / grok | | | |
| amp / crush | | | |
| claude (argv ignores read_only entirely) | | | |
| opencode | | | |
8 hard · 7 soft · 2 none. From READ_ONLY_ENFORCEMENT in agents.py. ollama:<model> refs always resolve to "none".
The 17 run-CLI adapters split across three enforcement strengths. Knowing which bucket your CLI falls in tells you how much to trust a read-only run.
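The lookup rules stated above (a per-CLI strength, ollama:<model> always "none", and a writable override downgrading "hard" to prompt-only) can be sketched as follows. Only the codex entry is taken from this guide; the other table keys are made-up placeholders, and treating any sandbox value other than "read-only" as writable is an assumption.

```python
from typing import Optional

# Illustrative table: only "codex" -> "hard" comes from the text.
# The other two keys are placeholders, not real adapter names.
READ_ONLY_ENFORCEMENT = {
    "codex": "hard",
    "example-soft-cli": "soft",
    "example-none-cli": "none",
}


def effective_enforcement(cli: str, sandbox: Optional[str] = None) -> str:
    """Return how strongly read-only is enforced for this run."""
    if cli.startswith("ollama:"):
        return "none"  # ollama:<model> refs always resolve to "none"
    # Assumption: a CLI missing from the table gets the weakest rating.
    level = READ_ONLY_ENFORCEMENT.get(cli, "none")
    # Assumption: any sandbox value other than "read-only" counts as writable.
    if level == "hard" and sandbox and sandbox != "read-only":
        return "soft"  # read-only is now only a prompt instruction
    return level
```

Defaulting unknown CLIs to "none" keeps the sketch on the cautious side: a missing entry never makes a run look safer than it is.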
The codex adapter is the cleanest example of the "build argv, no shell" pattern. Every external command is an argv list run through proc.run, which captures output and normalizes failure to exit codes.
src/brigade/agents.py:22 · src/brigade/proc.py:30

```python
def _codex_argv(prompt, read_only, sandbox):
    if sandbox:
        return ["codex", "exec", "--sandbox", sandbox, prompt]      # writable override
    if read_only:
        return ["codex", "exec", "--sandbox", "read-only", prompt]  # hard sandbox
    return ["codex", "exec", prompt]

# proc.run: no shell, bounded, failures become exit codes
cp = subprocess.run(args, capture_output=True, text=True,
                    timeout=timeout, check=False, stdin=subprocess.DEVNULL)
# FileNotFoundError -> Result(code=127, ...)   # missing tool, not a raise
# TimeoutExpired    -> Result(code=124, ...)   # hung tool, bounded
```

- `["codex", "exec", "--sandbox", "read-only", prompt]`: codex gets a real OS sandbox flag, which is why its enforcement is "hard".
- `sandbox: ... return [..., "--sandbox", sandbox, ...]`: A writable --sandbox override is honored, downgrading enforcement to prompt-only with a loud advisory.
- `subprocess.run(args, ... ) # no shell=True`: argv list, never a shell string: there is no shell-injection surface.
- `stdin=subprocess.DEVNULL`: Tools can never block waiting on input; a hung tool is killed by timeout (code 124).

brigade init does not run a script of imperative steps. It composes manifests (a depth manifest + one per selected harness + includes), dedupes them by destination so a harness can override a baseline file, renders the templates, and atomically rewrites a marker-delimited block in .gitignore.
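The "atomically rewrites a marker-delimited block" step can be sketched like this. The marker strings and both function names are hypothetical (the guide only establishes that begin/end markers exist), and the write-to-temp-then-rename approach is one common way to get an atomic replace, not necessarily the one Brigade uses.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical marker text; the real markers are defined in install.py.
BEGIN = "# >>> brigade managed (hypothetical marker) >>>"
END = "# <<< brigade managed (hypothetical marker) <<<"


def replace_managed_block(text: str, block_lines: list[str]) -> str:
    """Swap the content between the markers, or append a new block if none exists."""
    block = "\n".join([BEGIN, *block_lines, END])
    start = text.find(BEGIN)
    end = text.find(END)
    if start != -1 and end > start:
        return text[:start] + block + text[end + len(END):]
    sep = "" if not text or text.endswith("\n") else "\n"
    return text + sep + block + "\n"


def atomic_write(path: Path, content: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(content)
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Running `replace_managed_block` twice with the same lines yields the same text, so re-running an installer built this way leaves hand-written entries outside the markers untouched.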
src/brigade/install.py:257

```python
# Dedupe files by dst (last-wins): a harness manifest can override a baseline file.
seen: dict[str, dict] = {}
for entry in files:
    seen[entry["dst"]] = entry
deduped_files = list(seen.values())

# Per selected harness, write a managed, marker-delimited gitignore block:
for h in selection.harnesses:
    inbox = WRITER_INBOXES.get(h)
    if inbox:
        lines += [f"{inbox}/*",             # ignore session-local handoffs
                  f"!{inbox}/TEMPLATE.md",  # but keep the template
                  f"!{inbox}/.gitkeep"]
```

- `seen[entry["dst"]] = entry`: Last-wins by destination path: later (harness-specific) manifests override earlier (baseline) files.
- `GITIGNORE_BEGIN ... markers`: Re-running brigade init only rewrites the content between the markers, never your edits outside them.
- `f"{inbox}/*" + f"!{inbox}/TEMPLATE.md"`: Handoffs are private session context, so they are gitignored, but the shared template is kept.

Brigade exposes its skills and memory cards to agents over a zero-dependency, read-only MCP server: a single line-oriented JSON-RPC loop on stdin/stdout that only ever initializes, lists resources, and calls read tools. Managed tools follow the same humility: an absent or unconfigured fleet tool is advisory (WARN/MANUAL), never a hard failure of the workspace doctor.
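As a rough illustration of what one line in and one line out looks like on such a loop, the sketch below builds JSON-RPC 2.0 envelopes by hand. The `response` and `handle_line` helpers are hypothetical stand-ins, the empty resource list is placeholder data, and the error codes are the standard ones from the JSON-RPC 2.0 specification rather than anything confirmed about Brigade's server.

```python
import json
from typing import Any, Optional


def response(request_id: Any, result: Optional[dict] = None,
             error: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 response envelope with either a result or an error."""
    envelope: dict = {"jsonrpc": "2.0", "id": request_id}
    if error is not None:
        envelope["error"] = error
    else:
        envelope["result"] = result if result is not None else {}
    return envelope


def handle_line(line: str) -> str:
    """Handle one newline-delimited request and return one single-line reply."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return json.dumps(response(None, error={"code": -32700, "message": "Parse error"}))
    if not isinstance(request, dict):
        return json.dumps(response(None, error={"code": -32600, "message": "Invalid Request"}))
    method = str(request.get("method") or "")
    if method == "resources/list":
        # Placeholder payload; a real server would list its skills and cards here.
        return json.dumps(response(request.get("id"), result={"resources": []}))
    # Anything not explicitly handled, including any would-be write method, is rejected.
    return json.dumps(response(request.get("id"),
                               error={"code": -32601, "message": "Method not found"}))
```

Rejecting every unlisted method is one way a server stays read-only by construction: there is simply no branch that writes.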
src/brigade/mcp_server.py:50

```python
for line in sys.stdin:
    request = json.loads(line)  # newline-delimited JSON-RPC 2.0
    method = str(request.get("method") or "")
    if method == "initialize":
        _emit(response(id, result={"protocolVersion": "2024-11-05",
                                   "capabilities": {"resources": {}, "tools": {}}}))
    elif method == "resources/list":
        _emit(response(id, result={"resources": list_resources()}))
    elif method == "tools/call":
        payload, failed = call_tool(name, arguments)  # station-supplied callback
        _emit(response(id, result={"content": [{"type": "text", "text": text}],
                                   "isError": failed}))
```

- `for line in sys.stdin:` No framework: a plain newline-delimited JSON-RPC loop, which is why it has zero dependencies.
- `initialize / resources/list / tools/call`: The only methods it answers. There is no mutating method; the server is read-only by construction.
- `call_tool(name, arguments)`: Each station supplies a read callback; the loop owns the JSON-RPC envelope.

Everything above is reachable through one CLI. COMMAND_GROUPS (cli/_common.py) organizes all 38 top-level commands into five families; a test enforces that every command appears in exactly one group, so nothing can be silently un-grouped. This is the map of where each capability lives.
init, handoff, handoff-template, ingest, memory, doctor, status, operator, daily, work, friction, center, runbook, budgets, notifications, add, skills, tools, pantry, roster, run, runs, dogfood, security, scrub, untrusted, research, learn, chat, context, projects, release, roadmap, repos, reconfigure, completions, openclaw-fragments, hermes-fragments

Brigade is the brigade de cuisine itself (the executive chef and the line), but it does not work alone. The stations you saw attach to managed tools (content-guard, agentpantry, code-search, miseledger, and more), each owning one job and handing off cleanly to the next.
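The exactly-one-group invariant on COMMAND_GROUPS can be checked with a few lines. This is a sketch under assumptions: the family names and memberships below are invented for illustration (only the command names themselves appear in this guide), and `grouping_problems` is a hypothetical helper, not the project's actual test.

```python
from collections import Counter

# Hypothetical families and memberships; the real mapping lives in cli/_common.py.
COMMAND_GROUPS = {
    "setup": ("init", "add", "reconfigure"),
    "memory": ("handoff", "ingest", "memory"),
    "run": ("run", "runs", "roster"),
}
ALL_COMMANDS = {"init", "add", "reconfigure", "handoff", "ingest",
                "memory", "run", "runs", "roster"}


def grouping_problems(groups: dict[str, tuple[str, ...]], commands: set[str]) -> list[str]:
    """List every command that is missing from, or duplicated across, the groups."""
    counts = Counter(cmd for members in groups.values() for cmd in members)
    problems = []
    for cmd in sorted(commands):
        if counts[cmd] == 0:
            problems.append(f"{cmd}: not in any group")
        elif counts[cmd] > 1:
            problems.append(f"{cmd}: appears in {counts[cmd]} groups")
    for cmd in sorted(set(counts) - commands):
        problems.append(f"{cmd}: grouped but not a registered command")
    return problems
```

In a test suite, asserting that the returned list is empty turns a forgotten grouping into a loud failure instead of a command that quietly disappears from the help output.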