This is the field guide for fanning work across your accounts with clikae —
whether you are at the keyboard or an LLM agent is driving clikae for you
(e.g. Claude Code running clikae burn / clikae conduct in the background).
It's task-oriented; for the full command reference see usage.md, for
the language see grammar.md.
If you're an agent reading this: this page is the contract. Follow the rules in §3 and you won't fire a blank.
1. The mental model — brain + muscle
clikae is the muscle: it knows where each engine keeps its config and transcripts, how each one signals a usage limit, which tank still has fuel, and how to route a job onto a specific account. It does not judge the work.
The brain is the conductor — you, or a session model acting as one. The brain writes the prompt, decides which tank/engine/effort, and picks the winner. clikae carries the state and burns the right subscription; the brain decides what's good.
Keep them separate and the whole thing stays auditable: clikae reshapes where state lives, it never sits in the middle of a request and never grades output.
2. Three ways to dispatch
| Want | Use | Shape |
|---|---|---|
| One task, headless, survive a dry tank | clikae burn | one tank → artifact-verified → auto-reroute to the next reserve tank on dry |
| The SAME prompt across N accounts, pick a winner | clikae conduct (BETA) | fan read-only in parallel → collect N outputs → you judge |
| One conductor-decided shot with an effort knob + full-fidelity capture | conductor skill legs (claude-leg.sh / codex-leg.sh) | single shot on a named tank; back-and-forth is the conductor's call |
burnowns the reserve walk (dry → next tank, account-aware, skips the tank your interactive session is on). Use it for an unattended task that must finish somewhere.conductis best-of-N / breadth: audits, analyses, design proposals across accounts. It never reroutes and never judges — it hands you the files.- legs (the
conductorClaude Code skill) add the--effortknob the Agent tool doesn't expose, and--tank <name>routes throughclikae envso a leg runs on a real named account (and, for a--writeleg, inherits that tank's git identity — see §6). A leg is a single shot; if it must survive a dry tank, useburninstead.
agy (Antigravity) is the exception — read this before dispatching to it
agy is half-in: conduct can fan a read-only leg to it (cheap, fast breadth —
--leg agy/<tank>), but burn and the conductor legs can't. agy's login is one
global Keychain entry, so there's no per-shell env to switch and burn can't
reroute it ("not burnable" ≠ "not usable"). The conduct leg sidesteps that by
running on whatever agy tank is active (it never switches — a leg naming another
tank is reported, not run, since the ~/.gemini swap is machine-wide and exclusive,
so two agy tanks can't run in parallel):
# agy as a best-of-N breadth leg, alongside claude/codex (agy runs on its active tank):
clikae conduct --prompt-file review.md --leg claude/C --leg codex/H --leg agy/c --add-dir "$PWD"
# Or drive agy directly headless to spend its quota instead of your main budget:
clikae agy <tank> -- --print-timeout 900s -p "$(cat /tmp/prompt.txt)"
agy's headless personality differs from claude/codex enough that hand-rolling it the
usual way fires a blank (it buffers big stdout and returns nothing; it wanders and
burns the timeout; -i dies without a TTY). Use the dedicated recipe —
docs/agy-dispatch.md — before sending agy a headless job. It's
a real paid engine; the recipe is how you stop wasting it.
3. The rules that keep it honest (hard-won — break one and you fire a blank)
-
Judge by the artifact / output, never the exit code. A headless
codex execorclaude -pexits0even when it hit its usage limit and wrote nothing.burnjudges by the artifact's presence + fresh mtime;conductby the captured output; the legs by--outcontent. Never trust$?. -
Give the task the easy way — don't hand-roll the engine flags. Use
--prompt-file <f>(or--prompt) +--add-dir <dir>and clikae fills in each engine's headless-write dialect for you (claude's-p … --dangerously-skip-permissions --add-dir,codex'sexec -C … -s workspace-write). Hand-writing-- -p '…'is the #1 way to ship a job that can't write (see §4). -
The artifact must be the file the task really produces.
burnsucceeds when--artifactappears (or its mtime changes). A codegen task's artifact is the code file it writes, not a/tmp/report.mdyou hoped it would also produce. Point--artifactat the real output orburnwill call a real success a failure. -
Point at the right repo. A write task needs
--add-dir <repo>(clikae turns that into the engine's write-permission + working-dir flags). Running from the wrong directory with no--add-dirmeans the engine can't even see your files. -
Multi-line prompts go through
--prompt-file. The prompt is passed as data (NUL-framed internally), so a multi-line prompt survives intact. Don't try to cram a paragraph into a shell-quoted-p '…'. -
Pre-stage inputs to
/tmpand make tasks idempotent. A burned tank should never be handed slow iCloud-backed I/O, and an artifact-checked, idempotent task can be re-fired on another tank for free when one runs dry.
4. Anti-pattern — a real misconfigured burn (and its fix)
Seen in the wild:
cd /Users/me
clikae burn claude L --artifact /tmp/reef-core-seam1-report.md --timeout 1200 --fresh \
-- -p 'Execute the first seam of the refactor — write real code and commit …'
Three red flags, all from the raw -- -p … form:
- Can't write.
-- -p '…'has no--dangerously-skip-permissions, so headlessclauderuns read-only — a "write real code" task can't touch a file. - Wrong place. cwd is the home dir and there's no
--add-dir <repo>, so the engine can't reach the repo it's meant to edit. - Artifact mismatch. The task writes code, but
--artifactis a/tmpreport that the task never produces →burnreports failure even if code were written.
The fix — the convenience surface does all three for you:
clikae burn claude L \
--artifact ~/dev/reef/src/editor_backend/__init__.py `# the file the task really creates` \
--prompt-file /tmp/reef-seam1.md \
--add-dir ~/dev/reef \
--timeout 1200 --fresh
5. Seeing your fleet
From a terminal: clikae (the board — traffic-light fuel dots per tank) and
clikae tanks (accounts). That's the authoritative view of who's fuelled and who's
dry.
From inside a Claude Code session that's driving clikae: the input footer shows
· N shells · — the count of background shells this session is running. Press ↓
to manage them (Enter to view output, x to stop). That count is shell-granular,
not tank-aware: it tells you how many jobs, and the manager shows what each one
is.
Make the manager self-labeling. The manager previews the start of each command, truncated. If you lead the background command with the tank + role, every job identifies itself at a glance:
# Lead with a [tank·role] token → the ↓ manager shows "[L·deadlinks]" not a generic prefix
tag='[L·deadlinks]'
clikae burn claude L --artifact … --prompt-file … --add-dir …
Without it, several jobs that share a leading cd … / VAR=… prefix all preview
identically and you can't tell them apart. (A native aggregated roster in clikae is
on the backlog; until then, the token-first convention rides the harness's own view
for free.)
6. Cross-account, dry, and identity
- Each tank burns its own subscription. Fanning across tanks spends each account's quota, not the budget of your main interactive session. That's the whole point — the expensive supervisor stays asleep; cheap workers burn whichever account still has gas.
- Dry handling.
burnauto-reroutes to the next reserve tank on a dry hit (account-aware: it skips siblings that share an already-dried login, and the tank an interactive session is live on).conductdoesn't reroute — it reports each leg as captured / dry / empty so you decide. - Same-account fan-out shares one bucket. Three legs on the same tank run in parallel but draw from one quota — wall-clock parallelism, not 3× throughput, and they go dry together. For independent quota, fan across different accounts.
- Git identity for write jobs. Before dispatching a
--writeleg that commits, set the tank's identity so commits aren't stamped with the engine's account email:clikae git-id claude L --name "You" --email you@example.com.clikae env(which--tankrides) then exportsGIT_AUTHOR_*/GIT_COMMITTER_*for that shell.
7. The boundary — what still needs a human
clikae proves the plumbing: a job ran, on which account, produced which file. It cannot judge:
- Output quality — whether an audit is correct, whether generated code is good. A neutral grader (another model, or you) decides.
- Runtime behaviour — a UI that renders, a server that answers.
burnonly fits tasks whose success is a file you can name. - An API error that looks like output. A transient
API Error: …string written to stdout is non-empty, so a naive "has output" check can read it as success — glance at short results before trusting them.
That irreducible human (or independent-model) judgement is a feature, not a gap: clikae stays a switcher, the conductor stays the brain.
8. Model-tiering by task risk
Dogfooding a real multi-model fleet (a full app build across claude + codex + agy) produced a working rule for which model tier to put where:
| Role | Model tier | Why |
|---|---|---|
| Orchestrator / verifier | High-capability (e.g. claude Max) | Plans, judges output, makes cross-task decisions — mistakes here cascade |
| Implementer | Mid-tier (e.g. claude Sonnet / codex) | Net-new trust-critical work: new integration code, security-adjacent paths |
| Mechanical grunt | Cheap (e.g. agy via stdin, a sub-Sonnet model) | Reformatting, summarising, boilerplate — already well-specified, easily verified |
Red line: don't drop below the mid tier for net-new, trust-critical integration work. "Cheap" makes sense for tasks where the output is fully verifiable by inspection or by a test; it is risky for tasks where the verifier itself would need to be as capable as the implementer to catch a subtle bug.
Parallelism ≠ redundancy. Fanning the same task across accounts (same tier)
gives you speed + a dry-tank fallback, not a correctness vote. For a correctness
vote, use clikae conduct and a different model tier per leg — then a neutral
third model grades the outputs, not the same one that produced them.
9. Independent verification — the neutral-grader principle
The orchestrator must independently verify a sub-agent's claims rather than accept its self-report. This matters because a model that produced output is a poor judge of whether that output is correct: it tends to rate its own work confidently even when it has made a subtle error (the "confident-wrong" failure mode).
Practical checks in a clikae fleet:
- Grep / stat the artifact directly before trusting a "done" self-report. If the file doesn't exist or is empty, the job failed regardless of what the agent said.
- Run the test suite or a targeted invariant check from the orchestrator after a write leg — not from the same leg that wrote the code.
- Use
clikae conduct(N legs, same task) and route the outputs through a separate model acting as grader — a model that only sees the outputs, not the reasoning that produced them. A grader reading N blind outputs spots errors the producer's self-assessment misses. - Do not infer correctness from tone. A confident, well-structured completion message ("I've implemented X, added tests, and updated the docs") is not evidence the implementation is correct. Grep for the invariants; run the binary.
The orchestrator's job is to hold the epistemic standard the workers cannot hold for themselves.
8. Quick recipes
# Best-of-N audit across accounts — read-only, parallel, you pick the winner
clikae conduct --prompt-file review.md \
--leg codex/H --leg claude/C --leg claude/L --add-dir "$PWD"
# Headless codegen with automatic failover when a tank runs dry
clikae burn claude C --artifact out/feature.ts \
--prompt-file task.md --add-dir "$PWD" --timeout 900
# Carry a live session onward when you hit a wall (same engine resumes; another = brief)
clikae to L # next fuelled tank, same conversation
clikae to codex # cross-vendor: a written brief, summarised on-device
See also: grammar.md (the language), usage.md (full
reference), EXPECTATIONS.md ("is this a bug?" — deliberate
surprises), and the conductor Claude Code skill for session-driven leg routing.