Task Tracking in Plain Text
For months, the work on my Rails-in-TypeScript port lived in markdown plan docs
inside the code repo — a docs/ folder of checklists where every - [ ] was a
future PR. The cost crept up on me. Because the docs sat in the code repo, every
edit inherited that repo's weight: a full CI suite and a review ran on each
change, so marking a task done was itself a pull request — open it, wait for
CI, get a review, merge. Nobody does that one box at a time, so cleanup got
batched, and batched cleanup is cleanup that doesn't happen. Agents would
sometimes burn a whole PR just to tick a box.
I've written before about loving RFCs: durable, asynchronous reasoning that
outlives its author. I wanted that
again, but for a fleet of coding agents instead of a team — with one constraint
the human version never had: an agent has to read the whole thing cheaply,
constantly, with no ceremony. That ruled out the obvious options. A hosted board
means a token and a round trip for every "what's next." A database means state
you can't cat — state you have to interrogate. So the design fell out of the
constraint: keep the work as text, in git, and let "where's the work" be a
question you answer by reading files.
Two layers: RFCs and stories
The tracker is a sibling git repo: twenty-one RFCs over a hundred and eighty stories as I
write this, a hundred and six done. An RFC is the long-lived "why": a numbered
folder whose README carries the motivation, design, alternatives, and open
questions — the durable record of intent I knew from the human world. (The shape
isn't mine: numbered markdown, debated in a PR, numbered on merge, is the model
Rust pioneered and Ember borrowed; Yehuda Katz had a hand in both.) A story is
the new part — one markdown file under its RFC's stories/ folder, a
human-readable body and a machine-readable contract up top:
status: ready # draft → ready → claimed → in-progress → done
rfc: "0003-activerecord-cli"
deps: [] # other stories this one waits on
priority: null # lower = higher in the ready queue
pr: null # the PR is a field on the story, not the reverse
claim: null # who's holding it
That contract is the whole idea. The PR became a field on the work instead of
the work living inside the PR — so a story can sit ready for weeks, get
claimed, and spawn a PR whose number it just picks up along the way. The plan
docs, welded to the PR lifecycle, could never do that.
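As a rough illustration, the contract above could be modeled as a TypeScript type. The field names and the status order come from the front matter shown earlier. Everything else here is an assumption made for the sketch: the type and function names, the one-step-forward transition rule, and treating `pr` as a number.

```typescript
// Status order taken from the comment in the story front matter.
const STATUSES = ["draft", "ready", "claimed", "in-progress", "done"] as const;
type Status = (typeof STATUSES)[number];

// Hypothetical type mirroring the front-matter fields.
interface StoryContract {
  status: Status;
  rfc: string;             // slug of the owning RFC, e.g. "0003-activerecord-cli"
  deps: string[];          // slugs of other stories this one waits on
  priority: number | null; // lower = higher in the ready queue
  pr: number | null;       // assumed numeric; filled in once a PR exists
  claim: string | null;    // who is holding it
}

// Assumption: a story only moves one step forward along the status chain.
function canAdvance(from: Status, to: Status): boolean {
  return STATUSES.indexOf(to) === STATUSES.indexOf(from) + 1;
}

// The PR is a field on the story, so attaching one is a plain field update
// that returns a new object and leaves the original untouched.
function attachPr(story: StoryContract, pr: number): StoryContract {
  return { ...story, pr };
}
```

The point of the sketch is the direction of the reference: the story exists first and picks up a PR number later, not the other way around.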
What plain text buys you
Everything good about it follows from "it's just files in git."
An agent reads it like source. No API to call, no rows to query — a link between stories is a slug in a file, and following one is a file open. The agent walks the directory the way it walks a codebase, because to it they're the same thing.
Git is the store. Status changes are commits; claiming a story is a
git push. The remote rejects a push that isn't based on its current tip, so the
push doubles as the lock. It's crude, but for a handful of agents it's enough,
and there's no lock server to run. A pre-commit hook
regenerates an index, so there's a fast view of the whole queue with nothing
running in the background.
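The push-as-lock idea can be pictured with a toy model. This is not git itself, only a sketch of the compare-and-swap behavior a remote shows when it refuses a push built on a stale tip. `Remote`, `tryPush`, and the agent and story names are all hypothetical.

```typescript
// A claim commit: built on some parent tip, recording who claims which story.
interface Commit {
  id: string;
  parent: string;
  claim: { story: string; agent: string };
}

// Toy remote: accepts a push only if it was built on the current tip.
class Remote {
  head: string;
  private claims = new Map<string, string>();

  constructor(head: string) {
    this.head = head;
  }

  // Compare-and-swap on the branch tip; stale pushes are rejected.
  tryPush(commit: Commit): boolean {
    if (commit.parent !== this.head) return false;
    this.head = commit.id;
    this.claims.set(commit.claim.story, commit.claim.agent);
    return true;
  }

  holder(story: string): string | undefined {
    return this.claims.get(story);
  }
}

// Two agents start from the same tip and race to claim the same story.
const remote = new Remote("c0");
const base = remote.head;
const first = remote.tryPush({ id: "c1", parent: base, claim: { story: "s-1", agent: "agent-a" } });
const second = remote.tryPush({ id: "c2", parent: base, claim: { story: "s-1", agent: "agent-b" } });
// first is true, second is false: agent-b has to fetch, see the claim, and pick another story.
```

Only one of the two pushes can land, which is all a lock needs to guarantee at this scale.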
Strict where design lives, loose where state moves. RFCs go through a pull request, because that's where a design argument belongs. Status flips go straight to main, because a claim or a "done" isn't a decision and shouldn't cost a CI run and a review. The tracker moves at the speed of the work, not the speed of the codebase — the exact gap that was rotting the in-repo docs.
The schedule writes itself. Because each story carries its own dependencies,
priority, and claim state, "what next" stopped being a judgment call and became a
query: pnpm tasks next-bundle returns the unblocked stories, ranked. The loop I
run used to need an LLM in the scheduling seat,
reading prose to decide what was open; now a plain process reads the index, claims
the top story, and hands it to a worker. The intelligence moved into the doing;
the scheduling got boring, which is what you want from a scheduler.
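A query like that needs very little logic. The sketch below is not the actual `pnpm tasks next-bundle` implementation. It only shows the kind of filter and sort the contract makes possible. The `IndexEntry` shape, the `nextReady` name, sorting `null` priorities last, and the slug tie-break are assumptions.

```typescript
// Hypothetical index entry: a slug plus the contract fields the scheduler needs.
interface IndexEntry {
  slug: string;
  status: "draft" | "ready" | "claimed" | "in-progress" | "done";
  deps: string[];
  priority: number | null;
}

// Ready stories whose deps are all done, lowest priority number first.
// Assumption: a null priority sorts after any numbered one; ties break by slug.
function nextReady(index: IndexEntry[]): IndexEntry[] {
  const done = new Set(index.filter((s) => s.status === "done").map((s) => s.slug));
  const rank = (s: IndexEntry): number =>
    s.priority === null ? Number.POSITIVE_INFINITY : s.priority;
  return index
    .filter((s) => s.status === "ready" && s.deps.every((d) => done.has(d)))
    .sort((a, b) => {
      const ra = rank(a);
      const rb = rank(b);
      if (ra !== rb) return ra < rb ? -1 : 1;
      return a.slug.localeCompare(b.slug);
    });
}

const index: IndexEntry[] = [
  { slug: "a", status: "done", deps: [], priority: null },
  { slug: "b", status: "ready", deps: ["a"], priority: 2 },
  { slug: "c", status: "ready", deps: ["d"], priority: 1 }, // blocked: "d" is not done
  { slug: "d", status: "in-progress", deps: [], priority: null },
  { slug: "e", status: "ready", deps: [], priority: null },
];

const queue = nextReady(index).map((s) => s.slug); // ["b", "e"]
```

Story "c" has the best priority but stays out of the queue until "d" is done, which is the judgment call the LLM used to make by reading prose.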
The backlog is writable, so scope has somewhere to go. When an agent finds its story is too big, or notices adjacent work the story never mentioned, it used to have nowhere to put that but the PR it was already in, and the diff would balloon past what anyone could review. Now the overflow becomes new stories, split off into the same RFC, and the PR stays the size a review can hold.
Where it frays
One honest caveat, and it's the real one: this only works because the work is decomposable and there's a hard reference — Rails — to grade each story against. Plain text doesn't make fuzzy work actionable; it just stops adding friction to work that already is. Give it tasks with no concrete acceptance criterion and you'll have vaguer vagueness, now in YAML.
But for work that decomposes, the payoff compounds. Because it's all just files with a generated index, even the dashboard I use to watch the queue was nearly free to build — no schema, no API, no service to stand up first. The whole tracker is a folder of markdown I can clone and read on a plane with no wifi, the same files the agents read. No service to sign into, nothing hidden in a table. The work lives where the agent already is.