Task Tracking in Plain Text

For months, the work on my Rails-in-TypeScript port lived in markdown plan docs inside the code repo — a docs/ folder of checklists where every - [ ] was a future PR. The cost crept up on me. Because the docs sat in the code repo, every edit inherited that repo's weight: a full CI suite and a review ran on each change, so marking a task done was itself a pull request — open it, wait for CI, get a review, merge. Nobody does that one box at a time, so cleanup got batched, and batched cleanup is cleanup that doesn't happen. Agents would sometimes burn a whole PR just to tick a box.

I've written before about loving RFCs: durable, asynchronous reasoning that outlives its author. I wanted that again, but for a fleet of coding agents instead of a team, with one constraint the human version never had: an agent has to read the whole thing cheaply, constantly, with no ceremony. That ruled out the obvious options. A hosted board means a token and a round trip for every "what's next." A database means state you can't cat, state you have to interrogate. So the design fell out of the constraint: keep the work as text, in git, and let "where's the work" be a question you answer by reading files.

Two layers: RFCs and stories

The tracker is a sibling git repo: twenty-one RFCs over a hundred and eighty stories as I write this, a hundred and six done. An RFC is the long-lived "why": a numbered folder whose README carries the motivation, design, alternatives, and open questions, the durable record of intent I knew from the human world. (The shape isn't mine: numbered markdown, debated in a PR, numbered on merge, is the model Rust pioneered and Ember borrowed; Yehuda Katz had a hand in both.) A story is the new part: one markdown file under its RFC's stories/ folder, with a human-readable body and a machine-readable contract up top:

status: ready        # draft → ready → claimed → in-progress → done
rfc: "0003-activerecord-cli"
deps: []             # other stories this one waits on
priority: null       # lower = higher in the ready queue
pr: null             # the PR is a field on the story, not the reverse
claim: null          # who's holding it

That contract is the whole idea. The PR became a field on the work instead of the work living inside the PR — so a story can sit ready for weeks, get claimed, and spawn a PR whose number it just picks up along the way. The plan docs, welded to the PR lifecycle, could never do that.
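To make the contract concrete, here is a minimal sketch of how a tool might read it. Only the six field names and the five statuses come from the contract above; `parseContract` and its helpers are hypothetical, the `pr` field is assumed to hold a PR number, and the parser handles only flat `key: value  # comment` lines rather than full YAML.

```typescript
// Sketch only: parseContract and its helpers are hypothetical names, and this
// handles just the flat "key: value  # comment" lines shown above, not full YAML.
type Status = "draft" | "ready" | "claimed" | "in-progress" | "done";

interface StoryContract {
  status: Status;
  rfc: string;
  deps: string[];          // slugs of stories this one waits on
  priority: number | null; // lower = higher in the ready queue
  pr: number | null;       // assumed to be a PR number
  claim: string | null;    // assumed to be an agent identifier
}

const STATUSES: readonly string[] = ["draft", "ready", "claimed", "in-progress", "done"];

// Drop a trailing "# comment" and one pair of surrounding double quotes.
function cleanValue(raw: string): string {
  return raw.replace(/(^|\s)#.*$/, "").trim().replace(/^"(.*)"$/, "$1");
}

// Parse an inline list such as [] or [a, "b"] into plain strings.
function parseList(value: string): string[] {
  const inner = value.replace(/^\[|\]$/g, "").trim();
  if (inner === "") return [];
  return inner.split(",").map((item) => item.trim().replace(/^"(.*)"$/, "$1"));
}

// Treat a missing, empty, or literal "null" value as null; otherwise convert it.
function orNull<T>(value: string | undefined, convert: (v: string) => T): T | null {
  return value === undefined || value === "" || value === "null" ? null : convert(value);
}

function parseContract(text: string): StoryContract {
  const fields = new Map<string, string>();
  for (const line of text.split("\n")) {
    const match = /^([a-z]+):(.*)$/.exec(line.trim());
    if (match) fields.set(match[1], cleanValue(match[2]));
  }
  const status = fields.get("status") ?? "";
  if (!STATUSES.includes(status)) throw new Error(`unknown status: "${status}"`);
  return {
    status: status as Status,
    rfc: fields.get("rfc") ?? "",
    deps: parseList(fields.get("deps") ?? "[]"),
    priority: orNull(fields.get("priority"), Number),
    pr: orNull(fields.get("pr"), Number),
    claim: orNull(fields.get("claim"), String),
  };
}

// Usage: the contract shown above, read back as a typed object.
const example = parseContract([
  "status: ready        # draft → ready → claimed → in-progress → done",
  'rfc: "0003-activerecord-cli"',
  "deps: []             # other stories this one waits on",
  "priority: null       # lower = higher in the ready queue",
  "pr: null",
  "claim: null",
].join("\n"));
```

Rejecting an unknown status at parse time is a design choice in this sketch, not something the contract requires; it keeps a typo from quietly removing a story from the queue.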

What plain text buys you

Everything good about it follows from "it's just files in git."

An agent reads it like source. No API to call, no rows to query — a link between stories is a slug in a file, and following one is a file open. The agent walks the directory the way it walks a codebase, because to it they're the same thing.

Git is the store. Status changes are commits; claiming a story is a git push. A push to main either lands or is rejected because someone else got there first, so it doubles as the lock: crude, but for a handful of agents it's enough, and there's no lock server to run. A pre-commit hook regenerates an index, so there's a fast view of the whole queue with nothing running in the background.
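The claim-by-push idea can be sketched in a few lines. This is an illustration under assumptions, not the tracker's actual code: `Git`, `tryClaim`, and `writeClaim` are hypothetical names, the remote is assumed to be `origin` with a `main` branch, and the git runner is injected so the flow can be exercised without a real repository.

```typescript
// Sketch only: Git, tryClaim and writeClaim are hypothetical names. The Git
// runner is assumed to throw when git exits with a non-zero status.
type Git = (args: string[]) => void;

// Attempt to claim one story. Returns true if the push landed, false if the
// remote rejected it because main had moved in the meantime.
function tryClaim(
  git: Git,
  writeClaim: () => void, // edits the story file, e.g. status: claimed, claim: <agent>
  storyPath: string,
  agent: string,
): boolean {
  git(["pull", "--rebase", "origin", "main"]); // start from the latest queue state
  writeClaim();
  git(["add", storyPath]);
  git(["commit", "-m", `claim ${storyPath} for ${agent}`]);
  try {
    git(["push", "origin", "HEAD:main"]); // a non-fast-forward push is rejected
    return true;
  } catch {
    git(["reset", "--hard", "HEAD~1"]); // drop our claim commit; the caller may retry
    return false;
  }
}
```

A real runner could wrap Node's `execFileSync("git", args, { cwd })`, which throws on a non-zero exit. Note that this sketch treats any rejected push as a lost race, even when the competing commit touched a different story; a fuller version would pull again, check that the story is still ready, and retry.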

Strict where design lives, loose where state moves. RFCs go through a pull request, because that's where a design argument belongs. Status flips go straight to main, because a claim or a "done" isn't a decision and shouldn't cost a CI run and a review. The tracker moves at the speed of the work, not the speed of the codebase — the exact gap that was rotting the in-repo docs.

The schedule writes itself. Because each story carries its own dependencies, priority, and claim state, "what next" stopped being a judgment call and became a query: pnpm tasks next-bundle returns the unblocked stories, ranked. The loop I run used to need an LLM in the scheduling seat reading prose to decide what was open; now a plain process reads the index, claims the top story, and hands it to a worker. The intelligence moved into the doing; the scheduling got boring, which is what you want from a scheduler.
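As a rough illustration of why this is a query rather than a judgment call, here is a sketch of what such a ranking could look like. The post names the command but not its implementation, so `IndexedStory`, `nextBundle`, and the sample slugs are all hypothetical; the sketch also assumes a dependency is satisfied only when that story is done, that stories with no priority sort last, and that ties break by slug.

```typescript
// Sketch only: IndexedStory, nextBundle and the sample slugs are hypothetical.
type Status = "draft" | "ready" | "claimed" | "in-progress" | "done";

interface IndexedStory {
  slug: string;
  status: Status;
  deps: string[];          // slugs this story waits on
  priority: number | null; // lower = higher in the ready queue
  claim: string | null;
}

// Return the ready, unclaimed stories whose dependencies are all done, ranked.
function nextBundle(stories: IndexedStory[]): IndexedStory[] {
  const done = new Set(stories.filter((s) => s.status === "done").map((s) => s.slug));
  // Assumption: a null priority ranks after every numbered priority.
  const rank = (s: IndexedStory): number => s.priority ?? Number.POSITIVE_INFINITY;
  return stories
    .filter((s) => s.status === "ready" && s.claim === null && s.deps.every((d) => done.has(d)))
    .sort((a, b) => {
      if (rank(a) !== rank(b)) return rank(a) < rank(b) ? -1 : 1;
      return a.slug < b.slug ? -1 : a.slug > b.slug ? 1 : 0; // stable tie-break by slug
    });
}

// Usage: "rollback" is ready but blocked, because "seed" is still a draft.
const queue: IndexedStory[] = [
  { slug: "schema", status: "done", deps: [], priority: null, claim: null },
  { slug: "migrate", status: "ready", deps: ["schema"], priority: 2, claim: null },
  { slug: "rollback", status: "ready", deps: ["seed"], priority: 1, claim: null },
  { slug: "seed", status: "draft", deps: [], priority: null, claim: null },
  { slug: "console", status: "ready", deps: [], priority: 1, claim: null },
  { slug: "docs", status: "ready", deps: [], priority: null, claim: null },
];
const picked = nextBundle(queue).map((s) => s.slug);
```

Nothing here needs a model in the loop: the function is a filter and a sort over fields every story already carries.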

The backlog is writable, so scope has somewhere to go. When an agent finds its story is too big, or notices adjacent work the story never mentioned, it used to have nowhere to put that but the PR it was already in, and the diff would balloon past what anyone could review. Now the overflow becomes new stories, split off into the same RFC, and the PR stays the size a review can hold.

Where it frays

One honest caveat, and it's the real one: this only works because the work is decomposable and there's a hard reference — Rails — to grade each story against. Plain text doesn't make fuzzy work actionable; it just stops adding friction to work that already is. Give it tasks with no concrete acceptance criterion and you'll have vaguer vagueness, now in YAML.

But for work that decomposes, the payoff compounds. Because it's all just files with a generated index, even the dashboard I use to watch the queue was nearly free to build — no schema, no API, no service to stand up first. The whole tracker is a folder of markdown I can clone and read on a plane with no wifi, the same files the agents read. No service to sign into, nothing hidden in a table. The work lives where the agent already is.
