file://writing.log

Writing

Long-form technical writing on agent tooling and evals. Pieces ship when they survive review, not on a schedule.

Nothing published here yet. The standard for this page: a piece about a real shipped thing that teaches one transferable idea, with claims you can check. If a post would not earn a bookmark from someone who works on this stuff, it does not ship.

Up next · Summer 2026

A 0-21 rubric for how well a CLI serves an AI agent

Seven axes, each scored 0-3, for a total out of 21. I score my own four CLIs first, including where they fail, then ship one change to the worst scorer and report the delta.
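A minimal sketch of the scoring arithmetic, to make the shape of the rubric concrete. The axis names are placeholders (`axis_1` to `axis_7`), since the real axes are not listed here, and `Scorecard`, `worst`, and `delta` are hypothetical names for illustration, not the tooling the piece will use.

```python
from dataclasses import dataclass

# Placeholder axis names: the actual seven axes are not named on this page.
AXES = tuple(f"axis_{i}" for i in range(1, 8))
MAX_PER_AXIS = 3  # each axis is scored 0-3, so the total ranges 0-21


@dataclass(frozen=True)
class Scorecard:
    """One CLI's scores across all seven axes (hypothetical structure)."""

    cli: str
    scores: dict[str, int]

    def __post_init__(self) -> None:
        # Require exactly the seven axes, no more and no fewer.
        if set(self.scores) != set(AXES):
            raise ValueError(f"{self.cli}: expected exactly the axes {AXES}")
        # Require whole-number scores within 0..MAX_PER_AXIS.
        for axis, score in self.scores.items():
            if not isinstance(score, int) or not 0 <= score <= MAX_PER_AXIS:
                raise ValueError(f"{self.cli}: {axis} must be an int in 0-3")

    @property
    def total(self) -> int:
        """Sum of the per-axis scores, between 0 and 21."""
        return sum(self.scores.values())


def worst(cards: list[Scorecard]) -> Scorecard:
    """Return the lowest-total scorecard (first one wins on ties)."""
    return min(cards, key=lambda card: card.total)


def delta(before: Scorecard, after: Scorecard) -> dict[str, int]:
    """Per-axis change between two scorings of the same CLI."""
    return {axis: after.scores[axis] - before.scores[axis] for axis in AXES}
```

Under these assumptions, reporting the delta means scoring the worst CLI again after the change and comparing the two scorecards axis by axis, so a gain on one axis cannot hide a regression on another behind an unchanged total.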