The Checklist is Almost Done
Musings on product marketing, as Anthropic starts running out of backlog
There’s a specific kind of product announcement that only happens when a team has run out of existential problems and wild brainstorms and started working down the backlog. You can recognize it by a certain quality of... inevitability. Not “we built something surprising” but “we built the obvious thing that was obviously next, obviously.” The feature exists not because someone had a vision, but because there’s an empty checkbox on some product manager’s list and we’ve got nothing better to do than ship it.
Anthropic has been shipping a lot of these lately. It’s what happens when a product approaches maturity.
Voice mode. Enterprise plugin marketplace. `/code-review`. `/security-review`. `/simplify`. `/batch`. Claude in Chrome. Claude in Excel. Cowork. Scheduled commands. These are real, useful, well-executed features. They are also, nearly without exception, features that were completely implicit in the product vision from the moment Claude Code launched. I’m fairly confident I could find Slack messages from two-and-a-half years ago speculating about most of them. Not because anyone was clairvoyant — because they were *obvious*. A coding agent would eventually need a security review command. An enterprise product would eventually need a marketplace. A voice interface was always going to happen. These aren’t innovations; they’re obligations.
I know they were obvious because I built them myself. Not intentionally, and not as a product — but eight months ago I put together something I called Bad Dave’s Robot Army (BDRA): 34 specialized Claude Code agents in a hierarchical architecture, designed to do systematic codebase analysis and lightweight product management. The thing I built as an experiment in theoretical software architecture turns out to be, in retrospect, a pretty decent product requirements document for Anthropic’s roadmap. `/code-review`, `/security-review`, and `/simplify` together cover roughly a third of what BDRA does. `/batch` covers another third. I’m half-expecting Anthropic to ship the remaining third — educational scaffolding for codebases, essentially — sometime in the next few weeks.
There’s a specific kind of humility that comes from realizing you didn’t have a novel idea; you just had an early one. The features I shipped felt like inventions at the time, crackling with potential. They were actually just deductions.
I’m not bitter, truly. Paying down the obvious backlog is how you build a mature product. The alternative — chasing novelty instead of completing the product — is how you get something that’s perpetually interesting but never quite ready.
The Market Yawned
OpenAI recently released a model with somewhat better coding benchmark scores than Anthropic’s latest. The reaction from developers was approximately: a mild, polite shrug. “Oh, you’re still doing that? Good luck, I guess.”
This is remarkable. Eighteen months ago, a benchmark swing like that would have generated real discussion about switching costs, ecosystem decisions, migration strategies. Now it generates... not much. The developer community looked at the numbers, noted them, and went back to what they were doing.
One interpretation: Claude Code users have become inert — captured by switching costs, habituated into complacency. That’s possible. But a more interesting interpretation is that the benchmark stopped being the relevant question. Developers have stopped asking “which model scores highest on HumanEval” and started asking “which tool do I actually want to work inside for eight hours?” Those are different questions, and the second one is much harder to dislodge with a benchmark delta.
When the market stops reacting to benchmark news, it means the market has made a decision.
The Last Holdouts
There’s a specific type of skeptic who’s been unusually quiet recently. Not the “AI is overhyped” crowd — they’re still around — but a more technically sophisticated objector who was arguing, with reasonable evidence, that agentic coding tools wouldn’t actually work at scale. That the context window limitations would be fatal. That the error-compounding problem would make long autonomous sessions unreliable in practice. That the economics of token consumption would make serious usage prohibitive.
These were real arguments. They were made by smart people who had actually tried the tools. And they have largely gone quiet. Not because they were beaten in debate, but because the tools started working well enough that the objections became moot. Epic’s Seth Hain recently mentioned that over half of their Claude Code usage is now by *non-developer roles*. That’s not a sentence you say when a tool is still in the “technically promising but practically limited” phase. That’s a sentence you say when you’ve started the deep work of refactoring your enterprise workflows around the tool.
The last holdouts have laid down their arms.
What Product Maturity Actually Means
Product maturity is a funny thing to declare. It sounds like stasis, like the end of something. It isn’t. What it means, in practice, is that the core value proposition has stopped being in question. The product no longer needs to prove that it works. It works. You’ve gotten past the customers that marketing books classify as “innovators” or “early adopters” and are making serious inroads into the mass market. The remaining work — the marketplace, the slash commands, the voice, the enterprise integrations — is extension and polish and business development. Important work, but a different category of work.
Claude Code launched in February 2025 as a “limited research preview.” Anthropic described its goals carefully and modestly: understand how developers use Claude for coding, improve tool call reliability, add support for long-running commands. The hedging was honest. It wasn’t clear yet whether the thing would actually work.
Thirteen months later, it has over $2.5B ARR (more than doubling since January) and the technically sophisticated skeptics have gone quiet. That’s the whole story, and it happened faster than anyone predicted — including the people building it.
The Compressed Maturity Cycle
Here’s the part that deserves more attention than it’s getting: we just watched a developer tool go from proof-of-concept to product maturity in roughly sixteen months, crossing two billion dollars in revenue along the way. That timeline would have been considered science fiction for a software product as recently as 2022. It makes the WhatsApp growth and acquisition story seem kinda slow and stodgy.
The conventional wisdom on enterprise software maturity — even for beloved developer tools — was measured in years. You needed time for trust to accumulate, for the horror stories to get resolved, for the champions inside large organizations to fight the internal battles. npm took the better part of a decade to become the water developers swam in. Docker, from launch to genuine enterprise ubiquity, was a similar story.
Something structural changed. “AI makes iteration faster” is probably part of it, but that explanation is a little too convenient and a little too circular. I suspect the more interesting explanation involves the state of customer readiness — a community that had spent years being primed for exactly this kind of tool, so that when it worked well enough, adoption compressed dramatically. The proof-of-concept phase ended fast because developers knew what they were looking at.
This matters beyond the Claude Code story. If the product maturity cycle has genuinely compressed to eighteen months, then the next category-defining tool could go from “interesting research preview” to “the thing everyone just uses” before most people have finished forming an opinion about whether it will work.
How Maturity Ends
Which raises the uncomfortable question that any honest declaration of product maturity has to address: how does this end?
The failure mode that concerns me isn’t a competitor beating Claude Code on features — that’s just the benchmark problem again, and we’ve established the market doesn’t particularly care. And it’s not a reliability failure; the tool is past the point where that narrative can get traction.
The concerning failure mode is being *surrounded*. Not losing a fight but having the level of abstraction shift underneath you.
npm reached product maturity in the mid-2010s — totally won the mindshare race, developers stopped asking whether to use it, the checklist was done. And then it just... kept being npm. The tools that were supposed to displace it (Bun, pnpm, Yarn) mostly didn’t, or only partially did, and npm is still the default a decade later. Sometimes mature just means mature, and the next thing never quite arrives.
But then there’s Garmin circa 2007. Clearly the winner, clearly mature, everyone had stopped asking whether dedicated GPS navigation worked. The hardware was good, the maps were good, the UI was refined from years of iteration. And then the smartphone absorbed the use case without even trying. Google Maps didn’t win a GPS competition — it didn’t even think of itself as a GPS replacement. It was just a map application on a device people were already carrying. The product didn’t fail; the abstraction level shifted. Garmin’s mature, polished, best-in-class navigation device turned out to be a feature, not a product, and five years after peak Garmin, asking someone where they kept their GPS unit would earn you a confused look. Smartphones simply ate the product.
I don’t know which of these futures Claude Code is headed toward. I’m not sure anyone does. What I’m confident about is that if Claude Code gets displaced, it won’t be by a tool with better slash commands. It will be by a new product category that makes the current interaction model — a highly capable agent you personally direct — look like the layer below the interesting problem.
What that category would be, I genuinely have no idea. But if it exists, we might find out in eighteen months or less. It might be just about to ship v0.1 now out of some skunkworks in Miami or Guangzhou, and be market dominant before the next presidential election is underway. Goodness, but I love living in the future.
This post was constructed with the able assistance of Claude Sonnet 4.6, who appears to be growing into maturity with grace and aplomb.


I think Anthropic has kind of answered one of the larger questions already: if Anthropic’s greatest weakness is being “surrounded,” then its behavior right now — attacking every possible surface area around it — looks like an attempt to manage that risk, or to future-proof itself.
Is that defending the moat? Or building it? Or both? There will always be competitors, but if Claude is literally everywhere, then embedded habits and behavior may reign supreme.
How many people have used Microsoft’s products (Word, Excel, PowerPoint) and have almost no incentive to ever leave? A lot. So many people.