First experiences with spec-driven development

Updated 29 May 2026

JetBrains has a Spec-Driven Development with Coding Agents class on deeplearning.ai. I took the class hoping it would help with tokenmaxxing. The idea: spend tokens on the spec so you end up with a long list of tasks, then spend tokens on the tasks with no human intervention, using Auto mode in Claude Code. Anthropic increased the quota limits for Pro subscriptions through a deal with SpaceX, so it was the perfect time to try.

Claude Code ran for about 4 hours without any intervention before I hit my quota limit. The quota is now large enough that I can keep it running for most of the 5-hour window. If I’m only running one agent at a time, this doesn’t seem too bad. While Claude was working, I was looking at opencode as a replacement.

I noticed that the session would end up in a weird state and mess up the creation of new specs. I think this is due to compaction. Starting a new session for each spec should avoid it.

I am using Spec-kit from GitHub. I keep forgetting the optional commands, which make things noticeably better, to the point that I wouldn’t consider them optional, especially if you are tokenmaxxing. The generated tasks also include some parallelization, which is called out explicitly wherever it is possible.
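To show what that explicit parallelization could look like to a tool, here is a minimal sketch that groups pending tasks into batches. It assumes task lines shaped like `- [ ] T002 [P] description`, where `[P]` marks a task that can run in parallel. That line format, the task names, and the helpers `parse_tasks` and `batches` are assumptions for illustration, not something taken from Spec-kit itself.

```python
import re
from dataclasses import dataclass

# Assumed task-line shape: "- [ ] T002 [P] Description".
# The "[P]" (parallelizable) marker is an assumption about the tasks file.
TASK_RE = re.compile(
    r"^- \[(?P<done>[ xX])\] (?P<id>T\d+)(?P<par> \[P\])? (?P<desc>.+)$"
)


@dataclass
class Task:
    id: str
    description: str
    parallel: bool
    done: bool


def parse_tasks(text: str) -> list[Task]:
    """Parse checklist-style task lines; lines that do not match are skipped."""
    tasks = []
    for line in text.splitlines():
        m = TASK_RE.match(line.strip())
        if m:
            tasks.append(
                Task(m["id"], m["desc"], m["par"] is not None, m["done"] != " ")
            )
    return tasks


def batches(tasks: list[Task]) -> list[list[str]]:
    """Group consecutive pending parallel tasks; each sequential task runs alone."""
    out: list[list[str]] = []
    current: list[str] = []
    for task in tasks:
        if task.done:
            continue
        if task.parallel:
            current.append(task.id)
        else:
            if current:
                out.append(current)
                current = []
            out.append([task.id])
    if current:
        out.append(current)
    return out


# Hypothetical task list, for illustration only.
SAMPLE = """\
- [x] T001 Create project skeleton
- [ ] T002 [P] Write model tests
- [ ] T003 [P] Write API tests
- [ ] T004 Implement API
"""

if __name__ == "__main__":
    for batch in batches(parse_tasks(SAMPLE)):
        print(batch)
```

Each inner list is a set of tasks that could be handed to separate agents at the same time. Completed tasks are skipped, so the same file can be re-read after an interrupted run.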

[Figure: Spec-kit workflow]

I spend less time coding and more time waiting, plus more time fixing things that were underspecified. When you build manually, you develop a sense of the spec as you go; with an agent, everything has to be in the spec up front. I think tests are a must, since they help steer the coding agent to the correct solution. As long as something is verifiable, the agent can make progress. Haiku isn’t smart enough. Sonnet is good enough as the default. I haven’t had to break out Opus yet.
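To make the point about verifiability concrete, here is a minimal sketch of one spec line turned into tests an agent can run on its own. The requirement ("apply a 10% discount when the subtotal is at least 100"), the function `cart_total`, and its parameters are all hypothetical, made up for illustration.

```python
from decimal import Decimal


def cart_total(
    prices: list[Decimal],
    discount_threshold: Decimal = Decimal("100"),
    discount_rate: Decimal = Decimal("0.10"),
) -> Decimal:
    """Hypothetical implementation of the spec line above."""
    subtotal = sum(prices, Decimal("0"))
    if subtotal >= discount_threshold:
        subtotal -= subtotal * discount_rate
    # Round to cents so results compare cleanly.
    return subtotal.quantize(Decimal("0.01"))


# Each test pins down one sentence of the spec, including the boundary case.
def test_no_discount_below_threshold():
    assert cart_total([Decimal("40"), Decimal("59.99")]) == Decimal("99.99")


def test_discount_applies_at_threshold():
    assert cart_total([Decimal("60"), Decimal("40")]) == Decimal("90.00")


def test_empty_cart_is_zero():
    assert cart_total([]) == Decimal("0.00")


if __name__ == "__main__":
    test_no_discount_below_threshold()
    test_discount_applies_at_threshold()
    test_empty_cart_is_zero()
    print("all checks passed")
```

The boundary test is the useful one: "at least 100" versus "more than 100" is exactly the kind of detail that is easy to leave out of a spec, and a failing test gives the agent something unambiguous to fix.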

I need to take on increasingly difficult, longer-horizon tasks to know whether AI capability is improving.

We are at the point where the agent can slog away all night and you can wake up to an application with some minor issues caused by underspecification.

Another issue is that modifying the code by hand is a bad idea, because the coding agent loses track of the changes. It refers to what it wrote, not to what is currently on disk. If you want to make changes, it is best to have the agent make them.

There is so much code written that it doesn’t feel worth it to look at it. If you are only after the end product, the code is irrelevant. All that matters is whether the application is within spec and feels right. You could change one word in the spec (the framework, the programming language, etc.) and all the code would be different. Also, work stops when you run out of tokens, and you have to wait for more before it can continue. In this way, AI is like electricity: work stops when the lights are out. Being paralyzed without tokens makes you feel like an addict.

Spec-driven development shines in greenfield projects, but I haven’t tried it on brownfield projects. If you have a new idea for something, you can spend an hour on the spec, go to sleep, and wake up with an app.

What used to be a two-day weekend hackathon for a team of two is now a one-hour, single-person specification session followed by an agent spending tokens through the night. After that comes another hour or two of fixing the bugs and underspecified behavior you find when you try to use the app.