I have a Docker image with all the dev tools installed (uv, cargo, npm, ...). I run one agent (Gemini CLI in my case, but it doesn't matter which) per container, so with multiple tasks I spawn multiple containers.
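A minimal sketch of what such an image could look like -- the base image, package choices, and the `devbox`/`gemini` names are all assumptions, not the commenter's actual setup:

```dockerfile
# Sketch of a dev-tools image; base image and install steps are assumptions
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y curl git cargo nodejs npm \
    && rm -rf /var/lib/apt/lists/*
# uv via its standalone installer (installs to ~/.local/bin)
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
ENV PATH="/root/.local/bin:${PATH}"
# one agent per container, e.g.:
#   docker run -it --rm -v "$PWD":/work -w /work devbox gemini
```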
I have an AGENTS.md in each GitHub repo, with something like: "this is a Python backend that has purpose P. Whenever you make a change, run `uv run ty check`, then `uv run pytest`; and the coding style is [...]".
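Spelled out as a file, that description could look like the sketch below (the project details are placeholders from the comment, not a real repo):

```markdown
# AGENTS.md (sketch -- project details are placeholders)

This is a Python backend whose purpose is P.

After every change you make:

1. Run `uv run ty check`
2. Run `uv run pytest`

Coding style: [...]
```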
For each task I want to hand off to an async agent, I create a PLAN.md which starts with a general description like "we want to add functionality F", followed by a rather detailed spec like "1. Create a new DB entity schema with user id, foreign key to T, 2. Implement read and insert functions in the db module, 3. Implement a function f(a: T1, b: T2) -> T3 that does [...]". I start the agent's CLI and tell it: "read the plan, implement only step 1, then run tests, then wait for further instructions".
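As a file, that plan structure could look like this sketch (the entity, module, and signature names are the comment's own placeholders):

```markdown
# PLAN.md (sketch -- names and signatures are placeholders)

We want to add functionality F.

1. Create a new DB entity schema with user id and a foreign key to T.
2. Implement read and insert functions in the db module.
3. Implement a function `f(a: T1, b: T2) -> T3` that does [...].
```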
A few minutes later I check whether it succeeded and do `git diff`. If I'm not happy, I tell it "fix this and that"; otherwise I commit (no push yet) with message "step 1" and tell the agent "proceed with step 2, then run tests, then wait for further instructions".
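The commit-per-step loop can be sketched as plain git commands -- here in a throwaway repo with made-up file contents standing in for the agent's work:

```shell
# Sketch of the commit-per-step loop, in a throwaway repo (file names are made up)
repo=$(mktemp -d) && cd "$repo" && git init -q
g() { git -c user.name=me -c user.email=me@example.com "$@"; }
echo "def f(): pass" > app.py && g add -A && g commit -qm "baseline"

# agent implements step 1; review the diff before committing
echo "def g(): pass" >> app.py
git diff                        # happy? then commit (no push yet)
g add -A && g commit -qm "step 1"

# agent goes off the rails on step 2 -> throw the attempt away
echo "garbage" >> app.py
git restore .                   # back to the last good commit
```

Committing after each step is what makes `git restore .` a cheap escape hatch: the worst case only ever loses one step's worth of work.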
Sometimes the agent goes crazy and loops, or implements things very badly -- then I `git restore` everything (that's why I commit after every step!), refine the plan or implement it myself, and start the CLI anew.
Writing the PLAN.md takes me quite some time, sometimes an hour. But I really enjoy that -- that is the "true" programmer work for me -- and sometimes I decide afterwards that I'm going to implement it all myself, but having that PLAN.md is still valuable! E.g. if I need to interrupt the work. I don't need to be exact or 100% correct with function signatures, names, types, typos, etc.; the agents are quite good at getting my intent. But usually writing more is better.
When I have multiple agents at once, they only rarely run truly in parallel; most of the time they're waiting for my input. When I'm doing deep work I don't check on them too often, but when I'm on many calls in a day, or reading many articles, I check on them with higher frequency. It's more like async than multiprocessing :) The point of having multiple of them at once is that when I do get a "let's review someone" opportunity, I want at least one agent to be finished with its current step. That's another reason for writing the plan granularly -- each step should be fast to implement and fast to review. As you say, it's very easy to review implementations of one's own intent. And the context switching is actually not that bad compared to other context switches we suffer from (like an "urgent Slack message" or a "system is down" alert), because it's voluntary and cached.
And I only push after the full plan is implemented, usually squashing a bit, and I archive the PLAN.md as a "design spec" in the repo itself.
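The finishing move -- squash the per-step commits, archive the plan, then push -- could look like this sketch in a throwaway repo (branch and path names like `feature-f` and `docs/plan-f.md` are assumptions):

```shell
# Sketch: squash per-step commits and archive the plan, in a throwaway repo
repo=$(mktemp -d) && cd "$repo" && git init -q -b main
g() { git -c user.name=me -c user.email=me@example.com "$@"; }
echo "plan" > PLAN.md && g add -A && g commit -qm "init"

git checkout -q -b feature-f                 # the agent's working branch
echo "one"  > f.py && g add -A && g commit -qm "step 1"
echo "two" >> f.py && g add -A && g commit -qm "step 2"

git checkout -q main
git merge -q --squash feature-f              # collapse the per-step commits
mkdir -p docs && git mv PLAN.md docs/plan-f.md   # archive plan as design spec
g commit -qm "Add functionality F"
# a single commit lands on main; push when happy
```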
I use Beyond Better (https://beyondbetter.app) - it runs in CLI or browser (or REST API) and works with models from all providers. The pattern I find useful is working on multiple projects in different browser tabs. The orchestrator/agent mode of Beyond Better (BB) means I can set it off running on larger tasks and then switch to a different tab and continue working on a different project (or task).
It's easy to add cloud services (Google Drive, etc.), so research results can be shared with others.
Hi Simon! We're thinking along the same lines with parallel coding. I wrote up my thoughts on running a "software atelier" here: https://www.linkedin.com/pulse/vibing-painting-douglas-squirrel-d7ffe/
Glad you're at the forefront describing and trying out the tech practices that make such a structure work!
Awesome article. I actually wrote about the operation modes (in Spanish) a few weeks ago because I still struggle with parallel “vibe coding”
https://rlbisbe.net/2025/09/25/llm7-modelos-mentales-y-concentracion/
I do fire up some research tasks as you were mentioning, but I definitely struggle with handling "write" tasks. I'll give your patterns a try.
Thanks for the posts!