Julietta Yaunches

AI engineer & researcher at NVIDIA. (the opinions on this site are mine alone)

Blog Ideation: The Pragmatic Middle Ground in Agentic Coding

Original Idea

An article about a practical middle-ground approach to agentic coding that sits between the chaos of “Ralph Wiggum” bash loops (simple iteration until tests pass) and the complexity of Steve Yegge’s Gas Town (20-30 coordinated agents with roles and persistent state). The approach:


Research Summary

See the detailed research in the /research/ folder.

Key Findings

  1. Mathematical evidence supports simplicity: Google’s December 2025 study (180 experiments) found that for sequential tasks, adding more agents reduces performance by 39-70% when a single agent already succeeds 45%+ of the time.

  2. The microservices parallel is explicit in 2025 literature: Multiple articles draw the comparison. Amazon Prime Video abandoned microservices, returned to a monolith, and cut costs by 90%. “The religious wars are over. Pragmatism won.”

  3. TDD is the validation mechanism, not just a preference: Anthropic calls it their “favorite workflow.” Tests provide “reliable exit criteria.” But there’s a documented failure mode: agents game tests. Solution: separate contexts for test writing vs implementation.

  4. Context limits are real constraints: Models effectively use only 8K-50K tokens regardless of window size. 70% of paid tokens provide minimal value. Working within limits is the pragmatic approach.

  5. Named patterns for the middle ground exist: Microsoft documents “Modular Monolith for AI Agents.” ReAct is explicitly the “middle ground.” Spec-Driven Development (SDD) emerged as a major 2025 paradigm.

  6. Git worktrees enable parallel independence: Standard solution for running multiple Claude Code instances on different features without merge conflicts.
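
Findings 3 and 6 can be sketched together as a rough illustration: each session gets its own worktree, and each session loops only until its tests pass or an attempt cap is hit. This is a minimal sketch, not a description of any particular tool. The function names (`worktree_add_cmd`, `tdd_gate`), the `wt-` and `feature/` naming scheme, and the attempt cap are assumptions made for illustration. Only the `git worktree add -b <branch> <path>` command shape comes from Git itself. The test runner and the "attempt a change" step are passed in as plain callables, so nothing here assumes how an agent is actually invoked.

```python
from pathlib import Path
from typing import Callable, List


def worktree_add_cmd(repo_parent: Path, feature: str) -> List[str]:
    """Build the `git worktree add` command for one isolated session.

    The directory and branch naming scheme is hypothetical; the command
    is only constructed here, not executed.
    """
    path = repo_parent / f"wt-{feature}"
    return ["git", "worktree", "add", "-b", f"feature/{feature}", str(path)]


def tdd_gate(
    run_tests: Callable[[], bool],
    attempt_change: Callable[[int], None],
    max_attempts: int = 5,
) -> int:
    """Repeat change-then-test until the tests pass.

    Returns the attempt number on which the tests first passed,
    or -1 if the cap was reached without a passing run.
    """
    for attempt in range(1, max_attempts + 1):
        attempt_change(attempt)  # e.g. ask the agent for another edit
        if run_tests():          # tests are the exit criterion
            return attempt
    return -1
```

In use, the list returned by `worktree_add_cmd` would be handed to something like `subprocess.run(cmd, check=True)` once per feature. `tdd_gate` would then run inside each worktree with `run_tests` wrapping the project's real test command. The cap is there so a session that cannot go green stops and surfaces the failure instead of looping indefinitely.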


Persona Feedback Summary

Andrej Karpathy

Excited by: The TDD mechanism is concrete and implementable. Token limits as design constraint, not bug. Files as state, not prompts. Parallelization through isolation.

Concerns: “Simplicity” and “consistent architecture” are undefined. The validation loop needs more technical detail. Missing: how does context actually flow? Acceptance criteria quality is glossed over.

Would reshape as: Start with the simplest case. Show an information flow diagram. Include runnable examples: real CLAUDE.md snippets, real acceptance criteria. Let the reader implement after reading.

Paul Graham

Excited by: “I’m not sure I’m ready to run a gas town” is the hook: counterintuitive honesty. The inversion of the usual narrative (simple can be better). There’s real observation here: coordination chaos vs independence success. The gap is genuinely underserved.

Concerns: Title/framing is backwards: “middle ground between X and Y” is boring. Too much deference to Gas Town. “Simplicity, consistent architecture, TDD” is vague precision. Missing the contrarian flip.

Would reshape as: Start with anomaly: “My 3-session parallel workflow outperformed colleagues running 15+ coordinated agents.” Why does coordination fail? The insight: maybe Gas Town isn’t the future for most work.

Martin Fowler

Excited by: This could become a named pattern (“TDD-Gated Parallel Sessions”). Trade-off awareness is evident. Grounded in actual practice. Team context consideration.

Concerns: Terms need definition. Context boundaries are fuzzy. Missing: the failure modes. The “parallel sessions” pattern needs structure.

Would reshape as: Pattern description format, with sections for Context, Problem, Solution, When to Use, When NOT to Use, Trade-offs, Example, and Variations.


Proposed Outline

Working Title: “Not Ralph, Not Gas Town: A Grounded Middle-Ground”

Outline

1. Introduction: The Chaos Spectrum

2. The Mathematical Case Against Complexity

3. The Microservices Lesson

4. The Pattern: TDD-Gated Parallel Sessions

5. The Validation Loop in Practice

6. Parallel Without Collaboration

7. When This Works and When It Doesn’t

8. For the Rest of Us (Stage 5-6 Developers)

9. Conclusion: Build the Simplest Architecture That Solves Your Actual Problems

Rationale

This structure addresses all three personas:

The research supports every major claim with citations, particularly the Google study on coordination failure and the microservices parallel. The gap in the competitive landscape (no article addresses “Stage 5-6” positioning explicitly) is directly targeted.