So You Wanna Vibe Code

A framework for the journey into AI-assisted development: levels, tools, mentality, and the plan-delegate-test loop.

2025 was by far my most productive year, and it came during the most time-constrained period of my life. My assessment is that in the last 1.5 years, vibing a modest feature has gone from a highly frustrating experience with a dozen iteration loops to a structured plan, delegate, and test loop. My general sense is that I'm 4-6 times more productive than I was 1.5 years ago.

Here's the high-level framework I use, offered as a template for your journey into vibe coding. This blog post describes the 8 levels of "Yegge's Developer-Agent Evolution Model."

TLDR

Try to start at level 2 or level 3 (Stage 2 and Stage 3 in Yegge's model):

Stage 2: Coding agent in IDE, permissions turned on. A narrow coding agent in a sidebar asks your permission to run tools.
Stage 3: Agent in IDE, YOLO mode. Trust goes up. You turn off permissions, agent gets wider.

The best long-term choice for vibing is Claude Code. The Claude Code harness, built around Anthropic's own models and tools, just seems to outperform the same models running in Cursor, Antigravity, Copilot, Codex, OpenCode, Amp, etc. Claude Code's plugin system is great and is accelerating what is possible.

Alternatives such as the open-source OpenCode might be worth trying. Commercial alternatives: Gemini CLI, Codex from OpenAI, and Amp (ad-supported for free usage).

Start a new project, and don't try to vibe in existing projects. Avoid the anxiety and learn in a safe environment where you don't care if you throw the code away.

My primary project (fantasy baseball) is on its 3rd codebase. But I've also worked on two small projects (macOS screensaver, and OSS window manager ShiftIt) using programming languages I have zero knowledge of: Rust, Objective-C, and Swift.

Mentality

The 10x engineer is actually possible with great planning, research, task management, delegation to agents, and feedback loops.
Writing great feature and bug tickets is where you spend most of your time.
Planning and vibing with agents will produce great code 80-90% of the time.
Expect failures, slop, and refactoring as a normal part of vibing.
Top-tier models with proper guidance write better code than 80-90% of engineers.
Agents will write a lot more code than you would have written. You have to learn to be ok with that.

My Experience Since October 2024

I used about 3 billion tokens inside of Cursor in 2025. I'd guess another billion across Claude, Antigravity, Copilot, and OpenCode. It feels like I'm at least 4-6 times more productive than I was 1.5 years ago.

You really want to spend most of your time planning what to build using deep research agents. I think it's important to be directional with your plans and not too rigid. Unless you really care about which tool is used, tell the research agent what matters for the feature and what your future plans are for the thing you are building, and let it propose the tools.

For example:

"I want to implement a caching frontend and backend for our 3rd party API calls because the data doesn't change more than once a day. Please use Redis for the backend, but please make several suggestions for frontend and backend libraries. Provide multiple architecture options."

There's a lot of value in seeing several options explained. And you'll want to limit negative instructions, such as "do not use these CSS libraries: Bootstrap, YUI, SCSS, inline CSS, or Materialize."
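To make the caching request in the example prompt concrete, here is a minimal sketch of the underlying idea: keep third-party API responses for up to a day and only call the API on a miss or after expiry. It uses an in-memory dict as a stand-in for Redis, and `DailyCache`, `fetch_player_stats`, and the injectable clock are hypothetical names chosen for illustration. A research agent would likely come back with several library-backed options instead.

```python
import time
from typing import Any, Callable, Dict, Tuple

ONE_DAY_SECONDS = 24 * 60 * 60


class DailyCache:
    """In-memory TTL cache; a stand-in for a Redis-backed cache (assumption)."""

    def __init__(self, ttl_seconds: float = ONE_DAY_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


# Hypothetical third-party call; the counter shows how often the "API" is hit.
api_calls = 0


def fetch_player_stats() -> dict:
    global api_calls
    api_calls += 1
    return {"player": "example", "hits": 3}


cache = DailyCache()
first = cache.get_or_fetch("stats:example", fetch_player_stats)
second = cache.get_or_fetch("stats:example", fetch_player_stats)  # served from cache
```

The second lookup never reaches `fetch_player_stats`, which is the whole point when the upstream data changes at most once a day.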

Spec-driven development makes sense in enterprises and certain scenarios. However, we should be trying to shift our mindset towards fast implementation and loosening our grip over implementation details. As the models and tooling get better, we need experience and time vibing to gain skills and get comfortable with this new paradigm.

Claude Code

Claude Code advantages:

Terminal-native — Works directly in your shell, no context-switching to a browser or IDE
Full filesystem access — Reads, writes, and edits files across your entire project
Agentic workflow — Autonomously explores codebases, runs commands, and iterates on solutions
Git integration — Creates commits, PRs, and handles version control operations
Tool ecosystem — Extensible via MCP servers and plugins for custom workflows
Persistent context — Maintains conversation history with automatic summarization for long sessions
Multi-file reasoning — Understands relationships across large codebases, not just single files
Safe execution — Sandboxed command execution with user approval for destructive operations

The key differentiator is the tight terminal integration — it operates where developers already work, with direct access to run builds, tests, and git commands while reasoning about code changes.

Claude Plugins

What is a plugin?

Installs /slash-commands for Agents, Hooks, Skills, MCP, etc.
Agents — markdown files
Hooks — event driven triggers
Tool management like LSP, permissions
Skills — focused tasks and abilities using markdown, scripts, permissions
MCP — functions that LLMs can call to external APIs (documentation, browser, CRM, SaaS apps, etc.)
Output styles

Official plugins

My recommended official plugin agents: commit-commands, code-review, code-simplifier

MCP servers: context7, supabase

Beads: Local Task Management

Beads is a local Jira-like issue tracker and task-agent that creates a local SQLite database for the LLM to find work.

Workflow

Add Beads issues/tasks manually
Start the task-agent to process your Beads issues
Agent updates and closes issues when work is completed
Agent generates commit message and commits code
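The manual workflow above boils down to a small loop over a local database. The sketch below is not Beads' actual schema or CLI. The `issues` table and the `add_issue`, `next_open_issue`, and `close_issue` helpers are hypothetical, and it uses an in-memory SQLite database purely to show the add, pick up, and close cycle.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical schema for illustration; Beads' real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " title TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'open')"
)


def add_issue(title: str) -> int:
    """Step 1: a human adds an issue by hand."""
    cur = conn.execute("INSERT INTO issues (title) VALUES (?)", (title,))
    conn.commit()
    return cur.lastrowid


def next_open_issue() -> Optional[Tuple[int, str]]:
    """Step 2: the agent asks for the oldest open issue."""
    return conn.execute(
        "SELECT id, title FROM issues WHERE status = 'open' ORDER BY id LIMIT 1"
    ).fetchone()


def close_issue(issue_id: int) -> str:
    """Steps 3 and 4: mark the issue closed and return a commit message."""
    row = conn.execute(
        "SELECT title FROM issues WHERE id = ?", (issue_id,)
    ).fetchone()
    conn.execute("UPDATE issues SET status = 'closed' WHERE id = ?", (issue_id,))
    conn.commit()
    return f"Close #{issue_id}: {row[0]}"


# Made-up issues for the example.
add_issue("Add caching layer for API calls")
add_issue("Fix roster page layout")

commit_messages = []
while (issue := next_open_issue()) is not None:
    issue_id, title = issue
    # A real agent would edit code and run tests here before closing.
    commit_messages.append(close_issue(issue_id))
```

Because the state lives in a database rather than in the conversation, a fresh agent session can pick up where the last one stopped.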

Automated Workflow

Use deep research to generate and update a plan
Tell Claude to turn the deep research into a Beads epic with tasks and dependencies
Start one or more agents, for example a frontend and a backend task-agent
Optionally include test planning in research to add those tasks as well
Manually validate new features
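The reason to record dependencies in the epic is that they tell you which tasks can be handed out right now and which must wait. Here is a rough sketch of that scheduling idea using Python's standard `graphlib` module. The task names are invented, and this is only an illustration of dependency ordering, not how Beads itself resolves work.

```python
from graphlib import TopologicalSorter

# Hypothetical epic: each task maps to the set of tasks it depends on.
epic = {
    "design-cache-schema": set(),
    "backend-cache": {"design-cache-schema"},
    "frontend-cache": {"design-cache-schema"},
    "integration-tests": {"backend-cache", "frontend-cache"},
}

sorter = TopologicalSorter(epic)
sorter.prepare()  # raises graphlib.CycleError if the dependencies form a cycle

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
    waves.append(ready)
    # In practice each ready task would go to its own agent; here the
    # whole wave is simply marked as finished.
    sorter.done(*ready)

for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Each wave is a set of tasks that could run in parallel, so the backend and frontend agents both start once the schema task is closed.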

When Beads Works Well

Persisting context across Claude sessions
Tracking blockers and dependencies
Recovering context after conversation compaction
Solo developer or small team projects
Work that lives entirely in the codebase

Limitations

Agents all run in the project folder; Beads doesn't create separate git worktrees
Running multiple agents can cause file editing conflicts
No sync with GitHub, Jira, Linear
No IDE extensions
No time tracking or sprint planning features
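Since the agents share one working directory, one cheap precaution is to check whether two tasks plan to touch the same files before running them in parallel. The following is a sketch of that check, not a Beads feature. `find_conflicts`, the task names, and the file lists are all made up for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def find_conflicts(planned: Dict[str, Set[str]]) -> List[Tuple[str, str, Set[str]]]:
    """Return (task_a, task_b, shared_files) for every pair of tasks that overlap."""
    conflicts = []
    for task_a, task_b in combinations(sorted(planned), 2):
        shared = planned[task_a] & planned[task_b]
        if shared:
            conflicts.append((task_a, task_b, shared))
    return conflicts


# Hypothetical plan: which files each task expects to edit.
planned_files = {
    "backend-cache": {"api/client.py", "api/cache.py"},
    "frontend-cache": {"web/cache.ts", "web/api.ts"},
    "rename-endpoints": {"api/client.py", "web/api.ts"},
}

conflicts = find_conflicts(planned_files)
for task_a, task_b, shared in conflicts:
    print(f"{task_a} and {task_b} both touch: {', '.join(sorted(shared))}")
```

In this made-up plan the two cache tasks are safe to run side by side, while `rename-endpoints` overlaps with both and should run on its own.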

Cursor

Cursor advantages:

IDE-native — Built as a VS Code fork, so full editor experience with syntax highlighting, debugging, extensions
Visual diff interface — See proposed changes inline before accepting, easy to review multi-file edits
Codebase indexing — Indexes your project for fast semantic search and context retrieval
Tab completion — Predictive autocomplete that suggests multi-line changes as you type
Composer mode — Multi-file editing with visual preview of all changes across files
Familiar UX — VS Code users get instant familiarity, existing extensions work
@ mentions — Reference specific files, docs, or symbols with @file syntax
Git worktrees — Agents allow you to edit the same files in parallel, requiring you to evaluate and merge worktrees

The key differentiator is the visual editing experience — you see exactly what will change before it happens, with tight integration into the editor workflow developers already use.

Elite Level: Level 8

At level 8, you are managing and building your own orchestrator. You are on the frontier, automating your workflow.

Spend: $$$$$ | Agents: 10+ quasi-infinity

Announcement blog post
Gastown — Level 8 Tool