When I moved from VS Code and Cursor to Claude Code, I didn't set out to overhaul my entire developer environment. But once you start building orchestrators and pipelines that run for hours, spinning up 5-10 parallel agents, every inefficiency becomes a bottleneck. What started as "let me try this CLI tool" turned into a full rethink of my local development stack.
Here's what changed, why, and what I'd recommend if you're heading down the same path.
The Terminal: iTerm to WezTerm
I started where most Mac developers start: iTerm2. It's fine. But once Claude Code became my primary interface, "fine" wasn't good enough. I evaluated the two leading modern alternatives:
- Ghostty — Fast, minimal, promising. But still early.
- WezTerm — Rust-based, extremely stable, highly configurable via Lua.
I landed on WezTerm. The Rust foundation matters more than you'd think. When you're running 10+ GB of container images alongside multiple Claude Code sessions, terminal stability and memory efficiency aren't nice-to-haves. iTerm's resource usage was noticeable. WezTerm just stays out of the way.
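WezTerm is configured through a single Lua file at `~/.wezterm.lua`. A minimal sketch of what that looks like — the specific font, color scheme, and scrollback values here are my own illustrative choices, not recommendations from WezTerm:

```lua
-- ~/.wezterm.lua — minimal WezTerm configuration sketch
local wezterm = require 'wezterm'
local config = wezterm.config_builder()

config.font = wezterm.font 'JetBrains Mono'  -- any font installed on the system
config.font_size = 13.0
config.color_scheme = 'Catppuccin Mocha'     -- one of the bundled schemes
config.scrollback_lines = 100000             -- generous buffer for long agent runs

return config
```

Because the config is just Lua, you can go far beyond static settings — key tables, per-domain settings, event handlers — without leaving one file.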
Containers: Docker Desktop to OrbStack
This one was painful to learn. Docker Desktop is bloated, and Podman Desktop kept corrupting my virtual hard drive and container images. When you're doing agentic development, you can't afford for your container runtime to be the thing that breaks.
I did deep research and evaluated two replacements:
- Colima — Free, open source, works well. But no GUI. When you're juggling multiple services and just want to glance at what's running, the lack of a visual interface is friction you feel daily.
- OrbStack — Paid, but worth it. Fast, stable, lightweight GUI menu bar app. Drop-in Docker replacement that just works.
I went with OrbStack. The GUI matters. When I'm deep in a Claude Code session and need to quickly check if a service is healthy, I don't want to context-switch to a terminal just to run docker ps. OrbStack gives me multiple ways to see the same information, and on 32 GB of RAM with heavy container workloads, the lower overhead is meaningful.
Voice: Mac Whisper with Push-to-Talk
I tried several transcription tools: SuperWhisper, the built-in macOS dictation, and a few others. I settled on Mac Whisper running locally.
The push-to-talk button was the game changer. It sounds trivial, but having a physical key that activates transcription keeps me focused on articulating what I want. Without it, I'd start clicking around, lose my train of thought, and end up with garbage transcriptions. Push-to-talk forces a mode shift: when I'm talking, I'm only talking. That discipline feeds directly into better prompts, better tickets, and better planning.
Text Expansion: Espanso
Not all of my work runs through my ConnectTheBots pipeline. Sometimes I'm in a browser, sometimes I'm in a chat, sometimes I'm writing a quick prompt from scratch. Espanso gives me text expansion everywhere on the system.
I use it for:
- Common prompt fragments and templates
- Boilerplate for beads tickets
- Repeated configuration snippets
- Quick expansions for things I type dozens of times a day
It's a small tool that removes a surprising amount of friction. The compound effect of saving 10-30 seconds on every repetitive text entry adds up fast when you're creating hundreds of tickets and prompts a week.
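Espanso matches live in a YAML file (e.g. `match/base.yml` in its config directory). A sketch of the kind of matches I mean — the triggers and snippet text here are invented for illustration:

```yaml
# match/base.yml — example Espanso matches (triggers and text are illustrative)
matches:
  # Prompt fragment for code review requests
  - trigger: ":rev"
    replace: "Review this diff for correctness, edge cases, and test coverage:"

  # Boilerplate skeleton for a beads ticket description
  - trigger: ":bead"
    replace: |
      ## Context

      ## Acceptance criteria

      ## Out of scope

  # Today's date, handy for ticket titles
  - trigger: ":today"
    replace: "{{date}}"
    vars:
      - name: date
        type: date
        params:
          format: "%Y-%m-%d"
```

Typing the trigger anywhere on the system expands it in place, which is what makes this work in browsers and chat windows, not just the terminal.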
Rust-Based CLI Replacements
This is the optimization that most people would call unnecessary. I'd argue it's one of the most impactful.
When your agents are processing information, the speed of every underlying system utility matters. I started replacing standard Unix tools with their Rust-based equivalents:
| Task | Old Tool | Replacement | Why |
|---|---|---|---|
| Search file contents | `grep` | `ripgrep` | 5-10x faster, respects .gitignore |
| Search by filename | `find` | `fd` | Faster, saner defaults |
| List directories | `ls` | `lsd` | Git status, better formatting |
| Display files | `cat` | `bat` | Syntax highlighting, line numbers |
| Tail logs | `tail` | `uu-tail` | Uses kqueue, no polling overhead |
You install them with Homebrew, configure your CLAUDE.md to tell the LLM to prefer them, and you're done. It's a drop-in upgrade.
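The CLAUDE.md nudge can be as simple as a short section like this (the wording and heading are mine — there's no canonical format, it's just instructions the model reads):

```markdown
## Preferred CLI tools

These are installed via Homebrew. Use them instead of the classic
equivalents whenever possible:

- `rg` (ripgrep) instead of `grep`
- `fd` instead of `find`
- `lsd` instead of `ls`
- `bat` instead of `cat`
- `uu-tail` instead of `tail`
```

That's the whole integration: the model sees the instruction at session start and reaches for the faster tools on its own.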
Will you notice any single call being faster? No. But across thousands of tool invocations per session, across multiple parallel agents, the cumulative effect is real. This is about reducing friction at every step. When your agents are making hundreds of file searches in a single pipeline run, those milliseconds compound.
Compound Engineering Plugin
The compound-engineering Claude Code plugin deserves its own mention. It provides a thorough, well-defined workflow for agentic development: planning, research, implementation, review, and iteration.
When you're generating hundreds of thousands of lines of code per week, you need structure. Compound engineering gives you that structure without forcing you to build it yourself. The slash commands, the review agents, the parallel execution patterns — it's all there.
Beads: Local Task Management for Agents
I've written about Beads before, but I want to make the case more directly here: Beads, or something like it, is the future of agentic development.
Beads is a local, git-backed issue tracker. Think of it as a local Jira, accessible via CLI, with no authentication, no API keys, no external service to manage. Here's why this matters:
The parallel agent workflow
- Create an epic with tasks and dependencies
- Launch 5-10 parallel agents
- Each agent grabs a beads task, updates its status, does the work
- Agent commits, closes the task, moves on
- Each agent manages its own context window
This is fundamentally more efficient than handing an agent a 500-line PRD and hoping for the best. Each agent gets just the context it needs for its specific piece of work. The task is bite-sized, verifiable, and deterministic.
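The claim-work-close loop above is the same pattern as any shared work queue. Here's a minimal Python sketch of the idea — this is not the Beads CLI itself, and the task IDs and "agent" workers are invented purely to illustrate why parallel agents don't step on each other when each one atomically claims a bite-sized task:

```python
import queue
import threading

# Toy stand-in for a beads backlog: each task is small and independent.
backlog = queue.Queue()
for task_id in ["ctb-101", "ctb-102", "ctb-103", "ctb-104", "ctb-105"]:
    backlog.put(task_id)

completed = []
completed_lock = threading.Lock()

def agent(name: str) -> None:
    """Each 'agent' claims a task, works it, and closes it, until none remain."""
    while True:
        try:
            # Claiming is atomic: no two agents can grab the same task.
            task_id = backlog.get_nowait()
        except queue.Empty:
            return
        # ... do the actual work, with only this task's context ...
        with completed_lock:
            completed.append((name, task_id))  # close: commit and mark done

workers = [threading.Thread(target=agent, args=(f"agent-{i}",)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(f"{len(completed)} tasks closed")
```

In the real workflow the queue is the beads database in your repo, the claim is a status update, and the close is a git commit — but the concurrency property is the same: every task gets picked up exactly once.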
Why not markdown files?
I hear this pushback constantly. "I just use a TODO.md." But consider:
- A markdown file doesn't track status, dependencies, or ownership
- An agent reading a 1,000-line PRD is burning context on information it doesn't need
- You can't run 5 agents against a markdown file without them stepping on each other
- After a month of development, you have hundreds of files to manage with no structure
Why local over cloud?
Jira, Linear, and others are great for teams. But for agentic development:
- Zero latency — the data is right there in the repo
- No API keys to manage
- No authentication overhead
- No network calls during agent execution
- Works offline, works in CI, works everywhere git works
People resist Beads because of tool friction. "I have to learn something new." But the learning curve is shallow and the payoff is immediate. The first time you launch 5 parallel agents and watch them systematically chew through a backlog, you'll wonder why you ever managed work any other way.
The Emerging Problem: Document Sprawl
Here's where I'm still figuring things out. As your agentic workflow matures, you end up creating a lot of artifacts:
- PRDs with user stories and specs
- Deep research outputs
- Architecture documents
- Beads epics and tasks
A hundred or more of these documents per month is not unusual. They're great for search, great for context, but the sheer volume becomes its own management problem.
I've been thinking about whether we need something that sits above Beads: a local database that manages versioned specs, PRDs, research docs, and architectural decisions. Something that can feed bite-sized work into Beads while maintaining the full context tree. The PRD is the source of truth, Beads tasks are the execution units derived from it, and the database maintains the relationship between them.
This doesn't exist yet, at least not in the way I'm imagining it. But I think it's inevitable. As we scale from building features to building systems of agents that build features, the document management layer becomes critical infrastructure.
The Common Thread
Every change I've described comes back to the same principle: reduce friction at every layer of the stack.
- Faster terminal = less resource contention for agents
- Stable container runtime = fewer unexpected failures mid-pipeline
- Push-to-talk transcription = better prompts, better tickets
- Text expansion = less time on repetitive input
- Rust CLI tools = faster agent tool invocations
- Local task management = less latency, less context waste
None of these are revolutionary on their own. But combined, they represent a fundamentally different development environment than what most people are running. And when you're pushing the limits of what a single developer can build with AI agents, these differences compound into a real advantage.
The bottleneck in agentic development isn't the model. It's everything around the model. Optimize that, and you'll be surprised how much more you can ship.