How I Actually Build Software with Claude Code

A year ago I wrote a version of this post listing four AI tools I used daily: Claude, ChatGPT, GitHub Copilot, and Cursor. It read like a product comparison chart. The reality of my workflow today looks nothing like that.

After 1,500+ commits across 2025 and into 2026, the tool list has narrowed and the roles have sharpened. I use three AI systems, each for a specific surface: Claude Code for implementation, Claude Cowork with browser extensions for web-interactive tasks, and GPT as an automated PR reviewer. That’s it.

Claude Code: The Engine

Claude Code is where the work happens. Every feature branch, every test, every PR description — it’s all co-authored with Claude in the terminal.

This isn’t a vague “I use AI to help me code” statement. If you look at my ShopSmith_v2 commit history, nearly every commit carries a Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> line. In January 2026 alone, that repo saw 86 commits and 37 merged PRs — all through Claude Code.

What makes this productive isn’t the model itself — it’s the project context system. Every active repo has a CLAUDE.md file that serves as a living project spec: architecture decisions, coding conventions, phase tracking, implementation examples, and safety rules. When Claude Code starts a session, it reads this file first, which means it’s not generating generic code — it’s working within the project’s actual patterns.

For example, my graph_starz_nov_20 CLAUDE.md is over 900 lines. It includes phase-by-phase implementation guides with code examples, so Claude can copy-customize-test from working patterns rather than generating from scratch. This is the single biggest productivity lever I’ve found: invest in your CLAUDE.md and every session pays dividends.
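I won't reproduce the real file here, but a CLAUDE.md in this spirit might be skeletoned like this — the section names and contents below are illustrative, not the actual graph_starz_nov_20 spec:

```markdown
# Project Spec (illustrative skeleton, not the real file)

## Architecture decisions
- Describe the stack and the one or two load-bearing choices.
- Note anything Claude should never change without asking.

## Coding conventions
- Naming, test layout, error-handling style, with one short example each.

## Phase tracking
- [x] Phase 1: data model
- [ ] Phase 2: rendering — current focus

## Implementation examples
Working snippets for each recurring pattern, so sessions can
copy-customize-test instead of generating from scratch.

## Safety rules
- No destructive migrations, no force pushes, no secrets in code.
```

The point isn't the exact headings — it's that every session starts by reading a document that encodes the project's real patterns.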

I also use .claude/commands/ directories for project-specific workflows — reusable scripts that Claude can execute for common tasks like cleanup, test runs, or deployment prep.
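As a hypothetical example of what one of these command files might contain (Claude Code treats markdown files in .claude/commands/ as reusable slash commands; this specific checklist is invented for illustration):

```markdown
<!-- .claude/commands/pre-deploy.md — hypothetical example -->
Run the pre-deployment checklist:
1. Run the full test suite; stop and report if anything fails.
2. Check for uncommitted changes and leftover debug logging.
3. Build the production bundle and report its size.
4. Summarize everything that changed since the last release tag.
```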

Claude Cowork + Browser Extensions: The Web Layer

Some work can’t happen in a terminal. Publishing an Etsy listing, navigating GitHub’s UI, testing a live demo — these require a browser. That’s where Claude Cowork with the Chrome extension comes in.

The clearest example is PR #53 on ShopSmith_v2: “Claude-in-Chrome assisted Etsy publishing cockpit.” I built a feature where the app renders a structured JSON data contract — a PublishKit — that Claude-in-Chrome reads directly from the page. It then fills Etsy listing fields from that data, handles the copy-paste of descriptions, and stages files for upload.
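The post doesn't publish the real PublishKit schema, so the field names below are assumptions — but the shape of the idea is a typed contract the page renders as JSON and the browser-side Claude reads back:

```typescript
// Hypothetical sketch of a PublishKit contract. Field names are invented
// for illustration; the real ShopSmith_v2 schema is not shown in the post.
interface PublishKit {
  title: string;
  description: string;
  tags: string[];            // Etsy listings allow at most 13 tags
  priceUsd: number;
  files: { filename: string; kind: "image" | "digital" }[];
}

// Example payload the app could embed in the page for Claude-in-Chrome.
const kit: PublishKit = {
  title: "Minimalist Wall Art Print",
  description: "A downloadable A4 print. Instant digital delivery.",
  tags: ["wall art", "printable", "minimalist"],
  priceUsd: 6.5,
  files: [{ filename: "print-a4.pdf", kind: "digital" }],
};
```

A structured contract like this is what makes the browser agent reliable: it fills fields from typed data instead of scraping free-form text out of the UI.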

The key design constraint: Claude never clicks “Publish.” The system enforces a draft-first workflow with explicit human verification. CAPTCHAs, login flows, and file picker dialogs are all handed back to the human. This isn’t about full automation — it’s about eliminating the tedious middle steps while keeping the human on the critical path.

I also used Cowork for the MCP Python SDK contribution — six sessions that spanned repo analysis, maintainer research, implementation, and CI debugging. Multi-session orchestrated work is where Cowork shines over single-shot conversations.

GPT as Automated PR Reviewer

This is the piece that surprises people. I have the ChatGPT Codex Connector set up as a bot on my repos. Every PR gets an automated review from GPT.

On the same ShopSmith PR #53, GPT’s review caught a real security vulnerability. The staging.ts file joined a productId route parameter directly into a filesystem path without validation. Since the staging service calls fs.rmSync(..., { recursive: true, force: true }) to clean up, a crafted request with ../ as the product ID could have deleted directories outside the staging root.

GPT flagged it. The next commit was fix(publish): address PR review feedback — adding a strict regex validation on product IDs and verifying that resolved paths stay within the staging directory before any filesystem operation.
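The patched code isn't shown in the post, but the fix pattern it describes — strict ID validation plus a containment check on the resolved path before any delete — looks roughly like this (a sketch under assumed names, not the actual staging.ts):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Vulnerable pattern: joining a route parameter straight into a path,
// then fs.rmSync(dir, { recursive: true, force: true }). A productId
// of "../.." would aim the recursive delete outside the staging root.

// Assumed ID alphabet for illustration — not the project's real regex.
const PRODUCT_ID_RE = /^[A-Za-z0-9_-]+$/;

function resolveStagingDir(stagingRoot: string, productId: string): string {
  if (!PRODUCT_ID_RE.test(productId)) {
    throw new Error(`invalid product id: ${productId}`);
  }
  const resolved = path.resolve(stagingRoot, productId);
  // Defense in depth: even a regex-clean ID must resolve inside the root.
  if (!resolved.startsWith(path.resolve(stagingRoot) + path.sep)) {
    throw new Error("resolved path escapes staging root");
  }
  return resolved;
}

function cleanupStaging(stagingRoot: string, productId: string): void {
  // Only after both checks is the recursive delete safe to run.
  fs.rmSync(resolveStagingDir(stagingRoot, productId), {
    recursive: true,
    force: true,
  });
}
```

The two checks are deliberately redundant: the regex rejects traversal characters outright, and the resolve-then-prefix check guarantees containment even if the validation rule is later loosened.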

This isn’t theoretical “AI finds bugs” content. It’s a specific path traversal vulnerability found by an automated reviewer on a specific PR, fixed in a specific commit. The multi-tool workflow earned its keep that day.

What Didn’t Stick

I tried Cursor and GitHub Copilot. Both are fine tools, but they didn’t survive contact with a Claude Code workflow. Once your project has a rich CLAUDE.md and Claude can see your entire codebase, the value of line-by-line autocomplete drops. And Cursor’s multi-file editing didn’t outperform Claude Code’s approach of working through a feature branch with full project context.

I also experimented with local models through Ollama. The performance gap was too large for production work. Maybe that changes as open-weight models improve, but for now it’s not close.

Principles (Updated)

These have evolved from the abstract version I wrote a year ago:

Documentation is the source of truth. A well-maintained CLAUDE.md is worth more than a clever prompting strategy. It makes every Claude Code session start from a place of deep project understanding rather than cold inference.

Automate the review layer. Having GPT review every PR catches things I’d miss — not because I’m careless, but because a second perspective at the code level is genuinely valuable. It costs nothing and occasionally catches real bugs.

Different tools for different surfaces. Terminal work is Claude Code. Browser work is Cowork. Review is GPT. Trying to force one tool into all three modes creates friction.

AI assists, humans decide. This hasn’t changed. Every merge is manual. Every publish is manual. The ShopSmith cockpit’s “NEVER click Publish” rule exists because the most dangerous failure mode is an AI taking an irreversible action without human review.

Verify claims, especially your own. The original version of this post listed tools I barely used and described a workflow I didn’t follow. Grounding your process documentation in actual evidence — commit history, PR descriptions, co-author lines — keeps it honest.