I’ve been writing software since 2012. The introduction of capable AI coding assistants in 2022–2023 is the largest change in the texture of day-to-day development work I’ve experienced. Not because they write code for me — they mostly don’t — but because they change the cost structure of certain tasks in ways that compound.

This post is about where I actually find leverage, and where the tool gets in the way.

The Cost Structure Change

The useful framing isn’t “AI writes code” but “AI changes what’s expensive.”

Things that used to be cheap and remain cheap:

  • Typing boilerplate you’ve typed a thousand times
  • Pattern-matching on familiar problems

Things that used to be expensive and are now much cheaper:

| Task | Before AI tools | With AI tools |
| --- | --- | --- |
| First draft of a standard CRUD handler | 10–20 min (muscle memory) | 2–3 min + review |
| Unfamiliar library/API surface exploration | 30–60 min (docs, examples, trial/error) | 5–10 min (ask, iterate) |
| Writing tests for existing code | 20–40 min/function | 5–10 min/function |
| Converting data formats (proto → JSON → SQL schema) | 30 min | 3 min |
| Regex / parser for a known-shape format | 15–30 min | 2 min |
| First draft of an RFC or design doc | 2–4 hours | 30–60 min (rough draft + substantive editing) |
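The regex/parser row is representative of the whole cheap-now category: the shape of the input is known, so the transformation is mechanical. A minimal sketch, assuming a hypothetical timestamped log format (the format, `logLine`, and `parseLine` are illustrative, not from any real system):

```go
package main

import (
	"fmt"
	"regexp"
)

// logLine matches a hypothetical log format such as:
//   2024-01-15T10:03:22Z ERROR payment: connection timeout
var logLine = regexp.MustCompile(`^(\S+)\s+(DEBUG|INFO|WARN|ERROR)\s+([\w-]+):\s+(.*)$`)

// parseLine extracts timestamp, level, component, and message from one
// line, reporting ok=false when the line doesn't match the shape.
func parseLine(s string) (ts, level, component, msg string, ok bool) {
	m := logLine.FindStringSubmatch(s)
	if m == nil {
		return "", "", "", "", false
	}
	return m[1], m[2], m[3], m[4], true
}

func main() {
	ts, level, comp, msg, ok := parseLine("2024-01-15T10:03:22Z ERROR payment: connection timeout")
	fmt.Println(ok, ts, level, comp, msg)
}
```

This is exactly the kind of code where AI-generated output is easy to verify: feed it a few real lines and check the captures.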

Things that remain expensive or are made harder:

| Task | Why AI tools don’t help much |
| --- | --- |
| Debugging subtle concurrency bugs | Requires runtime understanding, not pattern matching |
| Designing systems for new constraints | Novel tradeoffs, no training data for your specific context |
| Understanding why code behaves a certain way in production | Requires observation, not generation |
| Code review for correctness | AI review misses classes of errors humans would catch from context |
| Anything requiring deep domain knowledge | The tool knows general patterns, not your business invariants |

Where I Find Actual Leverage

Exploratory work in unfamiliar codebases. “Explain what this 300-line function does, then tell me what would break if I changed the error handling in the retry loop.” This used to require reading carefully for 20 minutes. Now it’s a 2-minute ask followed by a targeted 5-minute read to verify.

Test generation. I write the logic; the AI drafts the test cases. It’s good at covering the obvious cases (happy path, nil inputs, boundary values) and decent at edge cases I describe. It’s not good at knowing which edge cases matter for my system — I still add those manually. But the 80% it generates correctly saves significant time on what was previously the most tedious part of the development cycle.
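The division of labor above can be made concrete. Here the function and its cases are hypothetical, but the table-driven shape is what an assistant drafts reliably: happy path, boundaries, and degenerate ranges come for free; the system-specific cases you still add by hand.

```go
package main

import "fmt"

// Clamp limits v to the inclusive range [lo, hi].
func Clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

func main() {
	// The obvious cases an assistant covers well; cases that matter
	// for *your* system (overflow near a tenant quota, say) you add.
	cases := []struct {
		name      string
		v, lo, hi int
		want      int
	}{
		{"in range", 5, 0, 10, 5},
		{"below lo", -3, 0, 10, 0},
		{"above hi", 42, 0, 10, 10},
		{"at lower boundary", 0, 0, 10, 0},
		{"at upper boundary", 10, 0, 10, 10},
		{"degenerate lo==hi", 7, 3, 3, 3},
	}
	for _, c := range cases {
		got := Clamp(c.v, c.lo, c.hi)
		fmt.Printf("%-18s got=%d want=%d ok=%v\n", c.name, got, c.want, got == c.want)
	}
}
```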

Format conversion and glue code. Translating a Protobuf schema to a Go struct, converting a REST API response to a different shape for internal use, writing a migration script for a data format change. These are mechanical transformations where pattern matching is sufficient. AI is fast and accurate here.
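A sketch of the REST-reshaping case, with both struct shapes invented for illustration (`apiUser`, `User`, and `fromAPI` are hypothetical names, not a real API):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiUser mirrors a hypothetical external REST response.
type apiUser struct {
	ID        string `json:"id"`
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
	Plan      string `json:"plan"`
}

// User is the internal shape: one display name, a derived paid flag.
type User struct {
	ID   string
	Name string
	Paid bool
}

// fromAPI is the mechanical step: rename, merge, and derive fields.
func fromAPI(a apiUser) User {
	return User{
		ID:   a.ID,
		Name: a.FirstName + " " + a.LastName,
		Paid: a.Plan != "free",
	}
}

func main() {
	raw := `{"id":"u123","first_name":"Ada","last_name":"Lovelace","plan":"pro"}`
	var a apiUser
	if err := json.Unmarshal([]byte(raw), &a); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", fromAPI(a))
}
```

Nothing here requires judgment beyond "which field maps where," which is why pattern matching is sufficient and the output is easy to audit.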

Documentation and RFC first drafts. I write the outline and key decisions; the AI drafts the prose. The draft is always 40–60% of the way to publishable — coherent structure, reasonable sentences, obviously missing the specific context and nuance that I fill in. Closing that remaining gap from a rough draft is faster than writing from a blank page.

Library/API surface mapping. “What are the relevant methods on pgx.Conn for handling transactions that might fail partway through?” Three years ago, this was a docs search, an example scan, and some trial/error. Now it’s a 30-second ask. The answer is often 85% correct; I check the parts that matter.

What Doesn’t Work (And Why People Get Misled)

Asking it to “build X.” Generating a complete feature from a description works if the feature is generic (CRUD, a common middleware pattern, a standard data pipeline). It breaks when the feature has any dependency on your specific system’s conventions, your data model, your error handling patterns, or your business logic. The output looks plausible and requires more effort to audit than to write from scratch with knowledge of the codebase.

Using generated code without understanding it. The failure mode that makes senior engineers nervous about the tooling: junior engineers accepting AI output they don’t understand, getting it through review because it looks reasonable, and creating time bombs. The tool accelerates production; it doesn’t accelerate understanding.

Treating it as correct on technical specifics. LLMs hallucinate API details. “How do I do X with library Y?” gets you a confident, plausible-looking answer that is sometimes subtly wrong about method signatures, behavior, or gotchas. Always verify against the actual docs or source for anything load-bearing.

The Compounding Effect

The leverage is highest for engineers who already know what good looks like. You generate quickly, you recognise quality instantly, you fix the gaps efficiently. The tool multiplies capability.

For engineers learning a domain, the leverage is lower and the risks are higher — generated code looks correct even when it isn’t, and you don’t have the domain knowledge to spot the wrong parts. This isn’t an argument against using the tools; it’s an argument for not mistaking “code was generated quickly” for “the code is correct.”

The productivity gains I’ve seen compound in a specific direction: the breadth of what one engineer can competently work on expands. I work in Go primarily, but I can now contribute meaningfully to Python services, write reasonable infrastructure-as-code in Terraform, and produce working SQL for schemas I don’t know by heart — because the AI handles the syntax and standard patterns while I handle the logic and judgment.

That breadth expansion is the genuinely new thing. Not “the AI does the work.” But “I can do more kinds of work without the per-domain overhead of syntax and API memorisation.”

Implications for How I Approach Problems

Some habits that have changed:

I prototype more aggressively. The cost of “let’s see if this approach works” dropped because standing up a working prototype is faster. I explore more options before committing to one.

I write tests first more often. Tests are now cheap enough to write alongside or before implementation rather than as a finishing step. The psychological barrier (tests are tedious) is lower.

I invest more time in the design document. With code generation faster, the bottleneck shifted to design. A day spent on a solid RFC pays off more than before because implementation catches up faster.

I’ve gotten worse at remembering API surfaces. Why remember sort.Slice vs slices.SortFunc when I can ask? This is fine for productivity and mildly concerning for the “what if the tools go away” scenario. The same tradeoff I’ve always made with IDE autocompletion, but larger.
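The two APIs in that example differ only in comparator shape, which is exactly the kind of detail worth asking for rather than memorising. A side-by-side sketch (the `sortByLen*` helpers are illustrative names; `slices.SortFunc` requires Go 1.21+):

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"sort"
)

// sortByLenOld sorts in place with the pre-1.21 index-based comparator.
func sortByLenOld(xs []string) {
	sort.Slice(xs, func(i, j int) bool { return len(xs[i]) < len(xs[j]) })
}

// sortByLenNew sorts in place with the Go 1.21+ element-based
// comparator, which returns negative/zero/positive like strcmp.
func sortByLenNew(xs []string) {
	slices.SortFunc(xs, func(a, b string) int { return cmp.Compare(len(a), len(b)) })
}

func main() {
	a := []string{"go", "rust", "c"}
	b := []string{"go", "rust", "c"}
	sortByLenOld(a)
	sortByLenNew(b)
	fmt.Println(a, b) // both sorted shortest-first
}
```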

The tools are good. They change the texture of the work. They don’t change what “good engineering” means — the judgment about what to build, how to make it correct, and how to make it maintainable. That part is still on you.