Distributed Consistency Models: What Your Service Actually Guarantees
At the large US tech company, the hardest design review conversations were almost never about which database to use. They were about what consistency guarantee the system needed, and whether the proposed design actually provided it. “Eventually consistent” is not a useful answer to “what does this service guarantee?” It is an umbrella term for a wide range of behaviours, some of which are harmless and some of which can cause correctness bugs in production. ...
AI-Native Development: What It Actually Means to Use These Tools Well
I’ve been writing software since 2012. The introduction of capable AI coding assistants in 2022–2023 is the largest change in the texture of day-to-day development work I’ve experienced. Not because it writes code for me — it mostly doesn’t — but because it changes the cost structure of certain tasks in ways that compound. This post is about where I actually find leverage, and where the tool gets in the way. ...
Staff Engineer or Engineering Manager: On Choosing the Road That Doesn’t Come Back
Somewhere around year eight or nine, the question stops being hypothetical. Both paths are available. The organisation is large enough that both exist as genuine roles, not just titles. You have to actually decide. I’ve been close to this decision twice. The first time, at the startup, it was resolved by circumstance — there was no EM role to take, so staff-track it was. The second time, in a larger organisation, it was a real choice. Here’s the framework I used and what I’d change about it. ...
Building with AI Coding Tools: What Actually Changes and What Doesn't
I’ve been using AI coding assistants heavily since 2023 — first Copilot, then Claude, then a combination. At this point, not having them feels like losing a limb. But the way I use them now is different from how I started, and the difference is mostly about understanding what these tools are good at and building habits that work with their strengths. ...
RAG Systems in Production: What the Tutorials Don't Cover
RAG is architecturally simple: chunk documents, embed the chunks, store them in a vector DB, retrieve the top-k chunks per query, pass the retrieved context to an LLM, return the answer. The demo takes an afternoon. The production system takes months, because “works on the demo documents” is nowhere near “answers correctly 95% of the time across the full document corpus.” This post is about the gap between those two states. ...
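That pipeline can be sketched end to end in a few dozen lines. This is a toy, not a production system: a bag-of-words counter stands in for a learned embedding model, an in-memory list stands in for the vector DB, and the final LLM call is left as a comment. Every name here is illustrative, not a real library API.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a real system uses a learned model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(doc, size=20):
    """Fixed-size word chunks; real chunking is where much of the work goes."""
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

class InMemoryIndex:
    """Stands in for a vector DB: stores (embedding, chunk) pairs."""
    def __init__(self):
        self.rows = []

    def add(self, text):
        self.rows.append((embed(text), text))

    def top_k(self, query, k=3):
        q = embed(query)
        scored = sorted(self.rows, key=lambda r: cosine(r[0], q), reverse=True)
        return [text for _, text in scored[:k]]

index = InMemoryIndex()
for doc in [
    "the billing service retries failed charges after one hour",
    "the search service caches query results for five minutes",
]:
    for c in chunk(doc):
        index.add(c)

context = index.top_k("when are failed charges retried?", k=1)
# A real system would now build a prompt from `context` and call the LLM.
```

The retrieval step here is exact brute-force scoring over every chunk, which is fine for a demo and exactly the kind of thing that stops being fine at corpus scale.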
LLM Integration Patterns for Backend Engineers
LLM integration is a new category of external API call with some specific failure modes that don’t exist in traditional services. The call is slow (100ms–5s) and expensive, non-deterministic, and can fail softly — returning a plausible-looking wrong answer rather than an error code. Getting it right requires the same rigor you’d apply to any critical external dependency, plus some LLM-specific patterns. ...
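Those failure modes suggest one concrete habit: validate the payload so that soft failures surface as errors instead of wrong answers, and retry with backoff like any flaky dependency. A minimal sketch, where `call_llm` is a hypothetical stand-in for a real client and the category schema is invented for illustration:

```python
import json
import time

class LLMSoftFailure(Exception):
    """Raised when the model returns well-formed but invalid output."""

def call_llm(prompt):
    """Hypothetical stand-in for a real LLM client call. Assume it can be
    slow and can return plausible-looking but invalid output."""
    return '{"category": "refund", "confidence": 0.92}'

def classify(prompt, allowed, retries=2):
    """Call the model and enforce an output contract: the response must be
    valid JSON and the category must come from a known set."""
    for attempt in range(retries + 1):
        try:
            raw = call_llm(prompt)
            data = json.loads(raw)  # malformed output becomes a hard error
            if data.get("category") not in allowed:
                raise LLMSoftFailure(f"unknown category: {data.get('category')}")
            return data
        except (json.JSONDecodeError, LLMSoftFailure):
            if attempt == retries:
                raise
            time.sleep(0.01 * 2 ** attempt)  # back off before retrying

result = classify(
    "Classify this support ticket: I want my money back.",
    allowed={"refund", "bug", "question"},
)
```

The point of the `allowed` check is the soft-failure case: a plausible but invalid answer now fails loudly at the boundary instead of propagating downstream.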
Observability at Scale: What 'Good' Looks Like When You Have Too Much Data
At a startup with a dozen services, the observability problem is getting enough signal. You don’t have enough logging, your traces are incomplete, and your metrics dashboards have gaps. You know when something is wrong because a user tells you. At scale, the problem inverts. You have petabytes of logs, hundreds of millions of traces per day, and metrics cardinality so high that naive approaches cause your time-series database to OOM. The engineering challenge is filtering signal from noise, not generating signal. Both problems are real. They require different solutions. ...
Evaluating LLM Applications: Why 'It Looks Good' Is Not Enough
The first LLM feature I shipped was embarrassingly under-tested. I prompted the model, looked at a few outputs, thought “that looks right,” and deployed it. Users found failure modes within hours that I hadn’t imagined, much less tested for. This isn’t unusual. LLM applications have a testing problem that’s distinct from traditional software testing: the output space is too large to enumerate, the failure modes are semantic rather than syntactic, and “correctness” is often subjective. The standard response — “it’s hard, so test less” — produces unreliable products. Here’s what a functional evaluation framework looks like. ...
Cache Design as a Reliability Practice, Not an Optimisation
At the large US tech company, I inherited a service that had a cache. The cache was fast — it served 98% of requests with <1ms latency. The 2% cache misses hit the database, which took 50–200ms. Then the cache cluster had a rolling restart during a traffic spike. For three minutes, the cache hit rate dropped to 30%. The 70% misses all hit the database simultaneously. The database became saturated, latency spiked to 10s, and the service effectively went down — not because the cache was unavailable, but because the system wasn’t designed for cache misses at that rate. This is a cache reliability failure, not a cache performance failure. ...
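One mitigation for this failure mode is request coalescing: when many requests miss the same key at once, only one goes to the backing store and the rest wait for its result. A minimal threaded sketch, assuming the class and function names are illustrative (a production system would pair this with TTLs, load shedding, and a real cache cluster):

```python
import threading
import time

class CoalescingCache:
    """Cache wrapper that collapses concurrent misses for the same key into
    a single backing-store call, so a hit-rate drop doesn't multiply load."""

    def __init__(self, loader):
        self.loader = loader
        self.data = {}
        self.lock = threading.Lock()
        self.inflight = {}  # key -> Event signalling the in-progress load

    def get(self, key):
        with self.lock:
            if key in self.data:
                return self.data[key]
            ev = self.inflight.get(key)
            if ev is None:
                ev = self.inflight[key] = threading.Event()
                leader = True
            else:
                leader = False
        if leader:
            value = self.loader(key)  # only the leader hits the database
            with self.lock:
                self.data[key] = value
                del self.inflight[key]
            ev.set()
            return value
        ev.wait()  # followers wait for the leader's result
        with self.lock:
            return self.data[key]

calls = []

def slow_db_load(key):
    """Stand-in for the 50-200ms database query."""
    calls.append(key)
    time.sleep(0.05)
    return key.upper()

cache = CoalescingCache(slow_db_load)
threads = [threading.Thread(target=cache.get, args=("user:1",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With coalescing, ten simultaneous misses for the same key produce one database query instead of ten, which is the difference between a hit-rate dip and an outage.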
Engineering at Enterprise Scale: What Changes When the System Is Actually Big
I’d worked at organisations ranging from twelve people to four hundred. The new role is at a company with tens of thousands of engineers. The systems are bigger, the coordination surface is larger, and some things I assumed were universal engineering truths turned out to be scale-specific. ...