Project Loom Preview: Virtual Threads and What They Mean for Server Code

Java’s threading model has a fundamental scalability problem: OS threads are expensive. Creating thousands of them consumes gigabytes of stack memory and causes significant scheduling overhead. This is why reactive programming (Netty, Project Reactor, RxJava) became popular — it avoids the thread-per-request model by using event loops and async callbacks. Project Loom, announced in 2017 with early previews arriving in 2018, proposed a different solution: make threads cheap. Virtual threads — JVM-managed threads that are not 1:1 with OS threads — could make the thread-per-request model scalable again. ...
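As a sketch of the thread-per-request idea the teaser describes — using `Thread.ofVirtual()`, the API that eventually shipped in Java 21, not the 2018 preview API, which looked different:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadSketch {
    // Spawns n virtual threads and waits for all of them.
    // Each virtual thread carries a small, growable JVM-managed stack
    // rather than a megabyte-scale OS stack, so large n stays cheap.
    static int runAll(int n) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = Thread.ofVirtual().start(done::incrementAndGet);
        }
        for (Thread t : threads) {
            t.join();
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAll(10_000)); // one virtual thread per "request"
    }
}
```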

May 24, 2018 · 5 min · MW

From Java 8 to Java 11 in a Regulated Environment: What Actually Broke

Java 11 was the first long-term support release after Java 8. Oracle’s announcement that commercial Java 8 support would end pushed the bank’s architecture committee to approve a migration. In theory: update the JDK, update the build files, done. In practice: six months of discovery. This is a frank account of what broke. ...

November 8, 2017 · 5 min · MW

Reading GC Logs Like a Detective

GC logs are always-on, low-overhead diagnostic data that the JVM will produce for you. They tell you the timing, cause, duration, and effect of every collection — if you know how to read them. Most Java engineers can tell you what GC does. Far fewer can look at a GC log and immediately see why the p99 latency spiked at 14:37 last Tuesday. ...
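Logs are the primary tool here; as a minimal complement (my sketch, not from the post), the standard library also exposes per-collector counts and cumulative pause time via `GarbageCollectorMXBean` — a quick sanity check before you dig into the log itself:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

public class GcSummary {
    // One "name: count collections, total ms" line per registered collector.
    static List<String> summarize() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(gc -> String.format("%s: %d collections, %d ms",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        summarize().forEach(System.out::println);
    }
}
```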

April 18, 2017 · 6 min · MW

Threading Models in Java: Which One Does Your System Actually Need?

The move from a small trading firm to a large financial institution meant working with codebases an order of magnitude larger, maintained by dozens of engineers across multiple teams. It also meant encountering the full spectrum of Java threading models in production — some appropriate, some inherited from a different era, and some that were actively causing problems. This is a survey of what those models look like, what they’re good at, and how you tell which one a system needs. ...

November 9, 2016 · 5 min · MW

Heap Dumps and Flight Recorder: Diagnosing JVM Memory Problems in Production

At the large financial institution where I worked from 2016, the JVM services were larger and longer-running than anything I’d dealt with in the previous role. Old generation sizes in the hundreds of gigabytes. Services running for months between restarts. Memory problems that took days or weeks to manifest. The debugging approach that worked in trading — small heaps, frequent restarts, aggressive allocation control — didn’t apply here. You had to diagnose production JVM state without stopping it. ...

August 24, 2016 · 6 min · MW

ZGC and Shenandoah: What Low-Pause GC Means for Trading Systems

In 2015, the state of GC for latency-sensitive Java was: use G1GC, tune it carefully, accept occasional 50–200ms pauses on large heaps, and work around them with off-heap storage and careful allocation management. The conventional wisdom was that sub-10ms GC pauses required small heaps (< 4GB) or near-zero allocation on the hot path. For trading systems with large position caches, this meant either expensive off-heap engineering or living with GC latency spikes. Then the early previews of ZGC (Oracle) and Shenandoah (Red Hat) started circulating. Both claimed sub-millisecond pause times regardless of heap size. The mechanisms differed, but the implications were significant. ...

October 1, 2015 · 5 min · MW

Understanding Safepoints: The JVM Pauses Nobody Talks About

We’d tuned GC to near-perfection. Pause times were sub-millisecond. The p99.9 latency was still spiking to 8ms several times a day, with no GC events anywhere near those spikes. It took three weeks to find the cause: safepoints, specifically biased-lock revocation (a RevokeBias VM operation) triggered by lock patterns in a third-party library. ...

May 27, 2015 · 4 min · MW

Choosing a GC Collector for Low-Latency Java: A Practical Comparison

By mid-2014 we had run CMS, G1, and Parallel GC in production on the same workload, and had evaluated Azul Zing. Here’s what we found and the decision framework that came out of it. ...

August 6, 2014 · 5 min · MW

Off-Heap Memory in Java: sun.misc.Unsafe and Chronicle Map

One of the FX trading desks kept a reference data structure in memory: all live position limits and risk parameters for every currency pair, updated in real time from a risk management system. The structure held about 40GB of data and was read by the trading engine on every price update. Putting 40GB on the Java heap was not an option. Full collections on a 40GB heap pause for seconds, not milliseconds. The solution was off-heap allocation — memory that exists outside the GC’s visibility, managed explicitly by the application. ...
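The post names `sun.misc.Unsafe` and Chronicle Map; as a minimal self-contained sketch of the same idea (my illustration, not the desk's code), a direct `ByteBuffer` allocates storage in native memory that the collector never scans:

```java
import java.nio.ByteBuffer;

public class OffHeapSketch {
    public static void main(String[] args) {
        // allocateDirect reserves native memory outside the Java heap.
        // The ByteBuffer object itself is tiny and heap-resident, but the
        // 1 MB of backing storage is invisible to the garbage collector.
        ByteBuffer limits = ByteBuffer.allocateDirect(1 << 20);

        // Absolute put/get by byte offset: no object headers, no GC cost,
        // but also no type safety — offsets are managed by the application.
        limits.putLong(0, 42L);
        System.out.println(limits.getLong(0)); // prints 42
    }
}
```

Chronicle Map layers a real key-value API over this kind of raw native memory; the trade-off in both cases is that the application, not the GC, owns the layout and the lifetime.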

April 2, 2014 · 6 min · MW

Why Average Latency Is a Lie: HdrHistogram and Measuring What Matters

If someone tells you their system has 2ms average latency, they’ve told you almost nothing useful. A system that delivers 1ms 99% of the time and 100ms 1% of the time has 2ms average latency. So does a system that delivers 2ms every single time. These behave completely differently in production. The problem isn’t measurement frequency — it’s that averages destroy the distribution. ...
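The arithmetic in that paragraph can be checked directly. A plain-array sketch (not HdrHistogram itself) with the hypothetical 99%/1% distribution above:

```java
import java.util.Arrays;

public class AverageLies {
    // Nearest-rank percentile on a sorted copy of the samples.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        double[] latencies = new double[1000];
        Arrays.fill(latencies, 0, 990, 1.0);      // 99% of requests: 1 ms
        Arrays.fill(latencies, 990, 1000, 100.0); // 1% of requests: 100 ms

        double avg = Arrays.stream(latencies).average().orElse(0);
        System.out.printf("avg=%.2f p99=%.0f p99.9=%.0f%n",
                avg, percentile(latencies, 99), percentile(latencies, 99.9));
        // prints: avg=1.99 p99=1 p99.9=100
        // The ~2 ms average says nothing about the 100 ms tail; p99.9 does.
    }
}
```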

November 27, 2013 · 5 min · MW
Available for consulting: Distributed systems · Low-latency architecture · Go · LLM integration & RAG · Technical leadership
hello@turboawesome.win