The service started in 3 seconds in development, but the first live trade after deployment took 800ms instead of the expected sub-10ms. The second trade was fine. We couldn’t reproduce it in load tests.
The culprit was classloading. The trade execution path touched 47 classes that had never been loaded before. Loading them, verifying the bytecode, and running static initialisers took 800ms — once, at first use.
Understanding exactly how that happens is worth the time.
The Classloader Hierarchy
Java loads classes on demand through a delegation hierarchy:
Bootstrap ClassLoader (JVM built-in)
└── loads: java.lang.*, java.util.*, rt.jar / java.base module
Extension/Platform ClassLoader
└── loads: javax.*, jdk.*
Application ClassLoader
└── loads: your classpath, your code
Custom ClassLoaders (if any)
└── OSGi, Spring Boot fat-jar, application servers
When a class is needed, the classloader delegates upward first. The bootstrap loader checks if it can load it. If not, the extension loader tries. If not, the application loader loads it from the classpath.
This means java.util.HashMap is always loaded by the bootstrap loader, regardless of which classloader is in scope. The delegation prevents different versions of core classes from conflicting.
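The delegation chain can be inspected at runtime. This is a minimal sketch, assuming a plain classpath application on a standard JDK; the class name LoaderChain is made up for illustration. It relies on the documented behaviour that getClassLoader() returns null for classes loaded by the bootstrap loader.

```java
import java.util.HashMap;

final class LoaderChain {
    /** Returns a readable name for a loader; null represents the bootstrap loader. */
    static String describe(ClassLoader loader) {
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core classes report a null loader, which is how the JVM represents bootstrap.
        System.out.println("HashMap loaded by: " + describe(HashMap.class.getClassLoader()));

        // Walk upward from the loader of this class through its parents.
        ClassLoader loader = LoaderChain.class.getClassLoader();
        while (loader != null) {
            System.out.println(describe(loader));
            loader = loader.getParent();
        }
        System.out.println("bootstrap");
    }
}
```

The exact loader class names printed depend on the JDK version, so treat them as informational only.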
When Classloading Happens
A class is loaded the first time any of these occur:
- An instance is created (new Foo())
- A static method or field is accessed (Foo.BAR, Foo.doSomething())
- A static initialiser in a class you’re loading references another class
- Explicit: Class.forName("com.example.Foo")
Importantly, class loading does not happen when a class merely appears as a field type or method parameter. Only executing code that actually uses the class triggers it.
RiskChecker is not loaded when OrderRouter is loaded. It’s loaded on the first call to route(). If RiskChecker has a static initialiser that reads config from disk, that disk I/O happens on the first call to route().
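A minimal sketch of that behaviour, assuming hypothetical shapes for OrderRouter and RiskChecker (the fields, method bodies, and event log are illustrative, not the real classes). Static blocks record when each class is initialised, which is the part of first use that is observable from plain Java.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyLoadDemo {
    // Shared event log so the order of initialisation is observable.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for the real RiskChecker.
    static final class RiskChecker {
        static { EVENTS.add("RiskChecker initialised"); }

        boolean approve(long quantity) { return quantity > 0; }
    }

    // Hypothetical stand-in for the real OrderRouter.
    static final class OrderRouter {
        static { EVENTS.add("OrderRouter initialised"); }

        // Declaring the field type does not initialise RiskChecker.
        private RiskChecker checker;

        boolean route(long quantity) {
            if (checker == null) {
                checker = new RiskChecker(); // first active use: initialisation happens here
            }
            return checker.approve(quantity);
        }
    }

    /** Constructs a router, routes one order, and returns the event log. */
    static List<String> run() {
        OrderRouter router = new OrderRouter();
        EVENTS.add("router constructed");
        router.route(100);
        EVENTS.add("first route() returned");
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a first run, the log shows "RiskChecker initialised" only after "router constructed", so any expensive static work in RiskChecker lands inside the first route() call.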
What Classloading Actually Does
Loading a class involves five steps:
1. Loading
Read bytecode from .class file (or JAR)
→ disk I/O, decompression if JAR
2. Verification
Check bytecode is structurally valid, no security violations
→ CPU-bound, can be significant for large classes
3. Preparation
Allocate static fields, set default values (null/0/false)
→ fast
4. Resolution
Resolve symbolic references to actual types
→ triggers loading of referenced classes (recursive!)
5. Initialisation
Execute static initialisers (<clinit> method)
→ runs your static { ... } blocks, static field initialisers
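Step 4, resolution, is the sneaky one. Resolving the references in OrderExecutor loads PriceValidator, whose own references load CurrencyPair and RateLimit, and so on. HotSpot resolves most references lazily, when the code that uses them first runs, so this cascade tends to play out along the first execution of a code path rather than all at once. Either way, one class can pull in dozens of transitive dependencies.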
Step 5 — static initialisation — is where first-call latency often lives. If a class does anything expensive in a static block, that cost is paid by the first caller:
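A minimal sketch of the effect, assuming a hypothetical PriceValidator whose static block does something slow. The 200 ms sleep is an arbitrary stand-in for real work such as reading config or building a lookup table.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class StaticInitCost {
    // Counts how many times the static initialiser below has run.
    static final AtomicInteger INIT_RUNS = new AtomicInteger();

    // Hypothetical class with an expensive static initialiser.
    static final class PriceValidator {
        static {
            INIT_RUNS.incrementAndGet();
            try {
                Thread.sleep(200); // stand-in for config reads or table building
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static boolean isValid(double price) { return price > 0; }
    }

    /** Times a single call to the validator, in nanoseconds. */
    static long timeCallNanos() {
        long start = System.nanoTime();
        PriceValidator.isValid(101.5);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long first = timeCallNanos();  // pays for the static initialiser
        long second = timeCallNanos(); // class is already initialised
        System.out.printf("first: %d ms, second: %d us%n", first / 1_000_000, second / 1_000);
    }
}
```

The initialiser runs exactly once, so only the first caller sees the delay.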
The Classloading Lock
Classloading is synchronised per-class. The JVM holds a lock while initialising a class to prevent duplicate initialisation from concurrent threads. This means:
- The first thread to need class Foo loads and initialises it under the lock
- Other threads that need Foo concurrently block until initialisation is complete
- After loading, the class is cached; subsequent uses don’t acquire the lock
Under high concurrency at startup (many threads all making first-ever calls), classloading becomes a serialisation point. Multiple threads, each needing different classes, each blocking on their respective classloading locks, can create a startup latency spike far worse than any individual class load time.
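The blocking behaviour can be observed with a small experiment. This is a sketch under stated assumptions: SlowInit is a made-up class whose static block sleeps for an arbitrary 300 ms, and several threads touch it at the same moment.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class InitLockDemo {
    // Counts runs of the static initialiser; it should only ever reach 1.
    static final AtomicInteger INIT_RUNS = new AtomicInteger();

    // Hypothetical class with a slow static initialiser.
    static final class SlowInit {
        static {
            INIT_RUNS.incrementAndGet();
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static int value() { return 42; }
    }

    /** Releases all threads at once and returns how long each spent in its call, in ms. */
    static long[] measure(int threads) {
        long[] waitedMs = new long[threads];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                long t0 = System.nanoTime();
                SlowInit.value(); // one thread initialises, the rest block on the init lock
                waitedMs[id] = (System.nanoTime() - t0) / 1_000_000;
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return waitedMs;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(measure(4)));
    }
}
```

On a first run, every thread should report roughly the full initialisation time even though only one of them ran the static block, which is the serialisation effect described above.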
Eager Loading Strategies
For latency-sensitive services where first-call timing matters, force class loading at startup:
Warm up the code paths explicitly:
This triggers loading and initialisation of all transitive dependencies in a controlled, single-threaded context at startup rather than under live load.
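A minimal sketch of such a warm-up, assuming a hypothetical TradePath interface that stands in for the real execution path. The synthetic order values are placeholders, and a real warm-up must be wired so that these orders can never reach a live venue.

```java
final class StartupWarmup {
    /** Hypothetical stand-in for the real trade execution path. */
    interface TradePath {
        void execute(String symbol, long quantity, double price);
    }

    /**
     * Runs the path with synthetic orders so its classes are loaded and
     * initialised before live traffic. Returns the number of successful runs.
     */
    static int warmUp(TradePath path, int iterations) {
        int completed = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                path.execute("WARMUP", 1, 1.0); // synthetic order, illustrative values
                completed++;
            } catch (RuntimeException e) {
                // A failed warm-up run should not abort startup; a real service would log it.
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        int ok = warmUp((symbol, quantity, price) -> { /* call the real path here */ }, 10);
        System.out.println("warm-up runs completed: " + ok);
    }
}
```

Calling this from the main thread before the service starts accepting traffic keeps the loading single-threaded. Classes behind branches the synthetic order does not take will still load lazily later.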
Force-load by class name:
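A minimal sketch, assuming the list of class names is maintained by hand or generated from a -verbose:class run. ClassPreloader is a made-up helper; the three-argument Class.forName with initialize set to true is the standard API that both loads the class and runs its static initialisers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ClassPreloader {
    /** Loads and initialises each named class; returns the names that could not be found. */
    static List<String> preload(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClassPreloader.class.getClassLoader();
        for (String name : classNames) {
            try {
                Class.forName(name, true, loader); // true = run static initialisers now
            } catch (ClassNotFoundException e) {
                missing.add(name); // stale entry, e.g. the class was renamed
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative names; replace with the classes on your own hot path.
        List<String> missing = preload(Arrays.asList(
                "java.util.concurrent.ConcurrentHashMap",
                "com.example.OrderExecutor"));
        System.out.println("not found: " + missing);
    }
}
```

Returning the missing names rather than throwing keeps a stale list from blocking startup, at the cost of needing someone to check the output.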
AppCDS (Application Class Data Sharing):
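Java 10+ allows pre-sharing class metadata so that classloading on subsequent startups skips the verification and resolution steps for cached classes. On JDK 13 and later, an archive can be written when the application exits with -XX:ArchiveClassesAtExit=<file> and reused on the next start with -XX:SharedArchiveFile=<file>.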
For long-running services, AppCDS improves startup time but has no effect on steady-state performance, because a class that is already loaded is never loaded again. For short-lived processes (Lambda, CLI tools), AppCDS can cut startup from seconds to sub-second.
Diagnosing Classloading Problems
-verbose:class logs every class load to stdout:
[Loaded com.example.OrderExecutor from file:/app/app.jar]
[Loaded com.example.RiskChecker from file:/app/app.jar]
[Loaded java.util.concurrent.ConcurrentHashMap$Node from /java/java.base]
This produces enormous output for any real application — pipe to grep for the classes you care about.
For startup profiling, Flight Recorder’s “Class Loading” event category shows each load with timestamps, letting you see which classes are loaded when and which ones are slow.
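For a quick check without a profiler, the JVM's class-loading counters are also available programmatically. This is a minimal sketch using the standard ClassLoadingMXBean; the ClassLoadCounter helper is made up for illustration. The counter is JVM-wide, so loads on other threads are included.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class ClassLoadCounter {
    /** Returns how many classes the JVM loaded while the task ran (all threads included). */
    static long classesLoadedDuring(Runnable task) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long before = bean.getTotalLoadedClassCount();
        task.run();
        return bean.getTotalLoadedClassCount() - before;
    }

    public static void main(String[] args) {
        // Illustrative task; wrap the first call of your own hot path instead.
        Runnable task = () -> new java.util.concurrent.ConcurrentSkipListMap<String, String>();
        System.out.println("first run loaded:  " + classesLoadedDuring(task));
        System.out.println("second run loaded: " + classesLoadedDuring(task));
    }
}
```

A large count on the first run of a code path and a count near zero on the second is a cheap signal that the path has a classloading cost worth warming up.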
The symptom to look for in production: first-request latency significantly higher than p99 steady-state, affecting the first N requests after startup or after a cold deployment where the process was restarted. If the spike is 10–100× the steady-state latency and then never recurs, classloading is almost certainly involved.