Every discussion of distributed systems eventually reaches the question: “can we just wrap this in a transaction?” The answer is technically yes and practically no. Understanding why — and what to do instead — is one of the more important shifts in distributed systems thinking.

What 2PC Promises and Why It Fails

Two-phase commit (2PC) coordinates a transaction across multiple participants:

Coordinator                    Participant A    Participant B
    │                              │                │
    ├─ PREPARE ──────────────────→ │                │
    ├─ PREPARE ───────────────────────────────────→ │
    │                              │                │
    │ ←─ PREPARED ──────────────── │                │
    │ ←─ PREPARED ───────────────────────────────── │
    │                              │                │
    ├─ COMMIT ───────────────────→ │                │
    ├─ COMMIT ────────────────────────────────────→ │
    │                              │                │
    │ ←─ COMMITTED ─────────────── │                │
    │ ←─ COMMITTED ──────────────────────────────── │

During the PREPARE phase, each participant locks the resources needed for the transaction and promises it can commit. During COMMIT, they apply the changes and release locks.
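The two phases can be sketched as a coordinator loop. This is an illustrative sketch, not a real library API: the Participant interface and its prepare/commit/abort methods are hypothetical names standing in for whatever resource managers actually participate.

```java
import java.util.List;

// Hypothetical participant interface: lock on prepare, apply on commit.
interface Participant {
    boolean prepare(String txId); // lock resources, vote yes/no
    void commit(String txId);     // apply changes, release locks
    void abort(String txId);      // discard changes, release locks
}

class TwoPhaseCommitCoordinator {
    // Returns true if the transaction committed on all participants.
    static boolean run(String txId, List<Participant> participants) {
        // Phase 1: PREPARE. A single "no" vote aborts everyone.
        for (Participant p : participants) {
            if (!p.prepare(txId)) {
                participants.forEach(q -> q.abort(txId));
                return false;
            }
        }
        // Phase 2: COMMIT. If the coordinator dies between the phases,
        // prepared participants sit here holding locks until it recovers.
        for (Participant p : participants) {
            p.commit(txId);
        }
        return true;
    }
}
```

The blocking failure mode lives in the gap between the two loops: once a participant has voted yes, only the coordinator can tell it which way the transaction went.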

The problems:

Blocking under failure: if the coordinator fails after sending PREPARE but before sending COMMIT, all participants are stuck holding locks indefinitely, waiting for a COMMIT that never comes. The participants can’t safely abort (they promised to commit) and can’t commit (they haven’t been told to). The system is blocked until the coordinator recovers.

Latency: two round trips (prepare + commit) across network links. For cross-datacenter transactions, this is hundreds of milliseconds. For low-latency systems, this is prohibitive.

Availability vs. consistency trade-off: 2PC chooses consistency over availability. Under network partition, it halts rather than risk inconsistency. For many applications, halting is worse than temporary inconsistency.

The Saga Pattern

A saga is a sequence of local transactions, each of which publishes an event or message that triggers the next step. If a step fails, compensating transactions undo the preceding steps.

For a trade settlement workflow:

Step 1: Debit client account         (compensate: credit account)
Step 2: Credit counterparty account  (compensate: debit account)
Step 3: Update position records      (compensate: revert positions)
Step 4: Send settlement confirmation (no compensation — it's done)

Success flow:
  Step 1 → Step 2 → Step 3 → Step 4 ✓

Failure at Step 3:
  Step 1 → Step 2 → Step 3 FAILS
  ← Compensate Step 2 ← Compensate Step 1 ✓ (rolled back)

Each local transaction commits to its own database. The saga coordinator (or a choreography of events) sequences the steps. No distributed lock, no blocking on coordinator failure.

Choreography (no central coordinator):

Account Service:      execute step 1 → publish "AccountDebited"
Account Service:      on "AccountDebited" → execute step 2 → publish "CounterpartyCredited"
Position Service:     on "CounterpartyCredited" → execute step 3 → publish "PositionsUpdated"
Confirmation Service: on "PositionsUpdated" → execute step 4

Each service listens for the event that triggers its step and publishes the event that triggers the next. If a step fails, it publishes a failure event, and earlier services compensate.
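The wiring can be shown with a toy in-process event bus. EventBus and the event names here are illustrative stand-ins for a real broker such as Kafka, and the handlers just log which step they would execute:

```java
import java.util.*;
import java.util.function.Consumer;

// Toy synchronous event bus: handlers are invoked inline on publish.
class EventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    void subscribe(String eventType, Consumer<String> handler) {
        handlers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }
    void publish(String eventType, String sagaId) {
        for (Consumer<String> h : handlers.getOrDefault(eventType, List.of())) {
            h.accept(sagaId);
        }
    }
}

class ChoreographyDemo {
    // Each service subscribes to its trigger and publishes its completion event.
    static List<String> run(EventBus bus, String sagaId) {
        List<String> log = new ArrayList<>();
        bus.subscribe("AccountDebited",
            id -> { log.add("credit counterparty"); bus.publish("CounterpartyCredited", id); });
        bus.subscribe("CounterpartyCredited",
            id -> { log.add("update positions"); bus.publish("PositionsUpdated", id); });
        bus.subscribe("PositionsUpdated",
            id -> log.add("send confirmation"));
        bus.publish("AccountDebited", sagaId); // step 1's service emits this
        return log;
    }
}
```

Note that no component holds the whole workflow: the sequencing is distributed across the subscriptions, which is the trade-off choreography makes against visibility.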

Orchestration (central coordinator):

class SettlementSaga {
    void execute(Settlement s) {
        int completed = 0;
        try {
            accountService.debit(s.clientId, s.amount);        completed = 1;
            accountService.credit(s.counterpartyId, s.amount); completed = 2;
            positionService.update(s.positions);               completed = 3;
            confirmationService.send(s.confirmationId);
        } catch (Exception e) {
            // Compensate only the steps that completed, in reverse order:
            if (completed >= 3) positionService.revert(s.positions);
            if (completed >= 2) accountService.debit(s.counterpartyId, s.amount);
            if (completed >= 1) accountService.credit(s.clientId, s.amount);
        }
    }
}

Idempotency: The Key to Safe Retries

Sagas require that each step can be retried safely. If the network fails after a service commits but before it responds, the caller will retry. Without idempotency, the step executes twice.

The pattern: accept a client-provided idempotency key with each request. Store (key, result) and return the stored result for duplicates:

@Transactional
public AccountDebitResult debit(String idempotencyKey, String accountId, Money amount) {
    // Check for duplicate:
    Optional<AccountDebitResult> existing = idempotencyStore.get(idempotencyKey);
    if (existing.isPresent()) {
        return existing.get();  // return same result as original
    }

    // Execute and store:
    AccountDebitResult result = executeDebit(accountId, amount);
    idempotencyStore.save(idempotencyKey, result);
    return result;
}

The idempotency key is typically sagaId + stepName. The store is local to the service. The check and save are within the same transaction — so the result is stored atomically with the operation.
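A minimal in-memory sketch of the duplicate check, with an execution counter to make the behavior visible. All names here are illustrative; in production the store is a table in the service's own database so that the check and the business write share one transaction, whereas here computeIfAbsent stands in for that atomicity:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class IdempotentDebitService {
    private final Map<String, String> idempotencyStore = new ConcurrentHashMap<>();
    private final AtomicInteger executions = new AtomicInteger();

    String debit(String idempotencyKey, String accountId, long amount) {
        // Duplicate key: the stored result is returned and the lambda
        // (the business operation) never runs a second time.
        return idempotencyStore.computeIfAbsent(idempotencyKey, k -> {
            executions.incrementAndGet(); // stand-in for executeDebit(...)
            return "debited:" + accountId + ":" + amount;
        });
    }

    int executions() { return executions.get(); }
}
```

A retry with the same key is indistinguishable from the caller's point of view, which is exactly what makes retrying after an ambiguous network failure safe.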

The Outbox Pattern: Reliable Event Publishing

A common saga failure mode: the service commits its local database transaction but fails to publish the event to Kafka/message broker. The saga stalls because the next step never receives the trigger.

The outbox pattern solves this:

@Transactional
void processStep(StepInput input) {
    // Both the business operation and the event go in the same transaction:
    repository.save(updatedEntity(input));
    outboxRepository.save(OutboxEvent.of("StepCompleted", input.sagaId()));
    // If this transaction commits, the event is durably recorded and will
    // eventually be published; if it rolls back, neither is persisted.
}

// A separate process polls the outbox and publishes to Kafka:
@Scheduled(fixedDelay = 100)
void publishOutboxEvents() {
    List<OutboxEvent> events = outboxRepository.findUnpublished();
    for (OutboxEvent event : events) {
        kafka.send(event.topic(), event.payload());
        outboxRepository.markPublished(event.id());
    }
}

The outbox table is in the same database as the business data. The transaction that saves the business change also saves the event — atomically. A separate publisher reads the outbox and publishes to Kafka. At-least-once delivery is guaranteed; the downstream consumer handles duplicates via idempotency keys.
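The relay loop can be demonstrated end to end in memory. This is a sketch under stated assumptions: the List fields stand in for the outbox table and the Kafka producer, and in a real system processStep's saves would run inside one database transaction:

```java
import java.util.ArrayList;
import java.util.List;

class OutboxEvent {
    final long id; final String topic; final String payload;
    boolean published;
    OutboxEvent(long id, String topic, String payload) {
        this.id = id; this.topic = topic; this.payload = payload;
    }
}

class OutboxRelay {
    final List<OutboxEvent> outbox = new ArrayList<>();   // outbox table
    final List<String> brokerLog = new ArrayList<>();     // stand-in for Kafka
    private long nextId = 1;

    // "Business transaction": persist the change and the event together.
    void processStep(String sagaId) {
        // repository.save(updatedEntity) would go here, same transaction
        outbox.add(new OutboxEvent(nextId++, "saga-events", "StepCompleted:" + sagaId));
    }

    // Polling publisher, run separately on a schedule. Re-running it is
    // harmless: already-published rows are skipped.
    void publishOutboxEvents() {
        for (OutboxEvent e : outbox) {
            if (!e.published) {
                brokerLog.add(e.topic + "|" + e.payload); // kafka.send(...)
                e.published = true;                       // markPublished(...)
            }
        }
    }
}
```

If the relay crashes after sending but before marking a row published, the next pass resends it, which is where the at-least-once guarantee (and the need for consumer-side idempotency) comes from.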

When 2PC Is Still Worth It

Distributed transactions are not always the wrong choice. 2PC can be reasonable when:

  • Both participants are within the same data center, on the same database cluster
  • Transaction volumes are low (hundreds per minute, not thousands per second)
  • The blocking failure mode is acceptable (you’ll catch it quickly and restart the coordinator)
  • The alternative (saga with compensation logic) is more complex than the benefit justifies

XA transactions over two databases on the same network are reliable and fast enough for many use cases. The problem is using them across datacenters, across high-latency links, or as a substitute for designing for failure.

The principle: sagas are more complex upfront (compensation logic, idempotency, outbox) but more reliable under failure. 2PC is simpler upfront but has failure modes that block the entire system. At scale and under adversarial failure conditions, the saga approach wins. For low-volume internal workflows with predictable failure modes, 2PC may be entirely adequate.