Laputa

Durable Object Write Coalesce

In the blog post Durable Objects: Easy, Fast, Correct — Choose Three, Kenton introduces two critical implementation concepts of Durable Objects: the input gate and the output gate. These mechanisms work in tandem to address data races and enable seamless in-memory caching, ensuring both correctness and performance.

However, I’m particularly intrigued by another feature Kenton highlighted as a bonus of the output gate implementation: automatic write coalescing.

“First, writes are automatically coalesced. That is, if you perform multiple put() or delete() operations without awaiting them or anything else in between, then the operations are automatically grouped together and stored atomically.”

This raises several questions:

These questions motivated me to dive deeper into the implementation details.

A Naive Approach

The primary goal of the output gate is to hold back outgoing network messages until writes complete; write coalescing is a beneficial side effect.

“Output gates: When a storage write operation is in progress, any new outgoing network messages will be held back until the write has completed. We say that these messages are waiting for the ‘output gate’ to open. If the write ultimately fails, the outgoing network messages will be discarded and replaced with errors, while the Durable Object will be shut down and restarted from scratch.”
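The behavior described in the quote can be sketched as a toy model in plain JavaScript. This is only an illustration under stated assumptions: ToyOutputGate and its methods are hypothetical names, not workerd APIs, and the write outcome is passed in by hand instead of coming from real storage.

```javascript
// Toy output gate: outgoing messages queue up while a write is in flight.
// `ToyOutputGate` and its methods are hypothetical names, not workerd APIs.
class ToyOutputGate {
  constructor() {
    this.locked = false;
    this.held = [];        // messages waiting for the gate to open
    this.delivered = [];   // what the outside world actually sees
  }
  lock() { this.locked = true; }
  send(message) {
    if (this.locked) this.held.push(message);
    else this.delivered.push(message);
  }
  // Called when the in-flight write finishes.
  release(writeSucceeded) {
    for (const message of this.held) {
      // On failure the held messages are dropped and replaced with errors.
      this.delivered.push(writeSucceeded ? message : "error: storage write failed");
    }
    this.held = [];
    this.locked = false;
  }
}

const okGate = new ToyOutputGate();
okGate.lock();                                        // a write starts
okGate.send("200 OK");                                // held, not yet visible
const visibleDuringWrite = okGate.delivered.length;   // 0
okGate.release(true);                                 // write confirmed

const failedGate = new ToyOutputGate();
failedGate.lock();
failedGate.send("200 OK");
failedGate.release(false);                            // write failed

console.log(visibleDuringWrite, okGate.delivered, failedGate.delivered);
```

The model leaves out the restart of the Durable Object on failure; it only shows that nothing leaves the object until the write outcome is known.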

It’s not difficult to think of a simple approach:

  1. When a write operation is detected, lock the output gate immediately.
  2. Register a callback to the write to unlock the gate when the write completes.
  3. Execute the write operation, and the callback will be triggered on completion.
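As a rough sketch, the three steps above might look like the following toy model. All names (OutputGate, naivePut, pending, flushWrites) are hypothetical, and the pending array merely stands in for asynchronous disk I/O. Note that every write ends up in its own commit.

```javascript
// Toy model of the naive approach: every write locks the gate and commits on its own.
// All names here are hypothetical; nothing below is workerd code.
class OutputGate {
  constructor() { this.locks = 0; }
  lock() { this.locks += 1; }
  unlock() { this.locks -= 1; }
  get isOpen() { return this.locks === 0; }
}

const gate = new OutputGate();
const pending = [];   // stand-in for in-flight disk writes
const commits = [];   // one entry per committed transaction

function naivePut(key, value) {
  gate.lock();                        // step 1: lock the gate as soon as a write is seen
  pending.push(() => {                // step 3: the write "executes" later
    commits.push([[key, value]]);     //         each write is its own transaction
    gate.unlock();                    // step 2: the completion callback unlocks the gate
  });
}

function flushWrites() {
  while (pending.length > 0) pending.shift()();
}

naivePut("key1", "value1");
naivePut("key2", "value2");
naivePut("key3", "value3");
const openBeforeFlush = gate.isOpen;  // false: three writes are still in flight
flushWrites();
console.log(commits.length, openBeforeFlush, gate.isOpen);   // prints: 3 false true
```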

However, Workerd uses a more elegant solution that provides the same guarantee and also improves performance by coalescing multiple writes into a single transaction.

Write Detection

The ctx.storage KV store provided to developers is essentially a wrapper around SQLite. Each KV operation becomes a SQL statement that emulates KV behavior. Before executing any SQL statement, Workerd determines whether the statement is read-only. If the statement involves a write operation, a callback is triggered before the SQL execution. This logic resides in the Query::checkRequirements method:

void SqliteDatabase::Query::checkRequirements(size_t size) {
  sqlite3_stmt* statement = getStatement();
  
  KJ_IF_SOME(cb, db.onWriteCallback) {
    if (!sqlite3_stmt_readonly(statement)) {
      cb();
    }
  }
}

Here, db.onWriteCallback is set to ActorSqlite::onWrite, which is registered during the initialization of the ActorSqlite instance. This callback plays a pivotal role in enabling write coalescing.
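The same hook pattern can be sketched in JavaScript. This is a hedged illustration only: isReadOnly is a crude keyword check standing in for sqlite3_stmt_readonly (which inspects the prepared statement, not the SQL text), and the db object and the kv table name are hypothetical.

```javascript
// Hypothetical stand-in for sqlite3_stmt_readonly(): a crude keyword check.
// Real SQLite inspects the prepared statement, not the SQL text.
function isReadOnly(sql) {
  return /^\s*(SELECT|EXPLAIN)\b/i.test(sql);
}

// Mirrors the shape of checkRequirements: notify the callback before a write runs.
function checkRequirements(db, sql) {
  if (db.onWriteCallback !== undefined && !isReadOnly(sql)) {
    db.onWriteCallback();
  }
}

let writeCount = 0;
const db = { onWriteCallback: () => { writeCount += 1; } };

checkRequirements(db, "SELECT value FROM kv WHERE key = ?");   // read: no callback
checkRequirements(db, "INSERT INTO kv VALUES (?, ?)");         // write: callback fires
checkRequirements(db, "DELETE FROM kv WHERE key = ?");         // write: callback fires
console.log(writeCount);   // prints: 2
```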

The onWrite Callback

Here’s a simplified version of the onWrite callback:

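void ActorSqlite::onWrite() {
  // Only the first write of a batch opens a transaction; later writes join it.
  if (currentTxn.is<NoTxn>()) {
    auto txn = kj::heap<ImplicitTxn>(*this);

    // Hold the output gate until the deferred commit below has finished.
    commitTasks.add(outputGate.lockWhile(
        kj::evalLater([this, txn = kj::mv(txn)]() mutable -> kj::Promise<void> {
      txn->commit();

      // Destroy the transaction object before continuing with the commit.
      { auto drop = kj::mv(txn); }

      return commitImpl(kj::mv(precommitAlarmState));
    })));
  }
}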

Internally, Workerd batches writes using a SQLite transaction, specifically an “implicit” transaction. Each ActorSqlite instance maintains a currentTxn variable, which can be in one of three states:

  1. NoTxn: No transaction is currently in progress.
  2. ExplicitTxn: A transaction explicitly initiated by the user via ctx.storage.transaction.
  3. ImplicitTxn: An internal mechanism used to batch writes automatically.

To understand write coalescing, it is enough to focus on ImplicitTxn. The onWrite callback operates as follows:

  1. Transaction in Progress: If a transaction is already active, the write triggering onWrite is added to the current transaction, enabling coalescing. No further action is needed.
  2. No Active Transaction: If no transaction is in progress, a new “implicit” transaction is created to manage the current and any subsequent writes.
  3. Output Gate Locking: The output gate is locked using lockWhile, which ensures the gate remains locked until the associated promise resolves. The returned promise is stored in commitTasks (a task management container) to handle its lifecycle and exception propagation.
  4. Commit Scheduling: A callback is registered via kj::evalLater, which schedules the transaction commit to occur at the end of the current event loop turn.

Notes on kj::evalLater()

The key to coalescing lies in kj::evalLater. The calls to outputGate.lockWhile and kj::evalLater themselves run immediately, so the gate is locked right away, but the lambda passed to kj::evalLater is deferred: KJ queues it as an event to run at the end of the current event loop turn, using breadth-first scheduling.

Because the commit is deferred, every write issued during the same synchronous run of JavaScript joins the same transaction before the commit executes.
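To make the mechanism concrete, here is a minimal simulation in plain JavaScript. It is a sketch under simplifying assumptions, not workerd code: evalLater, runTurn, and ToyActor are hypothetical stand-ins, a plain array plays the role of the KJ event queue, and the gate reopens as soon as the commit callback runs (the real gate waits for the promise returned by commitImpl).

```javascript
// Toy model of onWrite(): the first write opens an implicit transaction and
// defers the commit; later writes in the same turn join it.
// `evalLater`, `runTurn`, and `ToyActor` are hypothetical stand-ins, not workerd APIs.
const eventQueue = [];
function evalLater(fn) { eventQueue.push(fn); }   // deferred, runs in FIFO order
function runTurn() {                              // what the event loop does once JS yields
  while (eventQueue.length > 0) eventQueue.shift()();
}

class ToyActor {
  constructor() {
    this.currentTxn = null;    // null plays the role of NoTxn
    this.gateLocked = false;
    this.commits = [];
  }
  onWrite() {
    if (this.currentTxn !== null) return;   // a transaction is open: just join it
    const txn = [];
    this.currentTxn = txn;
    this.gateLocked = true;                 // lockWhile(...) takes effect immediately
    evalLater(() => {
      this.commits.push(txn);               // commit everything batched so far
      this.currentTxn = null;
      this.gateLocked = false;              // simplification: reopen right after commit
    });
  }
  put(key, value) {
    this.onWrite();                         // the callback fires before the write
    this.currentTxn.push([key, value]);
  }
}

const actor = new ToyActor();
actor.put("key1", "value1");
actor.put("key2", "value2");
actor.put("key3", "value3");
const lockedBeforeTurn = actor.gateLocked;  // true: the commit is still pending
runTurn();
console.log(actor.commits.length, actor.commits[0].length, lockedBeforeTurn, actor.gateLocked);
// prints: 1 3 true false
```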

When Write Coalescing Happens

To validate our theoretical understanding, let’s conduct some simple tests. First, I added debug logs to observe the batching behavior:

diff --git a/src/workerd/io/actor-sqlite.c++ b/src/workerd/io/actor-sqlite.c++
index 22a3fe89b..c50f2454a 100644
--- a/src/workerd/io/actor-sqlite.c++
+++ b/src/workerd/io/actor-sqlite.c++
@@ -218,12 +218,15 @@ void ActorSqlite::onCriticalError(
 }
 
 void ActorSqlite::onWrite() {
+  KJ_LOG(INFO, "Calling onWrite");
   if (currentTxn.is<NoTxn>()) {
+    KJ_LOG(INFO, "Creating implicit transaction");
     auto txn = kj::heap<ImplicitTxn>(*this);
 
     commitTasks.add(outputGate.lockWhile(
         kj::evalLater([this, txn = kj::mv(txn)]() mutable -> kj::Promise<void> {
+      KJ_LOG(INFO, "Committing implicit transaction");
       txn->commit();

The first test case is based on Kenton’s example:

ctx.storage.put("key1", "value1");
ctx.storage.put("key2", "value2");
ctx.storage.put("key3", "value3");

The resulting logs:

Calling onWrite
Creating implicit transaction
Calling onWrite
Calling onWrite
Committing implicit transaction

Here’s what happens:

  1. The first put triggers the creation of a new ImplicitTxn.
  2. The second and third put operations reuse the same ImplicitTxn.
  3. The transaction is committed, batching all three key-value pairs.

The ensureInitialized() Function

Note that the actual logs may include additional entries from ensureInitialized, which is also a write operation invoked before any storage operation to set up the SQLite database if it hasn’t been initialized yet.

Take put for example:

void SqliteKv::put(KeyPtr key, ValuePtr value) {
  ensureInitialized().stmtPut.run(key, value);
}

In this case, the actual logs for the first test case might look like this:

Calling onWrite // ensureInitialized
Creating implicit transaction
Calling onWrite // 1st put
Calling onWrite // 2nd put
Calling onWrite // 3rd put
Committing implicit transaction

This is a one-time setup and doesn’t affect write coalescing. For clarity, the logs shown elsewhere in this post omit these entries.

Error Resilience

I initially assumed that an error thrown after the writes would roll them back. It does not: once evalLater() has scheduled the commit task, a subsequent JavaScript error cannot cancel it.

ctx.storage.put("key1", "value1");
ctx.storage.put("key2", "value2");
throw new Error("Intentional error");

The logs confirm this:

Calling onWrite
Creating implicit transaction
Calling onWrite
Committing implicit transaction
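The same outcome can be reproduced in a toy model. This is a sketch, not workerd code: put, handler, and the queue are hypothetical names, and an array stands in for the KJ event queue. The point is that the deferred commit is already queued by the time the exception is thrown.

```javascript
// Toy model: a throw after the writes does not cancel the already-scheduled commit.
// `put`, `handler`, and the queues are hypothetical stand-ins, not workerd APIs.
const eventQueue = [];
const commits = [];
let currentTxn = null;                    // null plays the role of NoTxn

function put(key, value) {
  if (currentTxn === null) {
    const txn = [];
    currentTxn = txn;
    eventQueue.push(() => {               // deferred commit, like kj::evalLater
      commits.push(txn);
      currentTxn = null;
    });
  }
  currentTxn.push([key, value]);
}

function handler() {
  put("key1", "value1");
  put("key2", "value2");
  throw new Error("Intentional error");
}

let caught = null;
try {
  handler();
} catch (err) {
  caught = err.message;                   // the runtime would report this to the caller
}
while (eventQueue.length > 0) eventQueue.shift()();   // the event loop turn still runs

console.log(caught, commits.length, commits[0].length);   // prints: Intentional error 1 2
```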

When Write Coalescing Breaks

As Kenton noted, write coalescing only occurs “without awaiting them or anything else in between”. Let’s test this scenario:

ctx.storage.put("key1", "value1");
let response = await fetch("https://api.example.com");
ctx.storage.put("key2", "value2"); // Will this coalesce?

The logs confirm Kenton’s observation:

Calling onWrite
Creating implicit transaction
Committing implicit transaction
Calling onWrite
Creating implicit transaction
Committing implicit transaction

Breaking it down:

  1. The fetch call performs an asynchronous network operation.
  2. The await suspends JavaScript execution and hands control back to the KJ event loop.
  3. The KJ event loop processes pending tasks, including the evalLater() commit callback.
  4. The transaction commits, and currentTxn resets to NoTxn.
  5. The subsequent put creates a new transaction.

In the worst case, coalescing degrades to a single write per transaction, which hurts performance. To maximize batching, group write operations together with no await or other asynchronous operation between them.
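The breakdown above can be replayed in a small simulation. As before, this is a hedged sketch with hypothetical names: runTurn models the moment an await on real I/O hands control back to the event loop, and an array stands in for the KJ event queue.

```javascript
// Toy model: an event loop turn between two writes splits them into two transactions.
// `put`, `runTurn`, and the queues are hypothetical stand-ins, not workerd APIs.
const eventQueue = [];
const commits = [];
let currentTxn = null;                    // null plays the role of NoTxn

function put(key, value) {
  if (currentTxn === null) {
    const txn = [];
    currentTxn = txn;
    eventQueue.push(() => {               // deferred commit, like kj::evalLater
      commits.push(txn);
      currentTxn = null;
    });
  }
  currentTxn.push([key, value]);
}

function runTurn() {
  while (eventQueue.length > 0) eventQueue.shift()();
}

put("key1", "value1");
runTurn();                                // models `await fetch(...)` yielding to the event loop
put("key2", "value2");
runTurn();                                // end of the handler

const sizes = commits.map((txn) => txn.length);
console.log(commits.length, sizes.join(","));   // prints: 2 1,1
```

Moving the first runTurn() call after the second put would produce a single commit holding both writes, which is the grouping the paragraph above recommends.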

Awaiting Storage Operations

Interestingly, awaiting a storage operation does not break write coalescing:

ctx.storage.put("key1", "value1");
await ctx.storage.put("key2", "value2");
ctx.storage.put("key3", "value3"); // Will this coalesce?

The logs:

Calling onWrite
Creating implicit transaction
Calling onWrite
Calling onWrite
Committing implicit transaction

This happens because storage operations return already-resolved promises. The await still schedules the rest of the function as a microtask, but V8 drains the microtask queue before control returns to the KJ event loop, so no event loop handoff occurs and the evalLater() callback remains pending.

However, relying on this behavior is discouraged. That storage operations return already-resolved promises is an implementation detail; it may change in a future update (for example, if backpressure is introduced), so application logic should not depend on it.

Summary and Best Practices

From exploring Durable Object write coalescing implementation, we learned several key points:

How it works:

Important behaviors to be aware of:

Practical recommendations:

// Good: Group related writes together
ctx.storage.put("user", userData);
ctx.storage.put("session", sessionData);
ctx.storage.put("timestamp", Date.now());
// All three writes get coalesced

// Then do async I/O (Workers' fetch needs an absolute URL; example.com is a placeholder)
await fetch("https://example.com/analytics");

// Avoid: Mixing storage writes with async operations
ctx.storage.put("user", userData);
await fetch("https://example.com/analytics");  // This breaks coalescing
ctx.storage.put("session", sessionData);  // Separate transaction

Key takeaways:

Understanding these implementation details helps you write more efficient Durable Object applications and debug coalescing issues when they arise.
