Durable Object Write Coalesce
In the blog post Durable Objects: Easy, Fast, Correct — Choose Three, Kenton introduces two critical implementation concepts of Durable Objects: the input gate and the output gate. These mechanisms work in tandem to address data races and enable seamless in-memory caching, ensuring both correctness and performance.
However, I’m particularly intrigued by another feature Kenton highlighted as a bonus of the output gate implementation: automatic write coalescing.
“First, writes are automatically coalesced. That is, if you perform multiple put() or delete() operations without awaiting them or anything else in between, then the operations are automatically grouped together and stored atomically.”
This raises several questions:
- How is automatic write coalescing implemented under the hood?
- Are there any caveats developers should be aware of?
- How can we leverage this feature effectively when building Durable Object-based applications?
These questions motivated me to dive deeper into the implementation details.
A Naive Approach
The ultimate goal is to implement the output gate, which blocks outgoing network messages until writes complete; write coalescing is a beneficial side effect.
Output gates: When a storage write operation is in progress, any new outgoing network messages will be held back until the write has completed. We say that these messages are waiting for the “output gate” to open. If the write ultimately fails, the outgoing network messages will be discarded and replaced with errors, while the Durable Object will be shut down and restarted from scratch.
It’s not difficult to think of a simple approach:
- When a write operation is detected, lock the output gate immediately.
- Register a callback to the write to unlock the gate when the write completes.
- Execute the write operation, and the callback will be triggered on completion.
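The three steps above can be sketched as a small TypeScript model. This is not Workerd code: NaiveGate, NaiveStorage, and completeWrites are hypothetical names, and the sketch only illustrates that this approach pays for one gate lock and one commit per write.

```typescript
// Hypothetical model of the naive approach: one gate lock and one commit per write.
class NaiveGate {
  private locks = 0;
  lock(): void { this.locks++; }
  unlock(): void { this.locks--; }
  get isOpen(): boolean { return this.locks === 0; }
}

class NaiveStorage {
  commits = 0;
  private data = new Map<string, unknown>();
  // Completion callbacks for writes that are still in flight.
  private pending: Array<() => void> = [];
  private gate: NaiveGate;

  constructor(gate: NaiveGate) { this.gate = gate; }

  put(key: string, value: unknown): void {
    this.gate.lock();               // 1. lock the gate as soon as a write is detected
    this.pending.push(() => {       // 2. register a completion callback
      this.data.set(key, value);
      this.commits++;               //    every write is committed on its own
      this.gate.unlock();
    });
  }

  // 3. Simulate the storage layer finishing every in-flight write.
  completeWrites(): void {
    for (const done of this.pending.splice(0)) done();
  }
}

const naiveGate = new NaiveGate();
const naiveStorage = new NaiveStorage(naiveGate);
naiveStorage.put("a", 1);
naiveStorage.put("b", 2);
naiveStorage.put("c", 3);
const gateOpenWhileWriting = naiveGate.isOpen; // false: outgoing messages are held back
naiveStorage.completeWrites();
const naiveCommits = naiveStorage.commits;     // 3: one commit per write
```

The gate behaves correctly in this model, but three writes cost three separate commits, which is the overhead that coalescing avoids.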
However, Workerd employs a more clever and elegant solution that not only achieves the desired functionality but also optimizes performance by coalescing multiple writes into a single transaction.
Write Detection
The ctx.storage KV store provided to developers is essentially a wrapper around SQLite. Each KV operation becomes a SQL statement that emulates KV behavior. Before executing any SQL statement, Workerd determines whether the statement is read-only. If the statement involves a write operation, a callback is triggered before the SQL execution. This logic resides in the Query::checkRequirements method.
The callback invoked there, db.onWriteCallback, is set to ActorSqlite::onWrite, which is registered during the initialization of the ActorSqlite instance. This callback plays a pivotal role in enabling write coalescing.
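As a rough illustration of that hook, here is a TypeScript model, not the actual C++ from Workerd. ModelDatabase, its run method, and the kv table are hypothetical, and the regex check is only a stand-in for whatever read-only detection Workerd really performs.

```typescript
// Hypothetical model: invoke a hook before any statement that is not read-only.
type WriteCallback = () => void;

class ModelDatabase {
  onWriteCallback: WriteCallback | undefined;
  // Records the hook invocations and executed statements in order.
  trace: string[] = [];

  // Crude stand-in for real read-only analysis (an assumption of this sketch):
  // only statements that start with SELECT count as read-only.
  private isReadOnly(sql: string): boolean {
    return /^\s*select\b/i.test(sql);
  }

  run(sql: string): void {
    if (!this.isReadOnly(sql) && this.onWriteCallback) {
      this.onWriteCallback();       // fires before the statement executes
    }
    this.trace.push("exec: " + sql);
  }
}

const modelDb = new ModelDatabase();
modelDb.onWriteCallback = () => { modelDb.trace.push("onWrite"); };
modelDb.run("SELECT value FROM kv WHERE key = 'a'");
modelDb.run("INSERT INTO kv VALUES ('a', 1)");
// modelDb.trace now holds the SELECT, then "onWrite", then the INSERT.
```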
The onWrite Callback
Internally, Workerd batches writes using a SQLite transaction, specifically an “implicit” transaction. Each ActorSqlite instance maintains a currentTxn variable, which can be in one of three states:
- NoTxn: No transaction is currently in progress.
- ExplicitTxn: A transaction explicitly initiated by the user via ctx.storage.transaction.
- ImplicitTxn: An internal mechanism used to batch writes automatically.
For understanding write coalescing, focusing on ImplicitTxn suffices. The onWrite callback operates as follows:
- Transaction in Progress: If a transaction is already active, the write triggering onWrite is added to the current transaction, enabling coalescing. No further action is needed.
- No Active Transaction: If no transaction is in progress, a new “implicit” transaction is created to manage the current and any subsequent writes.
- Output Gate Locking: The output gate is locked using lockWhile, which ensures the gate remains locked until the associated promise resolves. The returned promise is stored in commitTasks (a task management container) to handle its lifecycle and exception propagation.
- Commit Scheduling: A callback is registered via kj::evalLater, which schedules the transaction commit to occur at the end of the current event loop turn.
Notes on kj::evalLater()
The key to coalescing lies in kj::evalLater. While outputGate.lockWhile and kj::evalLater execute immediately, the lambda passed to kj::evalLater is deferred. kj::evalLater schedules that lambda, and with it the commit, as a KJ event to run at the end of the current event loop turn using breadth-first scheduling.
This deferred execution ensures that multiple writes occurring in the same JavaScript execution context get batched into a single transaction before the commit executes.
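The onWrite steps and the deferred commit can be modeled in a few lines of TypeScript. This is a simplified sketch under stated assumptions, not Workerd's implementation: ModelLoop stands in for the KJ event loop, ModelActorSqlite for ActorSqlite, a counter replaces outputGate.lockWhile, and the commit is treated as instantaneous.

```typescript
// Hypothetical model of the implicit transaction; not Workerd code.
type Task = () => void;

// Stand-in for the KJ event loop: evalLater() only queues a task, and the
// task runs when the loop gets control via runTurn().
class ModelLoop {
  private queue: Task[] = [];
  evalLater(task: Task): void { this.queue.push(task); }
  runTurn(): void {
    for (const task of this.queue.splice(0)) task();
  }
}

class ModelActorSqlite {
  gateLocks = 0;                               // > 0 models a locked output gate
  committed: string[][] = [];                  // one entry per committed transaction
  private currentTxn: string[] | undefined;    // undefined models NoTxn
  private loop: ModelLoop;

  constructor(loop: ModelLoop) { this.loop = loop; }

  put(key: string): void {
    this.onWrite();
    this.currentTxn!.push(key);                // the write joins the open transaction
  }

  private onWrite(): void {
    if (this.currentTxn) return;               // transaction in progress: nothing to do
    this.currentTxn = [];                      // start an "implicit" transaction
    this.gateLocks++;                          // models outputGate.lockWhile(...)
    this.loop.evalLater(() => {                // the commit is deferred, not run here
      this.committed.push(this.currentTxn ?? []);
      this.currentTxn = undefined;             // back to NoTxn
      this.gateLocks--;                        // gate reopens once the commit is done
    });
  }
}

const modelLoop = new ModelLoop();
const modelActor = new ModelActorSqlite(modelLoop);
modelActor.put("a");
modelActor.put("b");
modelActor.put("c");
const committedBeforeTurn = modelActor.committed.length; // 0: nothing committed yet
modelLoop.runTurn();
// modelActor.committed is now [["a", "b", "c"]]: three writes, one transaction.
```

Only the first put schedules a commit. The later writes find a transaction already open and simply join it, which is the whole trick.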
When Write Coalescing Happens
To validate our theoretical understanding, let’s conduct some simple tests. First, I added debug logs to observe the batching behavior.
The first test case is based on Kenton’s example: three put calls issued back to back with no await in between.
The resulting logs show the following:
- The first put triggers the creation of a new ImplicitTxn.
- The second and third put operations reuse the same ImplicitTxn.
- The transaction is committed, batching all three key-value pairs.
The ensureInitialized() Function
Note that the actual logs may include additional entries from ensureInitialized, which is also a write operation invoked before any storage operation to set up the SQLite database if it hasn’t been initialized yet. Take put for example: it calls ensureInitialized before performing its own write, so the logs for the first test case may begin with extra entries for that initialization write.
This is a one-time setup and doesn’t affect the write coalescing behavior, so for the sake of clarity we can ignore these logs in our analysis.
Error Resilience
I initially assumed that errors after writes would roll them back. However, I overlooked that once the commit task is scheduled by evalLater(), subsequent JavaScript errors cannot cancel it. In a test that throws an error right after issuing its writes, the logs confirm this: the transaction still commits.
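A minimal TypeScript model of this behavior follows. It is not Workerd code: the deferred array stands in for tasks queued by kj::evalLater, and modelPut, requestHandler, and the other names are hypothetical.

```typescript
// Hypothetical model: a commit that is already scheduled survives a later exception.
const deferred: Array<() => void> = [];   // stand-in for tasks queued by kj::evalLater
const committedBatches: string[][] = [];
let openTxn: string[] | undefined;        // undefined models NoTxn

function modelPut(key: string): void {
  if (!openTxn) {
    openTxn = [];
    // The commit is scheduled on the first write; nothing below ever removes it.
    deferred.push(() => {
      committedBatches.push(openTxn ?? []);
      openTxn = undefined;
    });
  }
  openTxn.push(key);
}

function requestHandler(): void {
  modelPut("a");
  modelPut("b");
  throw new Error("boom");                // thrown after the writes were issued
}

let caughtMessage = "";
try {
  requestHandler();
} catch (err) {
  caughtMessage = (err as Error).message;
}
// Control returns to the event loop, which runs the deferred commit.
for (const task of deferred.splice(0)) task();
// committedBatches is [["a", "b"]] even though requestHandler() threw.
```

If the writes must be undone on failure, the error has to be raised before they are issued, or the work needs an explicit transaction.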
When Write Coalescing Breaks
As Kenton noted, write coalescing only occurs “without awaiting them or anything else in between”. Let’s test this scenario by placing an await fetch() between put calls.
The logs confirm Kenton’s observation. Breaking it down:
- The fetch call performs an asynchronous network operation.
- The await suspends JavaScript execution and hands control back to the KJ event loop.
- The KJ event loop processes pending tasks, including the evalLater() commit callback.
- The transaction commits, and currentTxn resets to NoTxn.
- The subsequent put creates a new transaction.
In extreme cases, write coalescing can degrade to a single write per transaction, which hurts performance. To maximize batching, group write operations together without await or other asynchronous operations in between.
Awaiting Storage Operations
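Interestingly, awaiting a storage operation does not break write coalescing. In a test that awaits each put, the logs still show the writes committed in a single transaction.
This happens because storage operations return already-resolved promises. The code after such an await resumes as a microtask, and the microtask queue drains before control returns to the KJ event loop, so no event loop handoff occurs and the evalLater() callback remains pending.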
However, relying on this behavior is discouraged. Storage operations returning resolved promises is an implementation detail that may change in future updates (for example, if backpressure mechanisms are introduced), which makes it unreliable as a basis for application logic.
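The difference between awaiting a resolved promise and awaiting real I/O can be reproduced on an ordinary JavaScript event loop. In this sketch, which is a model and not Workerd code, setTimeout(0) stands in for kj::evalLater because timers run only after the microtask queue has drained. TimerBackedStore, fakeIo, and runScenario are hypothetical names.

```typescript
// Hypothetical model on the real JS event loop; setTimeout(0) plays kj::evalLater.
class TimerBackedStore {
  commits: string[][] = [];
  private txn: string[] | undefined;     // undefined models NoTxn

  put(key: string): Promise<void> {
    if (!this.txn) {
      this.txn = [];
      setTimeout(() => {                 // deferred commit
        this.commits.push(this.txn ?? []);
        this.txn = undefined;
      }, 0);
    }
    this.txn.push(key);
    return Promise.resolve();            // already resolved, as described above
  }
}

// Stand-in for real I/O such as fetch(): resolves only after a trip
// through the event loop.
const fakeIo = (): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, 0));

async function runScenario(yieldBetweenWrites: boolean): Promise<string[][]> {
  const store = new TimerBackedStore();
  await store.put("a");                  // resumes as a microtask, before any timer
  if (yieldBetweenWrites) await fakeIo();  // the pending commit timer fires here
  await store.put("b");
  await fakeIo();                        // let the last commit timer fire
  return store.commits;
}
```

Under these assumptions, runScenario(false) resolves to a single batch containing both keys, while runScenario(true) resolves to two batches of one key each.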
Summary and Best Practices
From exploring Durable Object write coalescing implementation, we learned several key points:
How it works:
- Write coalescing is achieved by deferring transaction commits using kj::evalLater
- Multiple writes in the same JavaScript execution context get batched automatically
- Once the commit is scheduled, JavaScript errors won’t cancel it
Important behaviors to be aware of:
- Awaiting already-resolved promises preserves coalescing because control never returns to the event loop
- Any operation that yields to the event loop (like await fetch()) breaks coalescing
Practical recommendations and key takeaways:
- Group related storage operations without intervening await calls
- Don’t rely on storage operations returning resolved promises, since this may change
- Be aware that once writes are queued, they will commit even if errors occur later
- Use explicit transactions when you need guaranteed atomicity across async boundaries
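The takeaways above could translate into application code along these lines. This is a sketch, not a definitive pattern: KvStorageLike is a hypothetical structural type that mirrors only the put, delete, and transaction calls of ctx.storage, and saveSession, saveWithLookup, the key names, and the lookup parameter are made up for illustration.

```typescript
// Minimal structural type covering only the calls used below; the real
// Durable Object storage API is larger.
interface KvStorageLike {
  put(key: string, value: unknown): Promise<void>;
  delete(key: string): Promise<boolean>;
  transaction<T>(closure: (txn: KvStorageLike) => Promise<T>): Promise<T>;
}

// Writes issued back to back with nothing awaited in between, so they can coalesce.
function saveSession(
  storage: KvStorageLike,
  userId: string,
  token: string,
): Promise<unknown> {
  const writes = [
    storage.put(`session:${userId}`, token),
    storage.put(`lastSeen:${userId}`, Date.now()),
    storage.delete(`pendingLogin:${userId}`),
  ];
  return Promise.all(writes);   // await once, after every write has been issued
}

// When async work has to sit between the writes, an explicit transaction
// keeps them atomic instead of depending on implicit coalescing.
async function saveWithLookup(
  storage: KvStorageLike,
  lookup: () => Promise<string>,   // hypothetical async dependency, e.g. a fetch()
): Promise<void> {
  await storage.transaction(async (txn) => {
    await txn.put("status", "pending");
    const result = await lookup();
    await txn.put("result", result);
  });
}
```

Inside a Durable Object these functions would be called with this.ctx.storage. The first avoids relying on storage promises being already resolved, because nothing is awaited until all writes have been issued.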
Understanding these implementation details helps you write more efficient Durable Object applications and debug coalescing issues when they arise.