Recovery
Matching engine recovery and snapshots
Implemented package:
packages/matching-engineFiles:
packages/matching-engine/src/recovery.ts
packages/matching-engine/recovery.test.ts
packages/runtime/src/production-workers.tsProblem
The orderbook lives in memory. If the matching engine crashes, open orders would be lost unless the engine can rebuild book state.
Implemented Recovery Strategy
- Periodically serialize the open orderbook.
- Write the snapshot to a temporary file.
- Atomically rename the temporary file to the market snapshot path.
- Store metadata:
- market
- engine sequence
- last Redis stream id
- created time
- On restart, load the latest snapshot.
- Replay engine events after the snapshot sequence.
Snapshot Shape
{
"market": "BTC-PERP",
"engineSequence": 100,
"lastRedisStreamId": "1710000000000-0",
"createdAt": 1700000000000,
"orderBook": {
"market": "BTC-PERP",
"sequence": 100,
"bids": [],
"asks": []
}
}Each price level stores FIFO order snapshots with enough detail to restore:
{
"priceTicks": 650000,
"totalQtyLots": 10,
"orders": [
{
"orderId": "order-1",
"userId": "user-1",
"market": "BTC-PERP",
"side": "buy",
"type": "limit",
"qtyLots": 10,
"remainingQtyLots": 10,
"priceTicks": 650000,
"status": "OPEN",
"timeInForce": "GTC",
"reduceOnly": false,
"postOnly": false,
"createdAt": 1700000000000,
"sequence": 12
}
]
}Replay Events
Replay currently mutates book state for:
order.rested
order.cancelled
trade.executedNo-op replay events:
order.accepted
order.rejected
order.cancel_rejected
order.expiredUsage
import {
createStoredSnapshot,
FileSnapshotStore,
recoverOrderBookFromSnapshot,
} from "matching-engine";
const store = new FileSnapshotStore("./snapshots");
await store.write(
createStoredSnapshot({
orderBook,
lastRedisStreamId: "1710000000000-0",
createdAt: Date.now(),
}),
);
const snapshot = await store.readLatest("BTC-PERP");
if (snapshot) {
const recovered = recoverOrderBookFromSnapshot({
snapshot,
eventsAfterSnapshot,
});
}Redis Worker Responsibility
The production matching worker wires the recovery helper to Redis Streams:
- Read snapshot metadata.
XRANGE/read stream entries afterlastRedisStreamId.- Decode engine events.
- Pass them to
recoverOrderBookFromSnapshot. - Start normal command consumption after replay completes.
When SNAPSHOT_INTERVAL_MS is configured, the production matching worker writes
fresh snapshots and inserts snapshot_metadata rows after successful command
processing.