docs(tree): state root task (#14400)

Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>
Alexey Shekhirin
2025-02-12 12:28:35 +00:00
committed by GitHub
parent 11eac03f00
commit ac7b5959fb
11 changed files with 403 additions and 0 deletions

@ -0,0 +1,26 @@
flowchart TD
subgraph EngineTask[Engine]
Block
-->|Execute transactions sequentially| Execute[Execute transaction]
--> CollectStateUpdates[Collect all accounts and storage slots that were modified]
end
subgraph TransactionThread[Prewarming thread]
Prewarm[Execute transaction on top of previous block]
--> CollectPrefetchTargets[Collect all accounts and storage slots that were modified]
end
subgraph StateRootTask[State Root Task thread]
StateRootMessage::PrefetchProofs
StateRootMessage::StateUpdate
StateRootMessage::FinishedStateUpdates
StateRootMessage::RootCalculated
end
newPayloadRequest[engine_newPayload request] --> Block
Block -->|Start prewarming each transaction in a separate thread| Prewarm
CollectPrefetchTargets --> StateRootMessage::PrefetchProofs
CollectStateUpdates --> StateRootMessage::StateUpdate
Execute -->|All transactions finished executing| StateRootMessage::FinishedStateUpdates
StateRootMessage::RootCalculated
--> newPayloadResponse[engine_newPayload response]

Binary file not shown (size: 134 KiB).

@ -0,0 +1,26 @@
flowchart TD
subgraph MultiProofManager
ParallelProof@{ shape: processes, label: "Start thread with ParallelProof::spawn" }
PendingProofRequests[List of pending proof requests]
subgraph MultiProofManagerCompletion[on_calculation_complete]
HasPendingProofs{{Has pending multiproof requests?}}
end
subgraph MultiProofManagerSpawn[spawn_or_queue]
ProofTargetsCondition{{Proof targets not empty?}}
-->|Not empty, MultiProofTargets| MultiProofManagerLimitReached{{Max in-flight proofs limit reached?}}
end
end
subgraph StateRootTask[StateRootTask]
StateRootMessage::EmptyProof
StateRootMessage::ProofCalculated
end
MultiProofManagerLimitReached -->|Yes, push to pending requests| PendingProofRequests
MultiProofManagerLimitReached -->|No| ParallelProof
HasPendingProofs <--> PendingProofRequests
HasPendingProofs -->|Yes| ParallelProof
ParallelProof --> StateRootMessage::ProofCalculated
ProofTargetsCondition -->|Empty| StateRootMessage::EmptyProof

Binary file not shown (size: 194 KiB).

@ -0,0 +1,21 @@
flowchart TD
subgraph SparseTrieTask[run_sparse_trie]
SparseTrieUpdate([SparseTrieUpdate channel])
SparseTrieUpdate --> SparseTrieUpdateAccumulate[Accumulate updates until the channel is empty]
SparseTrieUpdateAccumulate
--> SparseTrieReveal[Reveal multiproof in Sparse Trie]
--> SparseTrieStateUpdate[Update Sparse Trie with new state]
--> SparseTrieStorageRoots[Calculate sparse storage trie roots]
--> SparseTrieUpdateBelowLevel[Calculate sparse trie hashes below certain level]
SparseTrieUpdateBelowLevel --> SparseTrieUpdateClosed{{Is SparseTrieUpdate channel closed?}}
SparseTrieUpdateClosed -->|Yes| SparseTrieRoot[Calculate sparse trie root]
SparseTrieUpdateClosed -->|No| SparseTrieUpdate
end
subgraph StateRootTask
Incoming[Incoming SparseTrieUpdate messages]
StateRootMessage::RootCalculated
end
Incoming --> SparseTrieUpdate
SparseTrieRoot --> StateRootMessage::RootCalculated

Binary file not shown (size: 236 KiB).

@ -0,0 +1,35 @@
flowchart TD
classDef revealed stroke:green,stroke-width:4px
subgraph Reveal2[0x00010 revealed]
R[Root Branch Node<br/>0x]
B1[Branch Node<br/>0x0]:::revealed
E1[Extension Node<br/>0x00]:::revealed
E2[Extension Node<br/>0x1]
B2[Branch Node<br/>0x0001]:::revealed
L1[Leaf Node<br/>0x00010]:::revealed
L2[Leaf Node<br/>0x10010]
H1[Hash<br/>0x01]:::revealed
H2[Hash<br/>0x00011]:::revealed
R -->|0| B1
R -->|1| E2
B1 -->|0| E1
B1 -->|1| H1
E1 -->|01| B2
B2 -->|0| L1
B2 -->|1| H2
E2 -->|0010| L2
end
subgraph Reveal1[0x10010 revealed]
R1R[Root Branch Node<br/>0x]
R1E2[Extension Node<br/>0x1]:::revealed
R1L2[Leaf Node<br/>0x10010]:::revealed
R1R -->|1| R1E2
R1E2 -->|0010| R1L2
end
subgraph Empty
ER[Root Branch Node<br/>0x]
end

Binary file not shown (size: 110 KiB).

@ -0,0 +1,44 @@
flowchart TD
subgraph StateRootTaskMessages[State Root Task messages]
StateRootMessage::StateUpdate
StateRootMessage::PrefetchProofs
StateRootMessage::EmptyProof
StateRootMessage::ProofCalculated
StateRootMessage::FinishedStateUpdates
end
subgraph StateRootTask[State Root Task thread]
DeduplicateProofTargets[Deduplicate proof targets according to the list of already fetched proofs]
GenerateProofTargets[Generate proof targets from state update]
--> DeduplicateProofTargets
NewProof[New proof calculated]
-->|Add new proof| ProofSequencer
--> EndCondition1
ProofSequencer --> ProofSequencerCondition{{Has sequential proofs?}}
EndCondition1{{All updates processed?}}
--> EndCondition2{{All pending proofs requested?}}
--> EndCondition3{{All proofs finished processing?}}
end
subgraph SparseTrieTask[Sparse Trie thread]
SparseTrieUpdate([SparseTrieUpdate channel])
end
subgraph MultiProofManager[MultiProofManager]
MultiProofCompletion[on_calculation_complete]
MultiProofSpawn[spawn_or_queue]
end
StateRootMessage::PrefetchProofs --> DeduplicateProofTargets
StateRootMessage::StateUpdate --> GenerateProofTargets
DeduplicateProofTargets -----> MultiProofSpawn
StateRootMessage::EmptyProof --> NewProof
StateRootMessage::ProofCalculated --> NewProof
NewProof ---> MultiProofCompletion
ProofSequencerCondition -->|Yes, send multiproof and state update| SparseTrieUpdate
StateRootMessage::FinishedStateUpdates --> EndCondition1
EndCondition3 -->|Close SparseTrieUpdate channel| SparseTrieUpdate

Binary file not shown (size: 151 KiB).

@ -0,0 +1,251 @@
# State Root Calculation for Engine Payloads
The heart of Reth is the Engine, which is responsible for driving the chain forward.
Each time it receives a new payload ([engine_newPayloadV4](https://github.com/ethereum/execution-apis/blob/main/src/engine/prague.md#engine_newpayloadv4)
at the time of writing this document), it:
1. Performs a series of validations.
2. Executes the block contained in the payload.
3. Calculates the [MPT](https://ethereum.org/en/developers/docs/data-structures-and-encoding/patricia-merkle-trie/)
root of the new state.
4. Compares the root with the one received in the block header.
5. If the roots match, considers the block valid.
This document describes the lifecycle of a payload with a focus on state root calculation,
from the moment the payload is received, to the moment we have a new state root.
We will look at the following components:
- [Engine](#engine)
- [State Root Task](#state-root-task)
- [MultiProof Manager](#multiproof-manager)
- [Sparse Trie Task](#sparse-trie-task)
## Engine
![Engine](./mermaid/engine.mmd.png)
It all starts with the `engine_newPayload` request coming from the [Consensus Client](https://ethereum.org/en/developers/docs/nodes-and-clients/#consensus-clients).
We extract the block from the payload, and eventually pass it to the `EngineApiTreeHandler::insert_block_inner`
method which executes the block and calculates the state root.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/mod.rs#L2359-L2362
Let's walk through the steps involved in the process.
First, we spawn the [State Root Task](#state-root-task) thread, which will receive the updates from
execution and calculate the state root.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/mod.rs#L2449-L2458
Then, we do two things with the block:
1. Start prewarming each transaction in a separate thread ("Prewarming thread" on the above diagram).
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/mod.rs#L2490-L2507
- Each transaction is optimistically executed in parallel with the others on top of the previous block,
but the results are not committed to the database.
- All accounts and storage slots that were accessed are cached in memory, so that the actual execution
can use them instead of going to the database.
- All modified accounts and storage slots are sent as `StateRootMessage::PrefetchProofs`
to the [State Root Task](#state-root-task).
- Some transactions will fail, because they require the previous transactions to be executed first.
It doesn't matter, because we only care about optimistically prewarming the accounts and storage slots
that are accessed, and transactions will be executed in the correct order later anyway.
2. Execute transactions sequentially.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/mod.rs#L2523
- Transactions are executed one after another. Accounts and storage slots accessed during the execution
are looked up in the cache from the previous prewarming step.
- All modified accounts and storage slots are sent as `StateRootMessage::StateUpdate`
to the [State Root Task](#state-root-task).
- When all transactions are executed, the `StateRootMessage::FinishedStateUpdates` is sent
to the [State Root Task](#state-root-task).
Eventually, the Engine will receive the `StateRootMessage::RootCalculated` message from
the [State Root Task](#state-root-task) thread, and send the `engine_newPayload` response.
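The message flow described above can be sketched with a standard library channel. This is a minimal illustration, not Reth's implementation: the `StateRootMessage` variant names come from this document, while `Tx`, `Touched`, `touched`, `drive_block`, and all payload types are hypothetical. The sketch also joins all prewarming threads before the sequential pass, which is a simplification.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc::Sender;
use std::thread;

/// Hypothetical stand-ins for account addresses and storage slot keys.
type Address = u64;
type Slot = u64;
/// Modified accounts mapped to their modified storage slots.
type Touched = HashMap<Address, HashSet<Slot>>;

/// Variant names follow the document; the payload types are assumptions.
#[derive(Debug, PartialEq)]
enum StateRootMessage {
    PrefetchProofs(Touched),
    StateUpdate(Touched),
    FinishedStateUpdates,
}

/// Hypothetical transaction, reduced to the slots it modifies.
struct Tx {
    writes: Vec<(Address, Slot)>,
}

/// Collects the accounts and storage slots a transaction modifies.
fn touched(tx: &Tx) -> Touched {
    let mut out = Touched::new();
    for (address, slot) in &tx.writes {
        out.entry(*address).or_default().insert(*slot);
    }
    out
}

/// Sends the messages the Engine would send for one block.
fn drive_block(txs: &[Tx], to_state_root: Sender<StateRootMessage>) {
    // Prewarming: each transaction runs on its own thread and reports
    // what it modified as prefetch targets.
    thread::scope(|scope| {
        for tx in txs {
            let sender = to_state_root.clone();
            scope.spawn(move || {
                sender
                    .send(StateRootMessage::PrefetchProofs(touched(tx)))
                    .unwrap();
            });
        }
    });
    // Sequential execution: one state update per transaction, in order.
    for tx in txs {
        to_state_root
            .send(StateRootMessage::StateUpdate(touched(tx)))
            .unwrap();
    }
    // Marks the end of state updates for this block.
    to_state_root
        .send(StateRootMessage::FinishedStateUpdates)
        .unwrap();
}
```

The receiving side would hold the matching `Receiver<StateRootMessage>` and handle each variant in a loop, which is the role of the State Root Task described next.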
## State Root Task
![State Root Task](./mermaid/state-root-task.mmd.png)
State Root Task is a component responsible for receiving the state updates from the [Engine](#engine),
issuing requests for generating proofs to the [MultiProof Manager](#multiproof-manager),
updating the sparse trie using the [Sparse Trie Task](#sparse-trie-task),
and finally sending the state root back to the [Engine](#engine).
At its core, it's a state machine that receives messages from other components, and handles them accordingly.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L726
When the State Root Task is spawned, it also spawns the [Sparse Trie Task](#sparse-trie-task) in a separate thread.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L542-L544
### Generating proof targets
State root calculation in the [Sparse Trie Task](#sparse-trie-task) relies on:
1. Revealing nodes in the trie according to [MPT (Merkle Patricia Trie) proofs](https://docs.chainstack.com/docs/deep-dive-into-merkle-proofs-and-eth-getproof-ethereum-rpc-method).
- Revealing means adding the nodes from the proof to the Sparse Trie structure.
See [example](#revealing-example) for a diagram.
2. Updating the trie according to the state updates received from executing the transactions.
Let's look at the first two messages on the diagram: `StateRootMessage::StateUpdate`
and `StateRootMessage::PrefetchProofs`. They are sent from the previous [Engine](#engine) step,
and first used to form the proof targets.
Proof targets are a list of accounts and storage slots that we send to
the [MultiProof Manager](#multiproof-manager) to generate the MPT proofs.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/trie/common/src/proofs.rs#L20-L21
Before sending them, we first deduplicate the list of targets according to a list of proof targets
that were already fetched.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1022-L1028
This deduplication step is important, because if two transactions modify the same account or storage slot,
we only need to fetch the MPT proof once.
Then, the proof targets are passed to the [`MultiProofManager::spawn_or_queue`](#multiproof-manager) method.
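The deduplication step can be sketched as a set difference against the targets that were already fetched. The names `ProofTargets` and `dedup_targets`, and the use of plain integers for addresses and slots, are assumptions made for this illustration; the linked code is the source of truth.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-ins for hashed addresses and storage slot keys.
type Address = u64;
type Slot = u64;
/// Accounts mapped to the storage slots that need a proof.
type ProofTargets = HashMap<Address, HashSet<Slot>>;

/// Returns only the targets that are not yet in `fetched`,
/// and records them in `fetched` so later calls skip them.
fn dedup_targets(fetched: &mut ProofTargets, requested: ProofTargets) -> ProofTargets {
    let mut fresh = ProofTargets::new();
    for (address, slots) in requested {
        let account_is_new = !fetched.contains_key(&address);
        let known = fetched.entry(address).or_default();
        // `insert` returns true only for slots that were not known before.
        let new_slots: HashSet<Slot> =
            slots.into_iter().filter(|slot| known.insert(*slot)).collect();
        // Keep the account if it is new, or if it has new slots.
        if account_is_new || !new_slots.is_empty() {
            fresh.insert(address, new_slots);
        }
    }
    fresh
}
```

If two transactions touch the same account and slot, the second call returns an empty map for that target, which corresponds to the empty proof targets case handled by `spawn_or_queue`.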
### Sequencing calculated proofs
When the [MultiProof Manager](#multiproof-manager) finishes calculating the proof, it sends
a message back to the State Root Task. It can be either:
1. `StateRootMessage::EmptyProof` if the deduplication of proof targets resulted in an empty list.
2. `StateRootMessage::ProofCalculated(proof, state)` with the MPT proof calculated for the targets,
along with the state update that the proof was generated for.
On any message, we call the [`MultiProofManager::on_calculation_complete`](#multiproof-manager) method
to signal that the proof calculation is finished.
Some proofs can arrive earlier than others, even though they were requested later. It depends on the number
of proof targets, and also some non-determinism in the database caching.
This is a problem, because the proofs must be sent
to the [Sparse Trie Task](#sparse-trie-task) in the order in which they were requested. To guarantee this,
we add every new proof to a `ProofSequencer`.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L666-L672
`ProofSequencer` acts in the following way:
1. Each proof has an associated "sequence number" that determines the original order of state updates.
2. When the proof is calculated, it's added to the `ProofSequencer` with the sequence number
and state update associated with it.
3. If the `ProofSequencer` has a consecutive sequence of proofs without gaps in sequence numbers, it returns this sequence.
Once the `ProofSequencer` returns a sequence of proofs,
we send them along with the state updates to the [Sparse Trie Task](#sparse-trie-task).
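The three rules above can be sketched as a small reordering buffer. This is an illustration under assumptions, not the linked implementation: the field names, the `add` and `has_pending` methods, and the generic payload `T` (standing in for a proof plus its state update) are hypothetical.

```rust
use std::collections::BTreeMap;

/// Buffers out-of-order items and releases them in sequence order.
struct ProofSequencer<T> {
    /// Sequence number of the next item that may be released.
    next_to_deliver: u64,
    /// Items that arrived early, keyed by sequence number.
    pending: BTreeMap<u64, T>,
}

impl<T> ProofSequencer<T> {
    fn new() -> Self {
        Self { next_to_deliver: 0, pending: BTreeMap::new() }
    }

    /// Adds an item and returns the consecutive run that is now ready.
    /// The returned vector is empty while a gap remains.
    fn add(&mut self, sequence: u64, item: T) -> Vec<T> {
        // Items older than the delivery cursor were already released.
        if sequence >= self.next_to_deliver {
            self.pending.insert(sequence, item);
        }
        let mut ready = Vec::new();
        while let Some(next) = self.pending.remove(&self.next_to_deliver) {
            ready.push(next);
            self.next_to_deliver += 1;
        }
        ready
    }

    /// True while some proofs are still waiting for an earlier one.
    fn has_pending(&self) -> bool {
        !self.pending.is_empty()
    }
}
```

With this shape, a proof with sequence number 1 that arrives first is held back, and it is released together with proof 0 as soon as that one arrives.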
### Finishing the calculation
Once all transactions are executed, the [Engine](#engine) sends a `StateRootMessage::FinishedStateUpdates` message
to the State Root Task, marking the end of receiving state updates.
Every time we receive a new proof from the [MultiProof Manager](#multiproof-manager), we also check
the following conditions:
1. Are all updates received? (`StateRootMessage::FinishedStateUpdates` was sent)
2. Is `ProofSequencer` empty? (no proofs are pending for sequencing)
3. Are all proofs that were sent to the [`MultiProofManager::spawn_or_queue`](#multiproof-manager) finished
calculating and were sent to the [Sparse Trie Task](#sparse-trie-task)?
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L935-L944
When all conditions are met, we close the `SparseTrieUpdate` channel to the [Sparse Trie Task](#sparse-trie-task),
signaling that no proofs or state updates are coming anymore, and the state root calculation should be finished.
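The end conditions can be sketched as a predicate over a few counters, with channel closing modeled by dropping the sender half of the channel to the Sparse Trie Task (the "Close SparseTrieUpdate channel" edge in the State Root Task diagram). `Progress`, `StateRootTaskSketch`, and all field names are hypothetical and only illustrate the checks listed above.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical bookkeeping for the three end conditions.
#[derive(Default)]
struct Progress {
    /// Set when `FinishedStateUpdates` has been received.
    finished_state_updates: bool,
    /// Proofs still waiting inside the sequencer.
    sequencer_pending: usize,
    /// Proof calculations handed to the MultiProof Manager.
    proofs_requested: u64,
    /// Proofs already forwarded to the Sparse Trie Task.
    proofs_delivered: u64,
}

impl Progress {
    fn is_done(&self) -> bool {
        self.finished_state_updates
            && self.sequencer_pending == 0
            && self.proofs_delivered == self.proofs_requested
    }
}

struct StateRootTaskSketch<T> {
    progress: Progress,
    /// `None` once the channel has been closed.
    to_sparse_trie: Option<Sender<T>>,
}

impl<T> StateRootTaskSketch<T> {
    /// Drops the sender once every condition holds.
    /// Returns true if the channel is closed after the call.
    fn maybe_finish(&mut self) -> bool {
        if self.progress.is_done() {
            // Dropping the only sender closes the channel for the receiver.
            self.to_sparse_trie = None;
        }
        self.to_sparse_trie.is_none()
    }
}
```

On the receiving side, a closed and drained `std::sync::mpsc` channel makes `recv` return an error, which gives the consumer a natural signal to compute the final root.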
## MultiProof Manager
![MultiProof Manager](./mermaid/multiproof-manager.mmd.png)
The MultiProof Manager is a component responsible for generating MPT proofs
and sending them back to the [State Root Task](#state-root-task).
### Spawning new proof calculations
The entrypoint is the `spawn_or_queue` method.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L355-L357
It has the following responsibilities:
1. On empty proof targets, immediately send `StateRootMessage::EmptyProof` to the [State Root Task](#state-root-task).
2. If the maximum number of concurrent proof calculations is reached, push the proof request to the pending queue.
- Maximum concurrency is determined as `NUM_THREADS / 2 - 2`.
- For a system with 64 threads, the maximum number of concurrent proof calculations will be `64 / 2 - 2 = 30`.
3. If we can spawn a new proof calculation thread, spawn it using [`ParallelProof`](https://github.com/paradigmxyz/reth/blob/09a6aab9f7dc283e42fd00ce8f179542f8558580/crates/trie/parallel/src/proof.rs#L85),
and send `StateRootMessage::ProofCalculated` to the [State Root Task](#state-root-task) once it's done.
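The concurrency limit from step 2 is simple arithmetic, sketched below. The formula `NUM_THREADS / 2 - 2` is taken from this document; the function names, the clamp to a minimum of one, and the use of `std::thread::available_parallelism` as the thread count are assumptions of this sketch and may differ from what Reth does.

```rust
/// `num_threads / 2 - 2`, as stated in the document.
/// Clamping to at least 1 is an assumption, added so that
/// machines with few threads can still calculate proofs.
fn max_concurrent_proofs(num_threads: usize) -> usize {
    (num_threads / 2).saturating_sub(2).max(1)
}

/// Hypothetical helper that derives the limit from the host.
fn max_concurrent_proofs_for_host() -> usize {
    let threads = std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    max_concurrent_proofs(threads)
}
```

For 64 threads, `max_concurrent_proofs(64)` gives 30, matching the example in the list above.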
### Exhausting the pending queue
To exhaust the pending queue from step 2 of `spawn_or_queue` described above,
the [State Root Task](#state-root-task) calls into another method `on_calculation_complete` every time
a proof is calculated.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L379-L387
Its main purpose is to take the next request from the pending queue, if there is one, and spawn a new
proof calculation thread for it, in the same way as step 3 of the `spawn_or_queue` method described above.
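The interplay between `spawn_or_queue` and `on_calculation_complete` can be sketched as a bounded in-flight counter with a FIFO queue. This is a simplified model under assumptions: the method names come from this document, while `Outcome`, the fields, and the `spawned` log (which stands in for actually starting a `ParallelProof` thread) are hypothetical.

```rust
use std::collections::VecDeque;

/// What `spawn_or_queue` decided for a request (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// No targets: an empty proof is reported immediately.
    EmptyProof,
    /// A calculation was started.
    Spawned,
    /// The limit was reached: the request waits in the queue.
    Queued,
}

struct MultiProofManager<T> {
    max_concurrent: usize,
    inflight: usize,
    pending: VecDeque<Vec<T>>,
    /// Stand-in for spawned threads: records started requests.
    spawned: Vec<Vec<T>>,
}

impl<T> MultiProofManager<T> {
    fn new(max_concurrent: usize) -> Self {
        Self { max_concurrent, inflight: 0, pending: VecDeque::new(), spawned: Vec::new() }
    }

    fn spawn_or_queue(&mut self, targets: Vec<T>) -> Outcome {
        if targets.is_empty() {
            return Outcome::EmptyProof;
        }
        if self.inflight >= self.max_concurrent {
            self.pending.push_back(targets);
            return Outcome::Queued;
        }
        self.spawn(targets);
        Outcome::Spawned
    }

    /// Called when a spawned calculation has finished.
    fn on_calculation_complete(&mut self) {
        self.inflight = self.inflight.saturating_sub(1);
        // Start the oldest queued request, if any.
        if let Some(next) = self.pending.pop_front() {
            self.spawn(next);
        }
    }

    fn spawn(&mut self, targets: Vec<T>) {
        self.inflight += 1;
        self.spawned.push(targets);
    }
}
```

Keeping the queue FIFO means requests start in the order they were submitted, although they may still finish out of order, which is why the State Root Task sequences the results.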
## Sparse Trie Task
The Sparse Trie component is the heart of the new state root calculation logic.
### Sparse Trie primer
- The state trie of Ethereum is very big (150GB+), and we cannot realistically fit it into memory.
- What if instead of loading the entire trie in memory,
we only load the parts that were modified during the block execution (i.e. make the trie "sparse")?
- Such modified parts will have nodes that will be modified,
and nodes that are needed only for calculating the hashes.
- Essentially, this is the same idea as [MPT proofs](https://docs.chainstack.com/docs/deep-dive-into-merkle-proofs-and-eth-getproof-ethereum-rpc-method)
that have only partial information about the sibling nodes, if these nodes aren't part of the
requested path.
- When updating the trie, we first reveal the nodes using the MPT proofs, and then add/update/remove the leaves,
along with the other nodes that need to be modified in the process of leaf update.
#### Revealing Example
![Sparse Trie](./mermaid/sparse-trie.mmd.png)
1. Empty
- Sparse Trie has no revealed nodes, and an empty root
2. `0x10010` revealed
- Child of the root branch node under the nibble `1` is revealed, and it's an extension node placed on the path `0x1`.
- Child of the extension node at path `0x1` with the extension key `0010` is revealed, and it's a leaf node placed on the path `0x10010`.
3. `0x00010` revealed
- Child of the root branch node under the nibble `0` is revealed, and it's a branch node placed on the path `0x0`.
- Child of the branch node at path `0x0` under the nibble `1` is revealed, and it's a hash node placed on the path `0x01`.
- Child of the branch node at path `0x0` under the nibble `0` is revealed, and it's an extension placed on the path `0x00`.
- Child of the extension node at path `0x00` with the extension key `01` is revealed, and it's a branch node placed on the path `0x0001`.
- Child of the branch node at path `0x0001` under the nibble `1` is revealed, and it's a hash node placed on the path `0x00011`.
- Child of the branch node at path `0x0001` under the nibble `0` is revealed, and it's a leaf node placed on the path `0x00010`.
For the implementation details, see [crates/trie/sparse/src/trie.rs](https://github.com/paradigmxyz/reth/blob/09a6aab9f7dc283e42fd00ce8f179542f8558580/crates/trie/sparse/src/trie.rs).
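The revealing example above can be modeled with a map from nibble paths to nodes. This is a heavily simplified sketch and not the structure used in `crates/trie/sparse`: the `Node` enum carries no hashes or values, hash-only children are modeled as separate `Node::Hash` entries, and the `reveal` signature is hypothetical.

```rust
use std::collections::HashMap;

/// A path in the trie as a sequence of nibbles, e.g. `[1, 0, 0, 1, 0]`.
type Path = Vec<u8>;

/// Simplified node kinds; real nodes also carry hashes and values.
#[derive(Debug, PartialEq)]
enum Node {
    Branch,
    Extension { key: Vec<u8> },
    Leaf,
    /// Only the hash of this subtree is known; its content is not revealed.
    Hash,
}

#[derive(Default)]
struct SparseTrie {
    nodes: HashMap<Path, Node>,
}

impl SparseTrie {
    /// Adds the nodes of one proof to the trie.
    /// A revealed node is never replaced by a hash placeholder,
    /// but a hash placeholder can be replaced by a revealed node.
    fn reveal(&mut self, proof: Vec<(Path, Node)>) {
        for (path, node) in proof {
            let keep_existing =
                matches!(self.nodes.get(&path), Some(existing) if *existing != Node::Hash);
            if !keep_existing {
                self.nodes.insert(path, node);
            }
        }
    }
}
```

Revealing `0x10010` and then `0x00010` with the nodes listed in the example leaves nine entries in the map, one per node in the last diagram panel.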
### Sparse Trie updates
![Sparse Trie Task](./mermaid/sparse-trie-task.mmd.png)
The messages to the sparse trie are sent from the [State Root Task](#state-root-task),
and consist of the proof that needs to be revealed, and a list of updates that need to be applied.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L66-L74
We do not reveal the proofs and apply the updates immediately,
but instead accumulate them until the messages channel is empty, and then reveal and apply in bulk.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L991-L994
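The accumulation step can be sketched with a standard library channel: block for the first message, then drain whatever else is already queued without blocking. The function name `next_batch` and the generic message type are assumptions of this sketch; the linked code may structure the loop differently.

```rust
use std::sync::mpsc::Receiver;

/// Waits for one update, then collects every update that is
/// already queued. Returns `None` once the channel is closed
/// and fully drained.
fn next_batch<T>(updates: &Receiver<T>) -> Option<Vec<T>> {
    // Blocks until a message arrives or all senders are dropped.
    let first = updates.recv().ok()?;
    let mut batch = vec![first];
    // `try_recv` fails when the channel is empty or closed,
    // which ends the accumulation either way.
    while let Ok(more) = updates.try_recv() {
        batch.push(more);
    }
    Some(batch)
}
```

A caller would loop on `next_batch`, revealing and applying each batch in bulk, and treat `None` as the signal that the channel was closed and the final root can be calculated.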
When messages are accumulated, we update the Sparse Trie:
1. Reveal the proof
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1090-L1091
2. For each modified storage trie, apply updates and calculate the roots in parallel
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1093
3. Update accounts trie
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1133
4. Calculate keccak hashes of the nodes below a certain level
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1139
As you can see, we do not calculate the state root hash of the accounts trie
(the one that will be the result of the whole task), but instead calculate only certain hashes.
This is an optimization that comes from the fact that we will likely update the top 2-3 levels of the trie
in every transaction, so doing that work every time would be wasteful.
Instead, we calculate hashes for most of the levels of the trie, and do the rest of the work
only when we're finishing the calculation.
### Finishing the calculation
Once the messages channel is closed by the [State Root Task](#state-root-task),
we exhaust it, reveal proofs and apply updates, and then calculate the full state root hash.
https://github.com/paradigmxyz/reth/blob/2ba54bf1c1f38c7173838f37027315a09287c20a/crates/engine/tree/src/tree/root.rs#L1014
This state root is eventually sent as `StateRootMessage::RootCalculated` to the [Engine](#engine).