mirror of
https://github.com/hl-archive-node/nanoreth.git
synced 2025-12-06 10:59:55 +00:00
Moves code walkthrough book chapters to docs (#629)
* replaced template blocks with code blocks in stages chapter
* replaced template blocks with code blocks in network chapter
* moved book sections to docs
* fix indentation in recover_signer codeblock
* remove unnecessary TODO comment in network.md
83
docs/design/goals.md
Normal file
# Reth Goals

### Why are we building this client?

Our goal in building Reth, apart from improving client diversity, is to create a client that delivers maximally along each of the following dimensions:

- Performance
- Configurability
- Open-source friendliness

---

## Performance

### Why does performance matter?

This is a win for everyone:

- Average users & developers benefit from RPC performance, leading to more responsive applications and faster feedback.
- Home node operators benefit from faster sync times.
- Costs are lowered for all operators, whether in terms of storage costs, or being able to serve more requests from the same node.
- Searchers are able to run more simulations.

### What are the performance bottlenecks that need to be addressed?

**Optimizing state access**

The pipeline that a given transaction goes through as it's processed is more or less the following:

RPC -> EVM -> Cache -> Codec -> DB

One of our first and foremost goals in Reth is to minimize the latency and maximize the throughput (think: request concurrency) of this pipeline.

Why? This is a win for everyone. RPC providers meet more impressive SLAs, MEV searchers become more effective, home nodes sync faster, etc.

The biggest bottleneck in this pipeline is not the execution of the EVM interpreter itself, but rather accessing state and managing I/O. As such, we think the largest optimizations to be made are closest to the DB layer.

Ideally, we can achieve such fast runtime operation that certain data need not be stored on disk at all, and can be generated on the fly instead, minimizing disk footprint.

---

## Configurability

### Why does configurability matter?

**Control over tradeoffs**

Almost any given design choice or optimization to the client comes with its own tradeoffs. As such, our long-term goal is not to make opinionated decisions on behalf of everyone, as some users would be negatively impacted and turned away from what could otherwise be a great client.

**Profiles**

We aim to facilitate the creation of community-developed configuration presets fit to various user profiles, e.g. archive node, RPC provider, MEV searcher, etc.

**Extension to EVM-compatible L1s and L2s**

Another consequence of a configurable design is the ability to quickly extend the client to support other EVM-compatible L1s and L2s, enabling innovation while retaining performance.

### How is Reth made configurable?

**Modularity & generics**

We prioritize a modular design for Reth with reasonable (and zero-cost!) abstractions over generic interfaces. We want it to be quick and easy for others to extend or adapt the implementation to their own needs.

---

## Open-source friendliness

### Why does open-source friendliness matter?

Maintaining a client implementation is *hard*. Bringing in talent and sustaining momentum in workstreams is a known challenge. As such, we take an open-source-first approach to ensure that the development of Reth can be carried forward by the community.

We want to be as deliberate as possible in forming a feedback loop with the Ethereum community, and not only make it easy to contribute to Reth, but actively *encourage* doing so.

Our goal is that community members with no Rust experience, and no experience running a node, will still be able to meaningfully contribute to the project, and accrue expertise in doing so.

### How does Reth support open-source contribution?

**Documentation**

It goes without saying that thorough documentation is a must. The docs should provide full context on the design and implementation of the client, as well as the contribution process, and should be accessible to anyone with a basic understanding of Ethereum.

**Issue tracking**

Everything that is (and is not) being worked on within the client should be tracked so that anyone in the community can stay on top of the state of development. This makes it clear what kind of help is needed, and where.
333
docs/repo/crates/db.md
Normal file
# db

The database is a central component of Reth, providing persistent storage for data such as block headers, block bodies, transactions and more. The Reth database consists of key-value storage written to disk and organized in tables. This chapter may feel a little dense at first, but it will leave you comfortable understanding and navigating the `db` crate. It walks through the structure of the database, its tables, and the mechanics of the `Database` trait.

<br>

## Tables

Within Reth, the database is organized via "tables". A table is any struct that implements the `Table` trait.

[File: crates/storage/db/src/abstraction/table.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/table.rs#L56-L65)
```rust ignore
pub trait Table: Send + Sync + Debug + 'static {
    /// Return table name as it is present inside the MDBX.
    const NAME: &'static str;
    /// Key element of `Table`.
    ///
    /// Sorting should be taken into account when encoding this.
    type Key: Key;
    /// Value element of `Table`.
    type Value: Value;
}

//--snip--
pub trait Key: Encode + Decode {}

//--snip--
pub trait Value: Compress + Decompress {}
```

The `Table` trait has two associated types, `Key` and `Value`, which must implement the `Key` and `Value` traits, respectively. The `Encode` trait is responsible for transforming data into bytes so it can be stored in the database, while the `Decode` trait transforms the bytes back into their original form. Similarly, the `Compress` and `Decompress` traits transform the data to and from a compressed format when writing to or reading from the database.

There are many tables within the node, all used to store different types of data, from `Headers` to `Transactions` and more. Below is a list of all of the tables. You can follow [this link](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/tables/mod.rs#L36) if you would like to see the table definitions for any of the tables below.

- CanonicalHeaders
- HeaderTD
- HeaderNumbers
- Headers
- BlockBodies
- BlockOmmers
- NonCanonicalTransactions
- Transactions
- TxHashNumber
- Receipts
- Logs
- PlainAccountState
- PlainStorageState
- Bytecodes
- BlockTransitionIndex
- TxTransitionIndex
- AccountHistory
- StorageHistory
- AccountChangeSet
- StorageChangeSet
- TxSenders
- Config
- SyncStage

<br>

## Database

Reth's database design revolves around its main [Database trait](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/mod.rs#L33), which takes advantage of [generic associated types](https://blog.rust-lang.org/2022/10/28/gats-stabilization.html) and [a few design tricks](https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats#the-better-gats) to implement the database's functionality across many types. Let's take a quick look at the `Database` trait and how it works.

[File: crates/storage/db/src/abstraction/database.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/database.rs#L19)
```rust ignore
/// Main Database trait that spawns transactions to be executed.
pub trait Database: for<'a> DatabaseGAT<'a> {
    /// Create read only transaction.
    fn tx(&self) -> Result<<Self as DatabaseGAT<'_>>::TX, Error>;

    /// Create read write transaction only possible if database is open with write access.
    fn tx_mut(&self) -> Result<<Self as DatabaseGAT<'_>>::TXMut, Error>;

    /// Takes a function and passes a read-only transaction into it, making sure it's closed in the
    /// end of the execution.
    fn view<T, F>(&self, f: F) -> Result<T, Error>
    where
        F: Fn(&<Self as DatabaseGAT<'_>>::TX) -> T,
    {
        let tx = self.tx()?;

        let res = f(&tx);
        tx.commit()?;

        Ok(res)
    }

    /// Takes a function and passes a write-read transaction into it, making sure it's committed in
    /// the end of the execution.
    fn update<T, F>(&self, f: F) -> Result<T, Error>
    where
        F: Fn(&<Self as DatabaseGAT<'_>>::TXMut) -> T,
    {
        let tx = self.tx_mut()?;

        let res = f(&tx);
        tx.commit()?;

        Ok(res)
    }
}
```

Any type that implements the `Database` trait can create a database transaction, as well as run a read-only (`view`) or read-write (`update`) closure against the database. As an example, let's revisit the `Transaction` struct from the `stages` crate. This struct contains a field named `db`, a reference to a generic type `DB` that implements the `Database` trait. The `Transaction` struct can use the `db` field to store new headers, bodies and senders in the database. In the code snippet below, you can see the `Transaction::open()` method, which uses the `Database::tx_mut()` function to create a mutable transaction.

[File: crates/stages/src/db.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/db.rs#L95-L98)
```rust ignore
pub struct Transaction<'this, DB: Database> {
    /// A handle to the DB.
    pub(crate) db: &'this DB,
    tx: Option<<DB as DatabaseGAT<'this>>::TXMut>,
}

//--snip--
impl<'this, DB> Transaction<'this, DB>
where
    DB: Database,
{
    //--snip--

    /// Open a new inner transaction.
    pub fn open(&mut self) -> Result<(), Error> {
        self.tx = Some(self.db.tx_mut()?);
        Ok(())
    }
}
```

The `Database` trait also inherits the `DatabaseGAT` trait, which defines two associated types, `TX` and `TXMut`.

[File: crates/storage/db/src/abstraction/database.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/database.rs#L11)
```rust ignore
/// Implements the GAT method from:
/// https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats#the-better-gats.
///
/// Sealed trait which cannot be implemented by 3rd parties, exposed only for implementers
pub trait DatabaseGAT<'a, __ImplicitBounds: Sealed = Bounds<&'a Self>>: Send + Sync {
    /// RO database transaction
    type TX: DbTx<'a> + Send + Sync;
    /// RW database transaction
    type TXMut: DbTxMut<'a> + DbTx<'a> + Send + Sync;
}
```

In Rust, associated types are like generics in that they can be any type fitting the definition, with the difference being that associated types are tied to a trait and can only be used in the context of that trait.

In the code snippet above, the `DatabaseGAT` trait has two associated types, `TX` and `TXMut`.

The `TX` type can be any type that implements the `DbTx` trait, which provides a set of functions to interact with read-only transactions.

[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L36)
```rust ignore
/// Read only transaction
pub trait DbTx<'tx>: for<'a> DbTxGAT<'a> {
    /// Get value
    fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, Error>;
    /// Commit for read only transaction will consume and free transaction and allows
    /// freeing of memory pages
    fn commit(self) -> Result<bool, Error>;
    /// Iterate over read only values in table.
    fn cursor<T: Table>(&self) -> Result<<Self as DbTxGAT<'_>>::Cursor<T>, Error>;
    /// Iterate over read only values in dup sorted table.
    fn cursor_dup<T: DupSort>(&self) -> Result<<Self as DbTxGAT<'_>>::DupCursor<T>, Error>;
}
```

The `TXMut` type can be any type that implements the `DbTxMut` trait, which provides a set of functions to interact with read/write transactions.

[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L49)
```rust ignore
/// Read write transaction that allows writing to database
pub trait DbTxMut<'tx>: for<'a> DbTxMutGAT<'a> {
    /// Put value to database
    fn put<T: Table>(&self, key: T::Key, value: T::Value) -> Result<(), Error>;
    /// Delete value from database
    fn delete<T: Table>(&self, key: T::Key, value: Option<T::Value>) -> Result<bool, Error>;
    /// Clears database.
    fn clear<T: Table>(&self) -> Result<(), Error>;
    /// Cursor mut
    fn cursor_mut<T: Table>(&self) -> Result<<Self as DbTxMutGAT<'_>>::CursorMut<T>, Error>;
    /// DupCursor mut.
    fn cursor_dup_mut<T: DupSort>(
        &self,
    ) -> Result<<Self as DbTxMutGAT<'_>>::DupCursorMut<T>, Error>;
}
```

Let's take a look at the `DbTx` and `DbTxMut` traits in action. Revisiting the `Transaction` struct as an example, the `Transaction::get_block_hash()` method uses the `DbTx::get()` function to get a block header hash in the form of `self.get::<tables::CanonicalHeaders>(number)`.

[File: crates/stages/src/db.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/db.rs#L106)
```rust ignore
impl<'this, DB> Transaction<'this, DB>
where
    DB: Database,
{
    //--snip--

    /// Query [tables::CanonicalHeaders] table for block hash by block number
    pub(crate) fn get_block_hash(&self, number: BlockNumber) -> Result<BlockHash, StageError> {
        let hash = self
            .get::<tables::CanonicalHeaders>(number)?
            .ok_or(DatabaseIntegrityError::CanonicalHash { number })?;
        Ok(hash)
    }
    //--snip--
}

//--snip--
impl<'a, DB: Database> Deref for Transaction<'a, DB> {
    type Target = <DB as DatabaseGAT<'a>>::TXMut;
    fn deref(&self) -> &Self::Target {
        self.tx.as_ref().expect("Tried getting a reference to a non-existent transaction")
    }
}
```

The `Transaction` struct implements the `Deref` trait, which returns a reference to its `tx` field, a `TXMut`. Recall that `TXMut` is an associated type on the `DatabaseGAT` trait, defined as `type TXMut: DbTxMut<'a> + DbTx<'a> + Send + Sync;`, giving it access to all of the functions available to `DbTx`, including `DbTx::get()`.

Notice that the method uses a [turbofish](https://techblog.tonsser.com/posts/what-is-rusts-turbofish) to specify which table to use when passing the `key` to the `DbTx::get()` function. Taking a quick look at the function definition, a generic `T` is defined that implements the `Table` trait mentioned at the beginning of this chapter.

[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L38)
```rust ignore
fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, Error>;
```

This design pattern is very powerful, and allows Reth to use the methods available to the `DbTx` and `DbTxMut` traits without having to define implementation blocks for each table in the database.

Let's take a look at a couple of examples before moving on. In the snippet below, the `DbTxMut::put()` method is used to insert values into the `CanonicalHeaders`, `Headers` and `HeaderNumbers` tables.

[File: crates/storage/provider/src/block.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/provider/src/block.rs#L121-L125)
```rust ignore
let block_num_hash = BlockNumHash((block.number, block.hash()));
tx.put::<tables::CanonicalHeaders>(block.number, block.hash())?;
// Put header with canonical hashes.
tx.put::<tables::Headers>(block_num_hash, block.header.as_ref().clone())?;
tx.put::<tables::HeaderNumbers>(block.hash(), block.number)?;
```

This next example uses the `DbTx::cursor()` method to get a `Cursor`. The `Cursor` type provides a way to traverse the rows of a database table, one row at a time, enabling the program to perform an operation (updating, deleting, etc.) on each row individually. The following code snippet gets a cursor for a few different tables in the database.

[File: crates/stages/src/stages/execution.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/execution.rs#L93-L101)
```rust ignore
// Get next canonical block hashes to execute.
let mut canonicals = db_tx.cursor::<tables::CanonicalHeaders>()?;
// Get header with canonical hashes.
let mut headers = db_tx.cursor::<tables::Headers>()?;
// Get bodies (to get tx index) with canonical hashes.
let mut cumulative_tx_count = db_tx.cursor::<tables::CumulativeTxCount>()?;
// Get transaction of the block that we are executing.
let mut tx = db_tx.cursor::<tables::Transactions>()?;
// Skip sender recovery and load signer from database.
let mut tx_sender = db_tx.cursor::<tables::TxSenders>()?;
```

We are almost at the last stop in the tour of the `db` crate. In addition to the methods provided by the `DbTx` and `DbTxMut` traits, `DbTx` also inherits the `DbTxGAT` trait, while `DbTxMut` inherits `DbTxMutGAT`. These two traits provide various associated types related to cursors, as well as methods to utilize the cursor types.

[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L12-L17)
```rust ignore
pub trait DbTxGAT<'a, __ImplicitBounds: Sealed = Bounds<&'a Self>>: Send + Sync {
    /// Cursor GAT
    type Cursor<T: Table>: DbCursorRO<'a, T> + Send + Sync;
    /// DupCursor GAT
    type DupCursor<T: DupSort>: DbDupCursorRO<'a, T> + DbCursorRO<'a, T> + Send + Sync;
}
```

Let's look at an example of how cursors are used. The code snippet below contains the `unwind` method from the `BodyStage` defined in the `stages` crate. This function is responsible for unwinding any changes to the database if there is an error when executing the body stage within the Reth pipeline.

[File: crates/stages/src/stages/bodies.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/bodies.rs#L205-L238)
```rust ignore
/// Unwind the stage.
async fn unwind(
    &mut self,
    db: &mut Transaction<'_, DB>,
    input: UnwindInput,
) -> Result<UnwindOutput, Box<dyn std::error::Error + Send + Sync>> {
    let mut tx_count_cursor = db.cursor_mut::<tables::CumulativeTxCount>()?;
    let mut block_ommers_cursor = db.cursor_mut::<tables::BlockOmmers>()?;
    let mut transaction_cursor = db.cursor_mut::<tables::Transactions>()?;

    let mut entry = tx_count_cursor.last()?;
    while let Some((key, count)) = entry {
        if key.number() <= input.unwind_to {
            break
        }

        tx_count_cursor.delete_current()?;
        entry = tx_count_cursor.prev()?;

        if block_ommers_cursor.seek_exact(key)?.is_some() {
            block_ommers_cursor.delete_current()?;
        }

        let prev_count = entry.map(|(_, v)| v).unwrap_or_default();
        for tx_id in prev_count..count {
            if transaction_cursor.seek_exact(tx_id)?.is_some() {
                transaction_cursor.delete_current()?;
            }
        }
    }

    //--snip--
}
```

This function first grabs a mutable cursor for each of the `CumulativeTxCount`, `BlockOmmers` and `Transactions` tables.

The `tx_count_cursor` is used to get the last key-value pair written to the `CumulativeTxCount` table, and to delete the key-value pair the cursor is currently pointing at.

The `block_ommers_cursor` is used to get the block ommers from the `BlockOmmers` table at the specified key, and to delete the entry the cursor is currently pointing at.

Finally, the `transaction_cursor` is used to delete each transaction from the last transaction number written to the database up to the current tx count.

While this is a brief look at how cursors work in the context of database tables, the chapter on the `libmdbx` crate will go into further detail on how cursors communicate with the database and what is actually happening under the hood.

<br>

## Summary

This chapter was packed with information, so let's do a quick review. The database consists of tables, with each table being a collection of key-value pairs representing various pieces of data in the blockchain. Any struct that implements the `Database` trait can view, update or delete entries in the various tables. The database design leverages nested traits and generic associated types to provide methods to interact with each table in the database.

<br>

# Next Chapter

[Next Chapter]()
1093
docs/repo/crates/network.md
Normal file
File diff suppressed because it is too large
165
docs/repo/crates/stages.md
Normal file
# Stages

The `stages` crate plays a central role in syncing the node, maintaining state, updating the database and more. The stages involved in the Reth pipeline are the `HeaderStage`, `BodyStage`, `SenderRecoveryStage`, and `ExecutionStage` (note that this list is non-exhaustive, and more pipeline stages will be added in the near future). Each of these stages is queued up and stored within the Reth pipeline.

[File: crates/stages/src/pipeline.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/pipeline.rs)
```rust,ignore
pub struct Pipeline<DB: Database> {
    stages: Vec<QueuedStage<DB>>,
    max_block: Option<BlockNumber>,
    events_sender: MaybeSender<PipelineEvent>,
}
```

When the node is first started, a new `Pipeline` is initialized and all of the stages are added into `Pipeline.stages`. Then, the `Pipeline::run` function is called, which starts the pipeline, executing all of the stages continuously in an infinite loop. This process syncs the chain, keeping everything up to date with the chain tip.

Each stage within the pipeline implements the `Stage` trait, which provides function interfaces to get the stage id, execute the stage, and unwind the changes to the database if there was an issue during the stage execution.

[File: crates/stages/src/stage.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stage.rs)
```rust,ignore
pub trait Stage<DB: Database>: Send + Sync {
    /// Get the ID of the stage.
    ///
    /// Stage IDs must be unique.
    fn id(&self) -> StageId;

    /// Execute the stage.
    async fn execute(
        &mut self,
        tx: &mut Transaction<'_, DB>,
        input: ExecInput,
    ) -> Result<ExecOutput, StageError>;

    /// Unwind the stage.
    async fn unwind(
        &mut self,
        tx: &mut Transaction<'_, DB>,
        input: UnwindInput,
    ) -> Result<UnwindOutput, Box<dyn std::error::Error + Send + Sync>>;
}
```

To get a better idea of what is happening at each part of the pipeline, let's walk through what is going on under the hood within the `execute()` function at each stage, starting with `HeaderStage`.

<br>

## HeaderStage

<!-- TODO: Cross-link to eth/65 chapter when it's written -->
The `HeaderStage` is responsible for syncing the block headers, validating the header integrity and writing the headers to the database. When the `execute()` function is called, the local head of the chain is updated to the most recent block height previously executed by the stage. At this point, the node status is also updated with that block's height, hash and total difficulty. These values are used during any new eth/65 handshakes.

After updating the head, a stream is established with other peers in the network to sync the missing chain headers between the most recent state stored in the database and the chain tip. The `HeaderStage` contains a `downloader` attribute, which is a type that implements the `HeaderDownloader` trait. The `stream()` method from this trait is used to fetch headers from the network.

[File: crates/interfaces/src/p2p/headers/downloader.rs](https://github.com/paradigmxyz/reth/blob/main/crates/interfaces/src/p2p/headers/downloader.rs)
```rust,ignore
pub trait HeaderDownloader: Downloader {
    /// Stream the headers
    fn stream(&self, head: SealedHeader, tip: H256) -> DownloadStream<'_, SealedHeader>;

    /// Validate whether the header is valid in relation to it's parent
    fn validate(&self, header: &SealedHeader, parent: &SealedHeader) -> DownloadResult<()> {
        validate_header_download(self.consensus(), header, parent)?;
        Ok(())
    }
}
```

The `HeaderStage` relies on the downloader stream to return the headers in descending order, starting from the chain tip down to the latest block in the database. While other stages in the `Pipeline` work from the most recent block in the database up to the chain tip, the `HeaderStage` works in reverse to avoid [long-range attacks](https://messari.io/report/long-range-attack). When a node downloads headers in ascending order, it will not know whether it is being subjected to a long-range attack until it reaches the most recent blocks. To combat this, the `HeaderStage` starts by getting the chain tip from the Consensus Layer, verifies the tip, and then walks backwards by parent hash. Each value yielded from the stream is a `SealedHeader`.

[File: crates/primitives/src/header.rs](https://github.com/paradigmxyz/reth/blob/main/crates/primitives/src/header.rs)
```rust,ignore
pub struct SealedHeader {
    /// Locked Header fields.
    header: Header,
    /// Locked Header hash.
    hash: BlockHash,
}
```

Each `SealedHeader` is then validated to ensure that it has the proper parent. Note that this is only a basic response validation; the `HeaderDownloader` uses the `validate` method during the `stream`, so each header is validated according to the consensus specification before it is yielded from the stream. After this, each header is written to the database. If a header is not valid, or the stream encounters any other error, the error is propagated up through the stage execution, the changes to the database are unwound, and the stage is resumed from the most recent valid state.

This process continues until all of the headers have been downloaded and written to the database. Finally, the total difficulty of the chain's head is updated and the function returns `Ok(ExecOutput { stage_progress: current_progress, reached_tip: true, done: true })`, signaling that the header sync has completed successfully.

<br>

## BodyStage

Once the `HeaderStage` completes successfully, the `BodyStage` will start execution. The body stage downloads block bodies for all of the new block headers that were stored locally in the database. The `BodyStage` first determines which block bodies to download by checking whether a block header has a non-empty ommers hash and transaction root.

An ommers hash is the Keccak 256-bit hash of the ommers list portion of the block. If you are unfamiliar with ommer blocks, you can [click here to learn more](https://ethereum.org/en/glossary/#ommer). Note that while ommer blocks were important for new blocks created during Ethereum's proof-of-work era, Ethereum's proof-of-stake chain selects exactly one block proposer at a time, so ommer blocks are no longer needed in post-merge Ethereum.

The transactions root is a value calculated from the transactions included in the block. To derive the transactions root, a [merkle tree](https://blog.ethereum.org/2015/11/15/merkling-in-ethereum) is built from the block's transaction list; the transactions root is then the Keccak 256-bit hash of the root node of that tree.

When the `BodyStage` looks at the headers to determine which blocks to download, it skips the blocks where the `header.ommers_hash` and the `header.transaction_root` are empty, since this denotes that the block body is empty as well.
|
||||
|
||||
Once the `BodyStage` determines which block bodies to fetch, a new `bodies_stream` is created which downloads all of the bodies from the `starting_block`, up until the `target_block` specified. Each time the `bodies_stream` yields a value, a `SealedBlock` is created using the block header, the ommers hash and the newly downloaded block body.
|
||||
|
||||
[File: crates/primitives/src/block.rs](https://github.com/paradigmxyz/reth/blob/main/crates/primitives/src/block.rs)
|
||||
```rust,ignore
|
||||
pub struct SealedBlock {
|
||||
/// Locked block header.
|
||||
pub header: SealedHeader,
|
||||
/// Transactions with signatures.
|
||||
pub body: Vec<TransactionSigned>,
|
||||
/// Ommer/uncle headers
|
||||
pub ommers: Vec<SealedHeader>,
|
||||
}
|
||||
```
|
||||
|
||||
The new block is then pre-validated, checking that the ommers hash and transactions root in the block header are the same in the block body. Following a successful pre-validation, the `BodyStage` loops through each transaction in the `block.body`, adding the transaction to the database. This process is repeated for every downloaded block body, with the `BodyStage` returning `Ok(ExecOutput { stage_progress: highest_block, reached_tip: true, done })` signaling it successfully completed.
|
||||
|
||||
<br>
|
||||
|
||||
## SenderRecoveryStage
|
||||
|
||||
Following a successful `BodyStage`, the `SenderRecoveryStage` starts to execute. The `SenderRecoveryStage` is responsible for recovering the transaction sender for each of the newly added transactions to the database. At the beginning of the execution function, all of the transactions are first retrieved from the database. Then the `SenderRecoveryStage` goes through each transaction and recovers the signer from the transaction signature and hash. The transaction hash is derived by taking the Keccak 256-bit hash of the RLP encoded transaction bytes. This hash is then passed into the `recover_signer` function.
[File: crates/primitives/src/transaction/signature.rs](https://github.com/paradigmxyz/reth/blob/main/crates/primitives/src/transaction/signature.rs)

```rust,ignore
pub(crate) fn recover_signer(&self, hash: H256) -> Option<Address> {
    let mut sig: [u8; 65] = [0; 65];

    self.r.to_big_endian(&mut sig[0..32]);
    self.s.to_big_endian(&mut sig[32..64]);
    sig[64] = self.odd_y_parity as u8;

    secp256k1::recover(&sig, hash.as_fixed_bytes()).ok()
}
```
In an [ECDSA (Elliptic Curve Digital Signature Algorithm) signature](https://wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm), the "r", "s", and "v" values are three pieces of data that are used to mathematically verify the authenticity of a digital signature. ECDSA is a widely used algorithm for generating and verifying digital signatures, and it is often used in cryptocurrencies like Ethereum.
The "r" value is the x-coordinate of a point on the elliptic curve that is calculated as part of the signature process. The "s" value is computed during signing and is derived from the private key and the message being signed. Lastly, the "v" value is the "recovery id" that makes it possible to recover the public key from the signature and the signed message. Together, the "r", "s", and "v" values make up an ECDSA signature, and they are used to verify the authenticity of the signed transaction.
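For intuition on where the parity bit comes from, the standalone snippet below shows how a transaction's on-chain `v` value maps back to the boolean `odd_y_parity` used when assembling the 65-byte signature: legacy transactions encode it as `27 + parity`, while EIP-155 transactions fold the chain ID in as `chain_id * 2 + 35 + parity`. This is an illustration, not Reth's exact code:

```rust
// Illustrative recovery of the y-parity bit from a transaction's `v` value.
// Legacy (pre-EIP-155): v = 27 + parity.
// EIP-155:              v = chain_id * 2 + 35 + parity.

fn parity_from_v(v: u64) -> Option<bool> {
    match v {
        27 | 28 => Some(v == 28),                // legacy encoding
        v if v >= 35 => Some((v - 35) % 2 == 1), // EIP-155 encoding
        _ => None,
    }
}

/// For EIP-155 values, the chain ID can be recovered as well.
fn chain_id_from_v(v: u64) -> Option<u64> {
    if v >= 35 { Some((v - 35) / 2) } else { None }
}

fn main() {
    assert_eq!(parity_from_v(27), Some(false));
    assert_eq!(parity_from_v(28), Some(true));
    // Mainnet (chain_id = 1): v is 37 (even parity) or 38 (odd parity).
    assert_eq!(parity_from_v(37), Some(false));
    assert_eq!(parity_from_v(38), Some(true));
    assert_eq!(chain_id_from_v(38), Some(1));
    println!("ok");
}
```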
Once the transaction signer has been recovered, the signer is then added to the database. This process is repeated for every transaction that was retrieved, and similarly to previous stages, `Ok(ExecOutput { stage_progress: max_block_num, done: true, reached_tip: true })` is returned to signal a successful completion of the stage.
<br>
## ExecutionStage
Finally, after all headers, bodies and senders are added to the database, the `ExecutionStage` starts to execute. This stage is responsible for executing all of the transactions and updating the state stored in the database. For every new block header added to the database, the corresponding transactions have their signers attached to them and `reth_executor::executor::execute_and_verify_receipt()` is called, pushing the state changes resulting from the execution to a `Vec`.
[File: crates/stages/src/stages/execution.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/execution.rs)

```rust,ignore
reth_executor::executor::execute_and_verify_receipt(
    header,
    &recovered_transactions,
    ommers,
    &self.config,
    state_provider,
)
```
After all headers and their corresponding transactions have been executed, the resulting state changes are applied to the database, updating account balances, account bytecode and other parts of the state. Once all of the execution state changes have been applied, the block reward, if any, is credited to the validator's account.
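A heavily simplified sketch of this apply step is shown below. The names are hypothetical and the real stage applies far richer changes (nonces, bytecode, storage) through Reth's database abstraction; this only illustrates accumulating balance deltas and then crediting the validator:

```rust
// Sketch of applying accumulated balance changes plus a block reward
// (illustrative only; not Reth's actual state-write code).

use std::collections::HashMap;

type Address = [u8; 20];

fn apply_changes(
    balances: &mut HashMap<Address, u128>,
    changes: &[(Address, i128)],
    validator: Address,
    block_reward: u128,
) {
    for &(addr, delta) in changes {
        let entry = balances.entry(addr).or_insert(0);
        // Saturating arithmetic keeps the sketch simple; real execution
        // would have already rejected any balance underflow.
        *entry = if delta >= 0 {
            entry.saturating_add(delta as u128)
        } else {
            entry.saturating_sub((-delta) as u128)
        };
    }
    // Credit the block reward to the validator last.
    *balances.entry(validator).or_insert(0) += block_reward;
}

fn main() {
    let alice = [1u8; 20];
    let validator = [9u8; 20];
    let mut balances = HashMap::from([(alice, 100u128)]);
    apply_changes(&mut balances, &[(alice, -40)], validator, 5);
    assert_eq!(balances[&alice], 60);
    assert_eq!(balances[&validator], 5);
    println!("ok");
}
```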
At the end of the `execute()` function, a familiar value is returned, `Ok(ExecOutput { done: is_done, reached_tip: true, stage_progress: last_block })` signaling a successful completion of the `ExecutionStage`.
<br>
# Next Chapter
Now that we have covered all of the stages that are currently included in the `Pipeline`, you know how the Reth client stays synced with the chain tip and updates the database with all of the new headers, bodies, senders and state changes. While this chapter provides an overview of how the pipeline stages work, the following chapters will dive deeper into the database, the networking stack and other exciting corners of the Reth codebase. Feel free to check out any parts of the codebase mentioned in this chapter, and when you are ready, the next chapter will dive into the `database`.
[Next Chapter]()