Breaking changes (#5191)

Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
Co-authored-by: Roman Krasiuk <rokrassyuk@gmail.com>
Co-authored-by: joshieDo <ranriver@protonmail.com>
Co-authored-by: joshieDo <93316087+joshieDo@users.noreply.github.com>
Co-authored-by: Matthias Seitz <matthias.seitz@outlook.de>
Co-authored-by: Oliver Nordbjerg <hi@notbjerg.me>
Co-authored-by: Thomas Coratger <thomas.coratger@gmail.com>
Alexey Shekhirin
2024-02-29 12:37:28 +00:00
committed by GitHub
parent 025fa5f038
commit 6b5b6f7a40
252 changed files with 10154 additions and 6327 deletions

@@ -35,30 +35,30 @@ The `Table` trait has two generic values, `Key` and `Value`, which need to imple
The node uses many tables, each storing a different type of data, from `Headers` to `Transactions` and more. All of the tables are listed below. You can follow [this link](https://github.com/paradigmxyz/reth/blob/1563506aea09049a85e5cc72c2894f3f7a371581/crates/storage/db/src/tables/mod.rs#L161-L188) to see the definition of any of them.
- CanonicalHeaders
- HeaderTD
- HeaderTerminalDifficulties
- HeaderNumbers
- Headers
- BlockBodyIndices
- BlockOmmers
- BlockWithdrawals
- TransactionBlock
- TransactionBlocks
- Transactions
- TxHashNumber
- TransactionHashNumbers
- Receipts
- PlainAccountState
- PlainStorageState
- Bytecodes
- AccountHistory
- StorageHistory
- AccountChangeSet
- StorageChangeSet
- AccountsHistory
- StoragesHistory
- AccountChangeSets
- StorageChangeSets
- HashedAccount
- HashedStorage
- HashedStorages
- AccountsTrie
- StoragesTrie
- TxSenders
- SyncStage
- SyncStageProgress
- TransactionSenders
- StageCheckpoints
- StageCheckpointProgresses
- PruneCheckpoints
<br>
@@ -137,7 +137,6 @@ The `Database` defines two associated types `TX` and `TXMut`.
[File: crates/storage/db/src/abstraction/database.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/database.rs#L11)
The `TX` type can be any type that implements the `DbTx` trait, which provides a set of functions to interact with read-only transactions.
[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L36)
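
The way a table's `Key` and `Value` types flow through a read-only `get` can be sketched roughly as follows. This is a simplified in-memory illustration, not reth's implementation: `ReadTx`, `MemTx`, `SketchError`, `CanonicalHeadersSketch`, and the `Encode`/`Decode` traits as written here are hypothetical stand-ins, and the table's key and value types are assumed for the example.

```rust
use std::collections::{BTreeMap, HashMap};
use std::convert::TryFrom;

/// Hypothetical error type for this sketch (not reth's database error).
#[derive(Debug, PartialEq)]
pub enum SketchError {
    Decode,
}

/// Hypothetical codec traits: keys become bytes, values are rebuilt from bytes.
pub trait Encode {
    fn encode(&self) -> Vec<u8>;
}
pub trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Result<Self, SketchError>;
}

impl Encode for u64 {
    fn encode(&self) -> Vec<u8> {
        // Big-endian keeps byte order equal to numeric order.
        self.to_be_bytes().to_vec()
    }
}

impl Decode for [u8; 32] {
    fn decode(bytes: &[u8]) -> Result<Self, SketchError> {
        <[u8; 32]>::try_from(bytes).map_err(|_| SketchError::Decode)
    }
}

/// Simplified stand-in for a table definition: a name plus key and value types.
pub trait Table {
    const NAME: &'static str;
    type Key: Encode;
    type Value: Decode;
}

/// Assumed shape for illustration: block number -> 32-byte block hash.
pub struct CanonicalHeadersSketch;
impl Table for CanonicalHeadersSketch {
    const NAME: &'static str = "CanonicalHeaders";
    type Key = u64;
    type Value = [u8; 32];
}

/// Simplified stand-in for a read-only transaction.
pub trait ReadTx {
    fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, SketchError>;
}

/// In-memory store: table name -> ordered map of raw key/value bytes.
#[derive(Default)]
pub struct MemTx {
    tables: HashMap<&'static str, BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl MemTx {
    /// Test helper that writes raw bytes; a real store would use a write transaction.
    pub fn put_raw(&mut self, table: &'static str, key: Vec<u8>, value: Vec<u8>) {
        self.tables.entry(table).or_default().insert(key, value);
    }
}

impl ReadTx for MemTx {
    fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, SketchError> {
        // Encode the typed key, look up the raw bytes, then decode the typed value.
        let raw = self.tables.get(T::NAME).and_then(|t| t.get(&key.encode()));
        raw.map(|bytes| <T::Value as Decode>::decode(bytes)).transpose()
    }
}
```

Because the table type is a generic parameter of `get`, the caller never handles raw bytes: `tx.get::<CanonicalHeadersSketch>(7)` takes a `u64` and yields an `Option<[u8; 32]>`.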
@@ -149,7 +148,7 @@ pub trait DbTx: Send + Sync {
type Cursor<T: Table>: DbCursorRO<T> + Send + Sync;
/// DupCursor type for this read-only transaction
type DupCursor<T: DupSort>: DbDupCursorRO<T> + DbCursorRO<T> + Send + Sync;
/// Get value
fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, Error>;
/// Commit for read only transaction will consume and free transaction and allows

@@ -94,10 +94,6 @@ This process continues until all of the headers have been downloaded and written
<br>
## TotalDifficultyStage
* TODO: explain stage
<br>
## BodyStage
Once the `HeaderStage` completes successfully, the `BodyStage` will start execution. The body stage downloads block bodies for all of the new block headers that were stored locally in the database. The `BodyStage` first determines which block bodies to download by checking if the block body has an ommers hash and transaction root.

@@ -2,24 +2,24 @@
## Abstractions
* We created a [Database trait abstraction](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/mod.rs) using Rust Stable GATs which frees us from being bound to a single database implementation. We currently use MDBX, but are exploring [redb](https://github.com/cberner/redb) as an alternative.
* We then iterated on [`Transaction`](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/stages/src/db.rs#L14-L19) as a non-leaky abstraction with helpers for strictly-typed and unit-tested higher-level database abstractions.
- We created a [Database trait abstraction](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/mod.rs) using Rust Stable GATs which frees us from being bound to a single database implementation. We currently use MDBX, but are exploring [redb](https://github.com/cberner/redb) as an alternative.
- We then iterated on [`Transaction`](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/stages/src/db.rs#L14-L19) as a non-leaky abstraction with helpers for strictly-typed and unit-tested higher-level database abstractions.
## Codecs
* We want Reth's serialized format to be able to trade off read/write speed for size, depending on who the user is.
* To achieve that, we created the [Encode/Decode/Compress/Decompress traits](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/table.rs#L9-L36) to make the (de)serialization of database `Table::Key` and `Table::Value` generic.
* This allows for [out-of-the-box benchmarking](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/db/benches/encoding_iai.rs#L5) (using [Criterion](https://github.com/bheisler/criterion.rs) and [Iai](https://github.com/bheisler/iai)).
* It also enables [out-of-the-box fuzzing](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/codecs/fuzz/mod.rs) using [trailofbits/test-fuzz](https://github.com/trailofbits/test-fuzz).
* We implemented those traits for the following encoding formats:
  * [Ethereum-specific Compact Encoding](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/codecs/derive/src/compact/mod.rs): Many Ethereum datatypes contain unnecessary zeros when serialized, or are optional (e.g. empty hashes), and it would be nice not to pay for that in storage costs.
    * [Erigon](https://github.com/ledgerwatch/erigon/blob/12ee33a492f5d240458822d052820d9998653a63/docs/programmers_guide/db_walkthrough.MD) achieves that by having a `bitfield` set on the "PlainState" table, which adds a bitfield to Accounts.
    * Akula expanded it for other tables and datatypes manually. It also saved some more space by storing the length of certain types (U256, u64) using the [`modular_bitfield`](https://docs.rs/modular-bitfield/latest/modular_bitfield/) crate, which compacts this information.
    * We generalized it for all types by writing a derive macro that autogenerates the code for implementing the trait. It also generates the interfaces required for fuzzing using ToB/test-fuzz.
  * [Scale Encoding](https://github.com/paritytech/parity-scale-codec)
  * [Postcard Encoding](https://github.com/jamesmunns/postcard)
  * Passthrough (called `no_codec` in the codebase)
* We made implementation of these traits easy via a derive macro called [`main_codec`](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/codecs/derive/src/lib.rs#L15) that delegates to one of Compact (default), Scale, Postcard or Passthrough encoding. This is [derived on every struct we need](https://github.com/search?q=repo%3Aparadigmxyz%2Freth%20%22%23%5Bmain_codec%5D%22&type=code), and lets us experiment with different encoding formats without having to modify the entire codebase each time.
- We want Reth's serialized format to be able to trade off read/write speed for size, depending on who the user is.
- To achieve that, we created the [Encode/Decode/Compress/Decompress traits](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/table.rs#L9-L36) to make the (de)serialization of database `Table::Key` and `Table::Value` generic.
- This allows for [out-of-the-box benchmarking](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/db/benches/encoding_iai.rs#L5) (using [Criterion](https://github.com/bheisler/criterion.rs) and [Iai](https://github.com/bheisler/iai)).
- It also enables [out-of-the-box fuzzing](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/codecs/fuzz/mod.rs) using [trailofbits/test-fuzz](https://github.com/trailofbits/test-fuzz).
- We implemented those traits for the following encoding formats:
  - [Ethereum-specific Compact Encoding](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/codecs/derive/src/compact/mod.rs): Many Ethereum datatypes contain unnecessary zeros when serialized, or are optional (e.g. empty hashes), and it would be nice not to pay for that in storage costs.
    - [Erigon](https://github.com/ledgerwatch/erigon/blob/12ee33a492f5d240458822d052820d9998653a63/docs/programmers_guide/db_walkthrough.MD) achieves that by having a `bitfield` set on the "PlainState" table, which adds a bitfield to Accounts.
    - Akula expanded it for other tables and datatypes manually. It also saved some more space by storing the length of certain types (U256, u64) using the [`modular_bitfield`](https://docs.rs/modular-bitfield/latest/modular_bitfield/) crate, which compacts this information.
    - We generalized it for all types by writing a derive macro that autogenerates the code for implementing the trait. It also generates the interfaces required for fuzzing using ToB/test-fuzz.
  - [Scale Encoding](https://github.com/paritytech/parity-scale-codec)
  - [Postcard Encoding](https://github.com/jamesmunns/postcard)
  - Passthrough (called `no_codec` in the codebase)
- We made implementation of these traits easy via a derive macro called [`main_codec`](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/codecs/derive/src/lib.rs#L15) that delegates to one of Compact (default), Scale, Postcard or Passthrough encoding. This is [derived on every struct we need](https://github.com/search?q=repo%3Aparadigmxyz%2Freth%20%22%23%5Bmain_codec%5D%22&type=code), and lets us experiment with different encoding formats without having to modify the entire codebase each time.
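
To make the idea behind compact encoding concrete, here is a minimal hand-written sketch of dropping leading zero bytes and recording the remaining lengths in a small bitfield. It is not the code generated by the derive macro: `AccountSketch`, `compact_u64`, `expand_u64`, `encode_account`, and `decode_account` are hypothetical names, and the one-byte header layout (two 4-bit lengths) is an assumption chosen for illustration.

```rust
/// Hypothetical two-field record used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct AccountSketch {
    pub nonce: u64,
    pub balance: u64,
}

/// Encode a `u64` without its leading zero bytes.
/// Returns the number of bytes kept (0..=8) and the bytes themselves.
pub fn compact_u64(value: u64) -> (u8, Vec<u8>) {
    let bytes = value.to_be_bytes();
    let skip = (value.leading_zeros() / 8) as usize;
    ((8 - skip) as u8, bytes[skip..].to_vec())
}

/// Rebuild a `u64` from `len` big-endian bytes, padding with leading zeros.
pub fn expand_u64(len: u8, bytes: &[u8]) -> Option<u64> {
    let len = len as usize;
    if len > 8 || bytes.len() != len {
        return None;
    }
    let mut buf = [0u8; 8];
    buf[8 - len..].copy_from_slice(bytes);
    Some(u64::from_be_bytes(buf))
}

/// Layout (assumed): one header byte holding both lengths as 4-bit fields,
/// followed by the trimmed nonce bytes and the trimmed balance bytes.
pub fn encode_account(account: &AccountSketch) -> Vec<u8> {
    let (nonce_len, nonce_bytes) = compact_u64(account.nonce);
    let (balance_len, balance_bytes) = compact_u64(account.balance);
    let mut out = vec![(nonce_len << 4) | balance_len];
    out.extend(nonce_bytes);
    out.extend(balance_bytes);
    out
}

pub fn decode_account(buf: &[u8]) -> Option<AccountSketch> {
    let (&header, rest) = buf.split_first()?;
    let nonce_len = (header >> 4) as usize;
    let balance_len = (header & 0x0f) as usize;
    if nonce_len > 8 || balance_len > 8 || rest.len() != nonce_len + balance_len {
        return None;
    }
    let nonce = expand_u64(nonce_len as u8, &rest[..nonce_len])?;
    let balance = expand_u64(balance_len as u8, &rest[nonce_len..])?;
    Some(AccountSketch { nonce, balance })
}
```

With this layout, an account with `nonce = 1` and `balance = 1000` takes 4 bytes instead of the 16 bytes of two fixed-width `u64` values, and an all-zero account takes a single header byte.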
### Table layout
@@ -58,11 +58,11 @@ Transactions {
u64 TxNumber "PK"
TransactionSignedNoHash Data
}
TxHashNumber {
TransactionHashNumbers {
B256 TxHash "PK"
u64 TxNumber
}
TransactionBlock {
TransactionBlocks {
u64 MaxTxNumber "PK"
u64 BlockNumber
}
@@ -83,31 +83,31 @@ PlainStorageState {
B256 StorageKey "PK"
U256 StorageValue
}
AccountHistory {
AccountsHistory {
B256 Account "PK"
BlockNumberList BlockNumberList "List of transitions where account was changed"
}
StorageHistory {
StoragesHistory {
B256 Account "PK"
B256 StorageKey "PK"
BlockNumberList BlockNumberList "List of transitions where account storage entry was changed"
}
AccountChangeSet {
AccountChangeSets {
u64 BlockNumber "PK"
B256 Account "PK"
ChangeSet AccountChangeSet "Account before transition"
ChangeSet AccountChangeSets "Account before transition"
}
StorageChangeSet {
StorageChangeSets {
u64 BlockNumber "PK"
B256 Account "PK"
B256 StorageKey "PK"
ChangeSet StorageChangeSet "Storage entry before transition"
ChangeSet StorageChangeSets "Storage entry before transition"
}
HashedAccount {
HashedAccounts {
B256 HashedAddress "PK"
Account Data
}
HashedStorage {
HashedStorages {
B256 HashedAddress "PK"
B256 HashedStorageKey "PK"
U256 StorageValue
@@ -121,17 +121,17 @@ StoragesTrie {
StoredNibblesSubKey NibblesSubKey "PK"
StorageTrieEntry Node
}
TxSenders {
TransactionSenders {
u64 TxNumber "PK"
Address Sender
}
TxHashNumber ||--|| Transactions : "hash -> tx id"
TransactionBlock ||--|{ Transactions : "tx id -> block number"
TransactionHashNumbers ||--|| Transactions : "hash -> tx id"
TransactionBlocks ||--|{ Transactions : "tx id -> block number"
BlockBodyIndices ||--o{ Transactions : "block number -> tx ids"
Headers ||--o{ AccountChangeSet : "each block has zero or more changesets"
Headers ||--o{ StorageChangeSet : "each block has zero or more changesets"
AccountHistory }|--|{ AccountChangeSet : index
StorageHistory }|--|{ StorageChangeSet : index
Headers ||--o{ AccountChangeSets : "each block has zero or more changesets"
Headers ||--o{ StorageChangeSets : "each block has zero or more changesets"
AccountsHistory }|--|{ AccountChangeSets : index
StoragesHistory }|--|{ StorageChangeSets : index
Headers ||--o| BlockOmmers : "each block has 0 or more ommers"
BlockBodyIndices ||--|| Headers : "index"
HeaderNumbers |o--|| Headers : "block hash -> block number"
@@ -139,8 +139,8 @@ CanonicalHeaders |o--|| Headers : "canonical chain block number -> block hash"
Transactions ||--|| Receipts : "each tx has a receipt"
PlainAccountState }o--o| Bytecodes : "an account can have a bytecode"
PlainAccountState ||--o{ PlainStorageState : "an account has 0 or more storage slots"
Transactions ||--|| TxSenders : "a tx has exactly 1 sender"
Transactions ||--|| TransactionSenders : "a tx has exactly 1 sender"
PlainAccountState ||--|| HashedAccount : "hashed representation"
PlainStorageState ||--|| HashedStorage : "hashed representation"
PlainAccountState ||--|| HashedAccounts : "hashed representation"
PlainStorageState ||--|| HashedStorages : "hashed representation"
```
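
As an illustration of the "tx id -> block number" relation in the diagram, the lookup can be sketched with an ordered map. This assumes that each `TransactionBlocks` entry is keyed by the highest transaction number in its block (the `MaxTxNumber` column) and that a lookup seeks the first key greater than or equal to the transaction number. The function `block_for_tx` and the use of `BTreeMap` are hypothetical and stand in for a database cursor seek.

```rust
use std::collections::BTreeMap;

/// Find the block containing `tx_number`, given a map from the highest
/// transaction number in each block to that block's number.
/// Assumption: keys are `MaxTxNumber` values, so the first key that is
/// greater than or equal to `tx_number` belongs to the containing block.
pub fn block_for_tx(transaction_blocks: &BTreeMap<u64, u64>, tx_number: u64) -> Option<u64> {
    transaction_blocks
        .range(tx_number..)
        .next()
        .map(|(_, block_number)| *block_number)
}
```

Storing one entry per block rather than one per transaction keeps the table small, at the cost of a range seek instead of an exact-match lookup.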