From 5d3041b10d5567b8acc1b0bc1d5a6d8d8e44c267 Mon Sep 17 00:00:00 2001
From: Nicholas Wehr <33910651+wwwehr@users.noreply.github.com>
Date: Thu, 21 Aug 2025 16:24:25 -0700
Subject: [PATCH 1/6] Update README.md

---
 README.md | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index eac81548e..1ffc8c00d 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,19 @@ Building NanoReth from source requires Rust and Cargo to be installed:
 
 ## How to run (mainnet)
 
-1) `$ aws s3 sync s3://hl-mainnet-evm-blocks/ ~/evm-blocks --request-payer requester # one-time` - this will backfill the existing blocks from HyperLiquid's EVM S3 bucket.
+The block files currently comprise millions of small objects totalling over 20 GB and counting. The "requester pays" option means you will need a configured AWS environment, and you could incur charges that vary according to destination (EC2 versus local).
+
+1) this will backfill the existing blocks from HyperLiquid's EVM S3 bucket:
+
+   ```shell
+   aws s3 sync s3://hl-mainnet-evm-blocks/ ~/evm-blocks \
+     --request-payer requester \
+     --exact-timestamps \
+     --size-only \
+     --page-size 1000 \
+     --only-show-errors
+   ```
+   > consider using this [rust based s3 tool wrapper](https://github.com/wwwehr/hl-evm-block-sync) alternative to optimize your download experience
 
 2) `$ make install` - this will install the NanoReth binary.
 
 3) Start NanoReth which will begin syncing using the blocks in `~/evm-blocks`:
 
 ```sh
 $ reth node --http --http.addr 0.0.0.0 --http.api eth,ots,net,web3 \
@@ -65,12 +77,25 @@ Testnet is supported since block 21304281.
+> [!NOTE]
+> To run testnet locally, you will need:
+> - [ ] [git lfs](https://git-lfs.com/)
+> - [ ] [rust toolchain](https://rustup.rs/)
+
 ```sh
 # Get testnet genesis at block 21304281
 $ cd ~
 $ git clone https://github.com/sprites0/hl-testnet-genesis
+$ git lfs pull
 $ zstd --rm -d ~/hl-testnet-genesis/*.zst
+
+# Now return to where you have cloned this project to continue
+$ cd -
+
+# prepare your Rust toolchain
+$ rustup install 1.82 # (this corresponds to the Rust version in our Cargo.toml)
+$ rustup default 1.82
+
 # Init node
 $ make install
 $ reth init-state --without-evm --chain testnet --header ~/hl-testnet-genesis/21304281.rlp \

From 8c6ea1ae7a19e4c50c5293c1e2e2f3c5881c8f4b Mon Sep 17 00:00:00 2001
From: Nicholas Wehr <33910651+wwwehr@users.noreply.github.com>
Date: Tue, 26 Aug 2025 19:01:41 -0700
Subject: [PATCH 2/6] Update README.md

Co-authored-by: sprites0
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1ffc8c00d..13a4aa6f4 100644
--- a/README.md
+++ b/README.md
@@ -86,7 +86,7 @@ Testnet is supported since block 21304281.
 # Get testnet genesis at block 21304281
 $ cd ~
 $ git clone https://github.com/sprites0/hl-testnet-genesis
-$ git lfs pull
+$ git -C hl-testnet-genesis lfs pull
 $ zstd --rm -d ~/hl-testnet-genesis/*.zst
 
 # Now return to where you have cloned this project to continue

From 21e7c718eaa65f998cb7e50123e0b9cfe3563e83 Mon Sep 17 00:00:00 2001
From: Nicholas Wehr <33910651+wwwehr@users.noreply.github.com>
Date: Tue, 26 Aug 2025 19:02:19 -0700
Subject: [PATCH 3/6] Update README.md

Co-authored-by: sprites0
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 13a4aa6f4..09cf61cfe 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ The block files currently comprise millions of small objects totalling over 20 GB and counting. The "requester pays" option means you will need a configured AWS environment, and you could incur charges that vary according to destination (EC2 versus local).
 
-1) this will backfill the existing blocks from HyperLiquid's EVM S3 bucket:
+1) this will backfill the existing blocks from Hyperliquid's EVM S3 bucket:
 
    ```shell
    aws s3 sync s3://hl-mainnet-evm-blocks/ ~/evm-blocks \

From e9dcff401568e0fded54a46bd847d5ad7a38fbc9 Mon Sep 17 00:00:00 2001
From: Nicholas Wehr
Date: Tue, 26 Aug 2025 19:04:18 -0700
Subject: [PATCH 4/6] readme nits

---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index 09cf61cfe..b0ffae45e 100644
--- a/README.md
+++ b/README.md
@@ -25,7 +25,6 @@ The block files currently comprise millions of small objects totalling over 20 GB and counting.
      --request-payer requester \
      --exact-timestamps \
      --size-only \
-     --page-size 1000 \
      --only-show-errors
    ```
    > consider using this [rust based s3 tool wrapper](https://github.com/wwwehr/hl-evm-block-sync) alternative to optimize your download experience

From bf51dc83e59d0163bdbe3757075d69958374a642 Mon Sep 17 00:00:00 2001
From: Nicholas Wehr
Date: Thu, 28 Aug 2025 11:27:31 -0700
Subject: [PATCH 5/6] incorporated s3 sync tool from external github repo

---
 etc/evm-block-sync/README.md        |  57 ++++++++++
 etc/evm-block-sync/s3sync-runner.sh | 158 ++++++++++++++++++++++
 2 files changed, 215 insertions(+)
 create mode 100644 etc/evm-block-sync/README.md
 create mode 100755 etc/evm-block-sync/s3sync-runner.sh

diff --git a/etc/evm-block-sync/README.md b/etc/evm-block-sync/README.md
new file mode 100644
index 000000000..ffb7cecb0
--- /dev/null
+++ b/etc/evm-block-sync/README.md
@@ -0,0 +1,57 @@
+# 🚀 S3Sync Runner
+
+The fastest way to pull down EVM block files from S3.
+
+This script automates syncing **massive S3 object stores** in a **safe, resumable, and time-tracked way**. The traditional `s3 sync` is just way too slow.
+
+## Features
+
+- ✅ Auto-installs [nidor1998/s3sync](https://github.com/nidor1998/s3sync) (latest release) into `~/.local/bin`
+- ✅ Sequential per-prefix syncs (e.g., `21000000/`, `22000000/`, …)
+- ✅ Per-prefix timing: `22000000 took 12 minutes!`
+- ✅ Total runtime summary at the end
+- ✅ Designed for **tiny files at scale** (EVM block archives)
+- ✅ Zero-config bootstrap — just run the script
+
+## Quick Start
+
+```bash
+chmod +x s3sync-runner.sh
+./s3sync-runner.sh
+```
+
+> Skip ahead to the relevant block range:
+```bash
+./s3sync-runner.sh --start-at 30000000
+```
+
+The script will:
+* Install or update s3sync into ~/.local/bin
+* Discover top-level prefixes in your S3 bucket
+* Sync them one at a time, printing elapsed minutes
+
+## Configuration
+
+Edit the top of s3sync-runner.sh if needed:
+```bash
+BUCKET="hl-testnet-evm-blocks"  # could be hl-mainnet-evm-blocks
+REGION="ap-northeast-1"         # hardcoded bucket region
+DEST="$HOME/evm-blocks-testnet" # local target directory (this is what nanoreth will look at)
+WORKERS=512                     # worker threads per sync (more workers need more RAM)
+```
+
+## Example Output
+```bash
+[2025-08-20 20:01:02] START 21000000
+[2025-08-20 20:13:15] 21000000 took 12 minutes!
+[2025-08-20 20:13:15] START 22000000
+[2025-08-20 20:26:40] 22000000 took 13 minutes!
+[2025-08-20 20:26:40] ALL DONE in 25 minutes.
+```
+
+## Hackathon Context
+
+This runner was built as part of the Hyperliquid DEX Hackathon to accelerate:
+* ⛓️ Blockchain archive node ingestion
+* 📂 EVM block dataset replication
+* 🧩 DEX ecosystem data pipelines
diff --git a/etc/evm-block-sync/s3sync-runner.sh b/etc/evm-block-sync/s3sync-runner.sh
new file mode 100755
index 000000000..ae99654b4
--- /dev/null
+++ b/etc/evm-block-sync/s3sync-runner.sh
@@ -0,0 +1,158 @@
+#!/usr/bin/env bash
+# @author Niko Wehr (wwwehr)
+set -euo pipefail
+
+# ---- config ----
+BUCKET="hl-testnet-evm-blocks"
+REGION="ap-northeast-1"
+DEST="${HOME}/evm-blocks-testnet"
+WORKERS=512
+S3SYNC="${HOME}/.local/bin/s3sync"
+START_AT=""   # default: run all
+# ----------------
+
+# parse args
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --start-at)
+      START_AT="$2"
+      shift 2
+      ;;
+    *)
+      echo "Unknown arg: $1" >&2
+      exit 1
+      ;;
+  esac
+done
+
+now(){ date +"%F %T"; }
+log(){ printf '[%s] %s\n' "$(now)" "$*"; }
+die(){ log "ERROR: $*"; exit 1; }
+trap 'log "Signal received, exiting."; exit 2' INT TERM
+
+need(){ command -v "$1" >/dev/null 2>&1 || die "missing dependency: $1"; }
+
+install_s3sync_latest() {
+  need curl
+  GHAPI="https://api.github.com/repos/nidor1998/s3sync/releases/latest"
+
+  os="$(uname | tr '[:upper:]' '[:lower:]')"
+  arch_raw="$(uname -m)"
+  case "$arch_raw" in
+    x86_64|amd64) arch_tag="x86_64" ;;
+    aarch64|arm64) arch_tag="aarch64" ;;
+    *) die "unsupported arch: ${arch_raw}" ;;
+  esac
+
+  # Map OS → asset prefix
+  case "$os" in
+    linux) prefix="s3sync-linux-glibc2.28-${arch_tag}" ;;
+    darwin) prefix="s3sync-macos-${arch_tag}" ;;
+    msys*|mingw*|cygwin*|windows) prefix="s3sync-windows-${arch_tag}" ;;
+    *) die "unsupported OS: ${os}" ;;
+  esac
+
+  # Fetch latest release JSON (unauthenticated)
+  json="$(curl -fsSL "$GHAPI")" || die "failed to query GitHub API"
+
+  # Pick URLs for tarball and checksum
+  tar_url="$(printf '%s' "$json" | awk -F'"' '/browser_download_url/ {print $4}' | grep -F "${prefix}.tar.gz" | head -n1)"
+  sum_url="$(printf '%s' "$json" | awk -F'"' '/browser_download_url/ {print $4}' | grep -F "${prefix}.sha256" | head -n1)"
+  [[ -n "$tar_url" ]] || die "could not find asset for prefix: ${prefix}"
+  [[ -n "$sum_url" ]] || die "could not find checksum for prefix: ${prefix}"
+
+  mkdir -p "${HOME}/.local/bin"
+  tmpdir="$(mktemp -d)"; trap 'rm -rf "$tmpdir"' EXIT
+  tar_path="${tmpdir}/s3sync.tar.gz"
+  sum_path="${tmpdir}/s3sync.sha256"
+
+  log "Downloading: $tar_url"
+  curl -fL --retry 5 --retry-delay 1 -o "$tar_path" "$tar_url"
+  curl -fL --retry 5 --retry-delay 1 -o "$sum_path" "$sum_url"
+
+  # Verify checksum
+  want_sum="$(cut -d: -f2 <<<"$(sed -n 's/^sha256:\(.*\)$/\1/p' "$sum_path" | tr -d '[:space:]')" || true)"
+  [[ -n "$want_sum" ]] || want_sum="$(awk '{print $1}' "$sum_path" || true)"
+  [[ -n "$want_sum" ]] || die "could not parse checksum file"
+  got_sum="$(sha256sum "$tar_path" | awk '{print $1}')"
+  [[ "$want_sum" == "$got_sum" ]] || die "sha256 mismatch: want $want_sum got $got_sum"
+
+  # Extract and install
+  tar -xzf "$tar_path" -C "$tmpdir"
+  binpath="$(find "$tmpdir" -maxdepth 2 -type f -name 's3sync' | head -n1)"
+  [[ -x "$binpath" ]] || die "s3sync binary not found in archive"
+  chmod +x "$binpath"
+  mv -f "$binpath" "$S3SYNC"
+  log "s3sync installed at $S3SYNC"
+}
+
+
+# --- deps & install/update ---
+need aws
+install_s3sync_latest
+[[ ":$PATH:" == *":$HOME/.local/bin:"* ]] || export PATH="$HOME/.local/bin:$PATH"
+mkdir -p "$DEST"
+
+# list prefixes
+log "Listing top-level prefixes in s3://${BUCKET}/"
+mapfile -t PREFIXES < <(
+  aws s3 ls "s3://${BUCKET}/" --region "$REGION" --request-payer requester \
+    | awk '/^ *PRE /{print $2}' | sed 's:/$::' | grep -E '^[0-9]+$' || true
+)
+((${#PREFIXES[@]})) || die "No prefixes found."
+
+# mark initial status
+declare -A RESULTS
+if [[ -z "$START_AT" ]]; then
+  skipping=0
+else
+  skipping=1
+fi
+for p in "${PREFIXES[@]}"; do
+  if [[ -n "$START_AT" && "$p" == "$START_AT" ]]; then
+    skipping=0
+  fi
+  if (( skipping )); then
+    RESULTS["$p"]="-- SKIPPED"
+  else
+    RESULTS["$p"]="-- TODO"
+  fi
+done
+
+total_start=$(date +%s)
+
+for p in "${PREFIXES[@]}"; do
+  if [[ "${RESULTS[$p]}" == "-- SKIPPED" ]]; then
+    continue
+  fi
+  src="s3://${BUCKET}/${p}/"
+  dst="${DEST}/${p}/"
+  mkdir -p "$dst"
+
+  log "START ${p}"
+  start=$(date +%s)
+
+  "$S3SYNC" \
+    --source-request-payer \
+    --source-region "$REGION" \
+    --worker-size "$WORKERS" \
+    --max-parallel-uploads "$WORKERS" \
+    "$src" "$dst"
+
+  end=$(date +%s)
+  mins=$(( (end - start + 59) / 60 ))
+  RESULTS["$p"]="$mins minutes"
+
+  # Print status table so far
+  echo "---- Status ----"
+  for k in "${PREFIXES[@]}"; do
+    echo "$k ${RESULTS[$k]}"
+  done
+  echo "----------------"
+done
+
+total_end=$(date +%s)
+total_mins=$(( (total_end - total_start + 59) / 60 ))
+
+echo "ALL DONE in $total_mins minutes."
+
From fa9f8fc5df042b4e06084b728a3c3aea75552578 Mon Sep 17 00:00:00 2001
From: Nicholas Wehr <33910651+wwwehr@users.noreply.github.com>
Date: Thu, 28 Aug 2025 11:34:18 -0700
Subject: [PATCH 6/6] Updated top level README.md

updated instructions and prefer the s3sync-runner tool

---
 README.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index b0ffae45e..3eaa99ff1 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,13 @@ The block files currently comprise millions of small objects totalling over 20 GB and counting.
 
 1) this will backfill the existing blocks from Hyperliquid's EVM S3 bucket:
 
+   > use our Rust-based S3 tool wrapper to optimize your download experience - [read more](./etc/evm-block-sync/README.md)
+   ```shell
+   chmod +x ./etc/evm-block-sync/s3sync-runner.sh
+   ./etc/evm-block-sync/s3sync-runner.sh
+   ```
+
+   > or use the conventional [aws cli](https://aws.amazon.com/cli/)
    ```shell
    aws s3 sync s3://hl-mainnet-evm-blocks/ ~/evm-blocks \
      --request-payer requester \
@@ -27,17 +34,17 @@ The block files currently comprise millions of small objects totalling over 20 GB and counting.
      --size-only \
      --only-show-errors
    ```
-   > consider using this [rust based s3 tool wrapper](https://github.com/wwwehr/hl-evm-block-sync) alternative to optimize your download experience
 
-2) `$ make install` - this will install the NanoReth binary.
-3) Start NanoReth which will begin syncing using the blocks in `~/evm-blocks`:
+1) `$ make install` - this will install the NanoReth binary.
+
+2) Start NanoReth which will begin syncing using the blocks in `~/evm-blocks`:
 
 ```sh
 $ reth node --http --http.addr 0.0.0.0 --http.api eth,ots,net,web3 --ws --ws.addr 0.0.0.0 --ws.origins '*' --ws.api eth,ots,net,web3 --ingest-dir ~/evm-blocks --ws.port 8545
 ```
 
-4) Once the node logs stops making progress this means it's caught up with the existing blocks.
+3) Once the node logs stop making progress, it's caught up with the existing blocks.
Stop the NanoReth process and then start Goofys: `$ goofys --region=ap-northeast-1 --requester-pays hl-mainnet-evm-blocks evm-blocks`
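---

A note for reviewers of the s3sync-runner patch: the `--start-at` resume rule is easy to misread inside the larger script, so here it is isolated as a small sketch. The helper name `mark_prefixes` is hypothetical (it is not part of `s3sync-runner.sh`); the loop body mirrors the script's skip/TODO marking.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating s3sync-runner.sh's --start-at rule:
# prefixes listed before the match are SKIPPED; the matching prefix and
# everything after it are queued as TODO. An empty start value queues all.
mark_prefixes() {
  local start_at=$1; shift
  local skipping=0 p
  [[ -n "$start_at" ]] && skipping=1
  for p in "$@"; do
    # Once the requested prefix is reached, stop skipping for good.
    [[ -n "$start_at" && "$p" == "$start_at" ]] && skipping=0
    if (( skipping )); then
      echo "$p -- SKIPPED"
    else
      echo "$p -- TODO"
    fi
  done
}

# 21000000 and 22000000 are marked SKIPPED; 30000000 and 31000000 are TODO.
mark_prefixes 30000000 21000000 22000000 30000000 31000000
```

Because the real script marks statuses up front and only syncs `TODO` prefixes, re-running with `--start-at` after an interruption resumes at a prefix boundary rather than re-walking every object.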