Analysis date: 2026-05-10 · Chain height: 4,216,395 · Block cache size: ~225 MB · Growth rate: ~28 MB/year
This document records the root cause of Nerva’s sync slowdown, the full set of mitigations considered (including why several intuitive ones are dead ends), and the actionable options ranked by effort and impact. The problem is hardware-independent — at 4.2M blocks the block cache exceeds any consumer or server CPU’s L3 cache. Every new node syncing from scratch is affected regardless of hardware.
1 — The Problem
The slowdown is structural and hardware-independent at current chain height.
The block cache is ~225 MB today. No consumer CPU has an L3 cache large enough to hold it. Every node syncing from scratch spends the vast majority of its sync time in the slow zone — on a typical laptop this means the slow zone starts as early as block 143,000, leaving 4+ million blocks to process at ~12 ms/block. Observed full sync time: under 1 day on high-end hardware (64 MB L3); ~1 week on slower machines. The degradation is monotonic — it only gets worse as the chain grows (~28 MB/year).
What happens during sync
For every block being verified, Nerva’s CNA PoW function get_cna_v5_data performs 4,096 random reads from the block cache. The reads are spread uniformly across all blocks from 0 to height−256, seeded by the block’s nonce via HC128.
The block cache is a flat in-memory array with one 56-byte entry per block. Cache size formula:
height × 56 bytes
At current chain height:
4,216,395 × 56 = ~225 MB, larger than every mainstream CPU’s L3 cache.
When the cache fits in L3, all 4,096 reads are fast (~30 ns each). Once it overflows, reads go to RAM (~80 ns each). With 4,096 reads per block, the difference is significant and compounds across millions of blocks. The daemon’s other working set (LMDB page cache, heap, code) consumes ~3–5 MB of L3 on top of the block cache, so the effective overflow threshold is slightly below the raw L3 size.
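The read pattern described above can be sketched as follows. This is an illustration only, not Nerva's implementation: `std::mt19937_64` stands in for HC128, the way the nonce seeds the generator is assumed, and `cna_read_indices` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Number of block-cache reads per PoW evaluation, as stated in the text.
constexpr std::size_t kReadsPerBlock = 4096;
// The most recent blocks are excluded from the read range, as stated in the text.
constexpr std::uint64_t kTipExclusion = 256;

// Returns the cache indices that one PoW evaluation would touch.
// ASSUMPTION: std::mt19937_64 is a stand-in for HC128, and seeding it
// directly with the nonce is illustrative only.
std::vector<std::uint64_t> cna_read_indices(std::uint64_t height, std::uint32_t nonce)
{
    assert(height >= kTipExclusion);
    std::mt19937_64 rng(nonce);
    std::uniform_int_distribution<std::uint64_t> pick(0, height - kTipExclusion);

    std::vector<std::uint64_t> indices;
    indices.reserve(kReadsPerBlock);
    for (std::size_t i = 0; i < kReadsPerBlock; ++i)
        indices.push_back(pick(rng));  // each index selects one 56-byte entry
    return indices;
}
```

Because the indices are drawn across the whole array, consecutive reads have essentially no locality, which is why the full cache has to fit in L3 for the reads to stay fast.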
Overflow height by CPU — at current chain tip every row is already in the slow zone
| CPU L3 cache | Overflow height | Fast blocks | Slow blocks (of 4,216,395) | Slow zone % | Typical hardware |
|---|---|---|---|---|---|
| 8 MB | ~143,000 | 143k | 4,073,395 | 97% | Budget / older laptop |
| 12 MB | ~214,000 | 214k | 4,002,395 | 95% | Mid-range laptop |
| 16 MB | ~286,000 | 286k | 3,930,395 | 93% | Gaming laptop / low desktop |
| 32 MB | ~571,000 | 571k | 3,645,395 | 86% | Mid-range desktop (Ryzen/Intel) |
| 64 MB | ~1,143,000 | 1,143k | 3,073,395 | 73% | High-end desktop (Ryzen 5000+) |
| 96 MB | ~1,714,000 | 1,714k | 2,502,395 | 59% | EPYC / Threadripper server |
| 128 MB | ~2,286,000 | 2,286k | 1,930,395 | 46% | High-end server |
| 225 MB+ | ~4,216,000 | 4,216k | ~0 | 0% | Does not exist in consumer hardware |
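The overflow heights in the table follow directly from the 56-byte entry size. A minimal sketch of that arithmetic, assuming decimal megabytes (which matches the table's rounded heights) and ignoring the daemon's other ~3–5 MB working set; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kEntryBytes = 56;  // per-block cache entry size, from the text

// Raw height at which the block cache outgrows an L3 cache of the given size.
// ASSUMPTION: 1 MB = 1,000,000 bytes; other L3 consumers are ignored.
std::uint64_t overflow_height(std::uint64_t l3_megabytes)
{
    assert(l3_megabytes > 0);
    return l3_megabytes * 1000000ULL / kEntryBytes;
}

// Number of blocks verified after the cache no longer fits in L3.
std::uint64_t slow_blocks(std::uint64_t chain_height, std::uint64_t l3_megabytes)
{
    const std::uint64_t fast = overflow_height(l3_megabytes);
    return chain_height > fast ? chain_height - fast : 0;
}
```

For example, `overflow_height(8)` evaluates to 142,857, which the table rounds to ~143,000.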
Measured performance (confirmed via diag3 logging on 64 MB L3 CPU):
Fast zone (cache in L3): ~3.2 ms/block · Slow zone (cache in RAM): ~12 ms/block · Ratio: ~3.7× slower
Estimated PoW cost of a full sync: 4,216,395 blocks × 12 ms/block ≈ 14 hours of CPU time at 100% utilisation. Real-world sync is further slowed by LMDB I/O, block download, and signature verification, which pushes total sync time to just under a day on high-end hardware and ~1 week on slower machines.
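The estimate above is simple arithmetic and can be reproduced with a short sketch. The split-zone variant assumes the per-block cost is constant within each zone, which is a simplification; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hours needed to verify `blocks` blocks at a constant `ms_per_block`.
double pow_hours(std::uint64_t blocks, double ms_per_block)
{
    assert(ms_per_block >= 0.0);
    return static_cast<double>(blocks) * ms_per_block / 3600000.0;  // ms -> hours
}

// Splits the chain at the L3 overflow height and sums both zones.
// ASSUMPTION: per-block cost is constant within each zone; in practice the
// transition is probably gradual rather than a step.
double split_zone_hours(std::uint64_t chain_height, std::uint64_t overflow_height,
                        double fast_ms, double slow_ms)
{
    const std::uint64_t fast_blocks =
        overflow_height < chain_height ? overflow_height : chain_height;
    return pow_hours(fast_blocks, fast_ms) +
           pow_hours(chain_height - fast_blocks, slow_ms);
}
```

With the measured figures, `pow_hours(4216395, 12.0)` gives about 14 hours, and `split_zone_hours(4216395, 1143000, 3.2, 12.0)` gives about 11 hours for the 64 MB L3 case. Both cover PoW verification only.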
2 — What Was Already Fixed (branch: fix/slow-daemon-sync)
These commits improved sync performance across the board but do not eliminate the L3 overflow for historical sync.
They reduce per-block overhead around the PoW call, not the PoW call itself.
| Commit | What it fixed | Blocks affected |
|---|---|---|
| a15f996 | Reuse batch write transaction in build_block_cache; fixed sync failure at HF7 (block 173,500) | All |
| f993407 | Use block cache for CNA v2 hash lookups; avoided 5 extra random LMDB reads per block | Pre-HF5 |
| 87df789 | Skip short-term weight limit recomputation for mid-batch blocks during sync | All |
| 701e9ec | Replace hash-based cache validation with height comparison in long-term weight median | All |
| 4dc45b6 | Eliminate redundant long-term block weight median computation post-HF12 | Post-HF12 |
3 — The Sliding Window Idea
What it is
Instead of get_cna_v5_data reading uniformly from [0, height-256], a tiered design would send 95% of reads to the last 100,000 blocks (the “recent window”) and only 5% to full history. The recent 100k blocks occupy 100,000 × 56 = 5.6 MB, comfortably inside any modern CPU’s L3. Mining and keeping up with the tip become fast regardless of chain height. This was proposed as part of HF13.
The math
| Window size N | Hot cache size | Pool-breakable? |
|---|---|---|
| 10,000 blocks | 560 KB | Yes — trivial broadcast |
| 100,000 blocks | 5.6 MB | Probably yes (delta broadcast) |
| 500,000 blocks | 28 MB | Borderline |
| Full history | 225 MB+ growing | No — impractical to broadcast |
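A rough sketch of how a tiered index selection could look, together with the hot-cache sizes from the table. The source does not specify where the recent window ends or how the 95/5 split is drawn, so both are assumptions here, and `tiered_index` and `hot_cache_bytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

constexpr std::uint64_t kEntryBytes = 56;     // per-block cache entry size, from the text
constexpr std::uint64_t kTipExclusion = 256;  // newest blocks excluded from reads

// Picks one cache index under the tiered scheme.
// ASSUMPTIONS: the recent window is the last `window` eligible entries ending
// at height - 256, and the 95/5 split is decided independently per read.
template <class Rng>
std::uint64_t tiered_index(Rng& rng, std::uint64_t height, std::uint64_t window = 100000)
{
    assert(height >= kTipExclusion);
    const std::uint64_t last = height - kTipExclusion;  // newest eligible entry
    const std::uint64_t first_recent = last >= window ? last - window + 1 : 0;

    std::bernoulli_distribution recent(0.95);
    if (recent(rng))
        return std::uniform_int_distribution<std::uint64_t>(first_recent, last)(rng);
    return std::uniform_int_distribution<std::uint64_t>(0, last)(rng);
}

// Bytes of cache that must stay hot for a window of `blocks` entries.
constexpr std::uint64_t hot_cache_bytes(std::uint64_t blocks)
{
    return blocks * kEntryBytes;
}
```

`hot_cache_bytes(100000)` is 5,600,000 bytes, the 5.6 MB figure in the table. Under this scheme roughly 95% of reads stay inside that window, and the remaining 5% still reach into full history.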
4 — Why the Sliding Window Cannot Fix Historical Sync
The sliding window is a PoW algorithm change. It changes which blocks get_cna_v5_data reads, which changes the hash output. Every block was mined by finding a nonce such that:
hash(block_blob + old_read_pattern_data) < difficulty_target
Applying the sliding window read pattern to an old block produces a different hash for the same nonce, one the miner never solved for. It would fail the difficulty check. PoW verification must reproduce the exact hash that was valid when the block was mined. This cannot be overridden by any database migration.
| Block range | Algorithm used during sync | L3 behavior | Sliding window helps? |
|---|---|---|---|
| 0 → HF13 height | Old: uniform [0, height-256] | Cache is 225 MB — entirely in RAM on all consumer hardware | No — hash would be wrong |
| HF13 height → tip | New: 95% last 100k blocks | Only ~5.6 MB hot | Yes — full fix |
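The table amounts to a height-based dispatch: the verifier has to pick the read pattern from the block's height, not from anything stored alongside the block. A minimal sketch, where `hf13_height` is a placeholder parameter (the source does not state the activation height) and treating the activation block itself as the first tiered block is an assumption:

```cpp
#include <cassert>
#include <cstdint>

enum class ReadPattern { UniformFullHistory, TieredRecentWindow };

// Chooses the read pattern a verifier must use for a given block.
// ASSUMPTION: `hf13_height` is hypothetical, and the block at that height is
// taken to be the first one mined under the tiered pattern.
ReadPattern read_pattern_for(std::uint64_t block_height, std::uint64_t hf13_height)
{
    assert(hf13_height > 0);
    return block_height >= hf13_height ? ReadPattern::TieredRecentWindow
                                       : ReadPattern::UniformFullHistory;
}
```

Every block below the activation height maps to the uniform pattern, so verifying it still touches the whole cache.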
5 — Dead Ends Considered
Dead end 1 — Re-mine all historical blocks under the new algorithm
Mechanically possible (same transactions, new nonces, same difficulty targets). But every block hash changes. Old nodes reject the new chain as invalid. All existing software, explorers, and users must switch simultaneously. This is a new coin that happens to share transaction history with Nerva. Not viable for a live network.
Dead end 2 — Keep both chains temporarily; switch at HF13
The re-mined chain has different block hashes than the original chain. They cannot converge at HF13 — a node can only be on one chain at a time. Old nodes reject re-mined blocks; new nodes (on re-mined chain) reject old blocks. There is no HF mechanism that makes two chains with different block histories merge. This is still just launching a new coin.
Dead end 3 — Database migration that stores a “new-algorithm” form of old blocks
PoW is verified by recomputing the hash and checking it against difficulty. There is no database representation that changes which algorithm is applied — the algorithm is determined by the block height and the active hard fork rules, not by what is stored. Storing anything other than the original block data would be ignored by the verification logic.
6 — Viable Solutions
Option A — Hardcoded assume-valid height ~10 lines of code NO FORK
Add a constant ASSUME_VALID_HEIGHT to cryptonote_config.h. During sync, any block below this height skips get_block_longhash entirely, so it performs zero random reads. Everything else (transactions, amounts, double-spends) is still verified in full. The constant is updated with each release. Bitcoin Core has shipped a comparable setting, assumevalid, since 2017; there it skips script verification rather than PoW.
Impact: eliminates 100% of the L3 overflow for all historical blocks on a first-time sync.
Trust: user trusts the Nerva binary (same trust model as running the daemon at all).
Risk: a malicious release could include invalid historical transactions — mitigated by open source and community verification.
// cryptonote_config.h
#define ASSUME_VALID_HEIGHT 4200000 // update each release
// blockchain.cpp, before get_block_longhash
const bool quicksync_verified = m_quicksync.check_block(blockchain_height, id);
const bool assume_valid = (blockchain_height < ASSUME_VALID_HEIGHT);
// Compute and check the PoW hash only when neither shortcut applies.
if (!quicksync_verified && !assume_valid)
{
    get_block_longhash(...);
    if (!check_hash(proof_of_work, current_diffic)) { ... }
}
Option B — Precomputed PoW hash file ~50 lines + generator tool NO FORK
Generate a file storing the 32-byte proof_of_work result for every block (computed once, centrally). During sync, load the stored hash and call check_hash(stored, difficulty) directly, skipping get_block_longhash but still verifying the hash against the difficulty target.
File size: 4,216,395 × 32 bytes ≈ 128 MB (grows ~46 KB/day)
Difference from QuickSync: QuickSync skips PoW entirely (trusts the hash is valid). This file still checks the stored hash against the difficulty target; it only avoids recomputing the hash, so the node trusts that the stored value is the block’s true PoW hash. Lower trust requirement.
Difference from assume-valid: This checks the hash against difficulty; assume-valid skips the check entirely.
Impact: eliminates the L3 overflow for first-time sync. File must be regenerated and distributed regularly.
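The lookup side of Option B could look like the sketch below. The source does not define a file format, so a flat table of 32-byte records indexed by height, with no header and loaded fully into memory, is assumed. `stored_pow` and `pow_file_bytes` are hypothetical names, and the real difficulty check (`check_hash`) is not reimplemented here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kPowHashBytes = 32;  // size of one stored PoW hash, from the text
using PowHash = std::array<std::uint8_t, kPowHashBytes>;

// Copies the stored PoW hash for `height` into `out`.
// ASSUMPTION: record i occupies bytes [i * 32, i * 32 + 32) of the table.
// Returns false when the table does not cover that height, in which case the
// caller would fall back to computing the hash normally.
bool stored_pow(const std::vector<std::uint8_t>& table, std::uint64_t height, PowHash& out)
{
    assert(table.size() % kPowHashBytes == 0);
    if (height >= table.size() / kPowHashBytes)
        return false;
    const auto offset = static_cast<std::ptrdiff_t>(height * kPowHashBytes);
    std::copy_n(table.begin() + offset, kPowHashBytes, out.begin());
    return true;
}

// Size of the table for a chain of the given height.
constexpr std::uint64_t pow_file_bytes(std::uint64_t chain_height)
{
    return chain_height * kPowHashBytes;
}
```

`pow_file_bytes(4216395)` is 134,924,640 bytes, which is the ≈128 MB figure above when expressed in binary megabytes. A lookup is a single sequential 32-byte copy, in contrast to the 4,096 scattered reads of a full hash computation.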
Option C — Keep QuickSync file current 0 code changes NO FORK
The nerva-quicksync-export tool already exists. Generate a fresh quicksync.raw covering blocks 0 to (tip − 100) and host it for download. When loaded, m_quicksync.check_block() returns true and get_block_longhash is skipped entirely. Infrastructure already complete: this is a distribution/ops problem, not a code problem.
Impact: eliminates 100% of the L3 overflow. Requires user to download the file separately.
Trust: highest trust requirement; PoW is completely skipped and the user trusts the distributor’s file.
Gap: needs auto-download capability and a regular update cadence.
Option D — Warp sync / state snapshot Weeks of engineering NO FORK
Export a trusted LMDB state snapshot at a recent height. New nodes download the snapshot and skip all historical blocks — syncing only from the snapshot height to the current tip. Monero has a related feature (bootstrap daemon).
Impact: strongest possible — new nodes sync in minutes regardless of chain length.
Trust: node trusts the snapshot (UTXO set, key images, etc.).
Effort: requires exporting full chain state (not just block hashes), snapshot format, import logic, and verification.
Option E — HF13 sliding window (post-fork only) Planned HF13
The tiered algorithm (95% recent / 5% full) activates at HF13. All blocks mined after HF13 only need the recent 100k blocks warm in L3 (~5.6 MB). Mining and keeping-up-with-tip become fast indefinitely.
Impact on first-time sync: none for pre-HF13 blocks. Full fix for post-HF13 blocks.
Impact on mining: full fix — miners never need more than 5.6 MB in L3 after fork.
Note: every sync of a new node still has to grind through all pre-HF13 blocks the slow way, unless combined with one of Options A–D.
7 — Comparison Summary
| Option | Fixes first-time historical sync? | Fixes mining/tip? | Code effort | Trust model | Hard fork? |
|---|---|---|---|---|---|
| A — Assume-valid height | Yes — 100% | No (PoW still runs for new blocks) | ~10 lines | Binary (same as running the daemon) | No |
| B — Precomputed PoW file | Yes — 100% | No | ~50 lines + generator | File distributor; still checks difficulty | No |
| C — Updated QuickSync file | Yes — 100% | No | 0 (infra only) | File distributor; skips PoW check entirely | No |
| D — Warp sync / snapshot | Yes — skips entirely | No | Weeks | Snapshot distributor (full UTXO state) | No |
| E — HF13 sliding window | No (pre-HF13 blocks unchanged) | Yes — permanently | Moderate | Consensus (network vote) | Yes — HF13 |
| A + E combined | Yes | Yes | Moderate | Binary + consensus | Yes — HF13 |
Recommended path: Implement Option A (assume-valid height) now — ~10 lines, ships before HF13,
fixes first-time sync immediately for all users. Bundle with HF13 sliding window (Option E) for a complete,
permanent solution: historical sync is fast via assume-valid, and post-HF13 sync is fast via the tiered algorithm.
QuickSync (Option C) should also be kept current as a parallel distribution channel.