@7layermagik
Summary

Integrates the solana-snapshot-finder-go library to provide intelligent snapshot source discovery and selection. This replaces the previous snapshot download logic with a two-stage speed testing algorithm that finds the fastest available snapshot sources.

Key Features

Intelligent Node Discovery

  • Two-stage algorithm: fast parallel triage followed by sustained speed testing
  • Filters nodes by Solana version, RTT, TCP connectivity, and snapshot age
  • Ranks nodes by actual download speed, not just network latency
  • Configurable fallback to try multiple ranked nodes if the best one fails
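The two-stage flow above can be sketched roughly as follows. This is a minimal illustration, not the solana-snapshot-finder-go implementation: the `Node` type, its fields, and `rankNodes` are hypothetical names, and the Stage 2 measurement is passed in as a function so the sketch stays self-contained.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical candidate source; field names are illustrative only.
type Node struct {
	Addr      string
	RTTMs     int     // round-trip time in milliseconds
	TriageMBs float64 // quick Stage 1 speed estimate
}

// rankNodes filters by RTT, keeps the topK fastest Stage 1 results, then
// orders those survivors by a sustained Stage 2 measurement (fastest first).
func rankNodes(nodes []Node, maxRTTMs, topK int, sustained func(Node) float64) []Node {
	if topK < 0 {
		topK = 0
	}
	var cand []Node
	for _, n := range nodes {
		if n.RTTMs <= maxRTTMs {
			cand = append(cand, n)
		}
	}
	// Stage 1: cheap triage, keep only the most promising candidates.
	sort.SliceStable(cand, func(i, j int) bool { return cand[i].TriageMBs > cand[j].TriageMBs })
	if len(cand) > topK {
		cand = cand[:topK]
	}
	// Stage 2: measure each survivor once, then rank by sustained speed.
	speed := make(map[string]float64, len(cand))
	for _, n := range cand {
		speed[n.Addr] = sustained(n)
	}
	sort.SliceStable(cand, func(i, j int) bool { return speed[cand[i].Addr] > speed[cand[j].Addr] })
	return cand
}

func main() {
	nodes := []Node{{"a", 50, 10}, {"b", 300, 100}, {"c", 80, 20}, {"d", 90, 5}}
	stage2 := map[string]float64{"a": 30, "c": 25, "d": 100}
	ranked := rankNodes(nodes, 200, 2, func(n Node) float64 { return stage2[n.Addr] })
	for _, n := range ranked {
		fmt.Println(n.Addr)
	}
}
```

The returned slice doubles as the fallback order: if the first node fails, the caller can move on to the next one.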

HTTP Streaming Mode

  • Stream snapshots directly from HTTP to the processing pipeline
  • Optional disk saving while streaming using io.TeeReader (the snapshot is processed and saved in a single pass)
  • When save_to_disk=false (default): no disk space required for snapshot files
  • When save_to_disk=true: saves snapshots to download_path while streaming
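As a rough illustration of the streaming mode, the sketch below wraps the source in `io.TeeReader` only when a save target is supplied. `streamSnapshot` and `demo` are hypothetical helpers, and an in-memory buffer stands in for both the HTTP body and the file on disk.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// streamSnapshot hands src to process. When save is non-nil, every byte that
// process reads is also written to save through io.TeeReader.
func streamSnapshot(src io.Reader, save io.Writer, process func(io.Reader) error) error {
	r := src
	if save != nil {
		r = io.TeeReader(src, save)
	}
	return process(r)
}

// demo runs streamSnapshot over an in-memory payload and reports what the
// processor consumed and what ended up in the save target.
func demo(payload string, saveToDisk bool) (processed, saved string, err error) {
	var disk bytes.Buffer
	var save io.Writer // stays a nil interface when saving is disabled
	if saveToDisk {
		save = &disk
	}
	err = streamSnapshot(strings.NewReader(payload), save, func(r io.Reader) error {
		b, readErr := io.ReadAll(r)
		processed = string(b)
		return readErr
	})
	return processed, disk.String(), err
}

func main() {
	processed, saved, err := demo("snapshot-bytes", true)
	fmt.Println(processed, saved, err)
}
```

Note that `io.TeeReader` writes only what the consumer actually reads, so a processor that stops early leaves a partial copy behind.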

Smart Incremental Snapshot Selection

  • Prioritizes freshness (end slot) over pure speed for incrementals
  • Tries the same source as the full snapshot first (fastest path)
  • Falls back to cluster-wide search if the primary source doesn't have a matching incremental
  • Minimum speed threshold (min_incremental_speed_mbs) to filter out slow nodes
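One way to read the selection rules above is sketched here. The `IncrementalSource` type and `pickIncremental` are hypothetical, and two details are assumptions: the primary source is taken as-is when it offers an incremental, and speed is used only as a tie-breaker between equally fresh sources.

```go
package main

import (
	"fmt"
	"sort"
)

// IncrementalSource is a hypothetical description of one node's incremental.
type IncrementalSource struct {
	Addr     string
	EndSlot  uint64
	SpeedMBs float64
}

// pickIncremental prefers the primary (full-snapshot) source when it offers an
// incremental. Otherwise it drops sources below minSpeedMBs and picks the
// freshest remaining one, using speed only to break ties.
func pickIncremental(srcs []IncrementalSource, primary string, minSpeedMBs float64) (IncrementalSource, bool) {
	for _, s := range srcs {
		if s.Addr == primary {
			return s, true
		}
	}
	var fast []IncrementalSource
	for _, s := range srcs {
		if s.SpeedMBs >= minSpeedMBs {
			fast = append(fast, s)
		}
	}
	if len(fast) == 0 {
		return IncrementalSource{}, false
	}
	sort.SliceStable(fast, func(i, j int) bool {
		if fast[i].EndSlot != fast[j].EndSlot {
			return fast[i].EndSlot > fast[j].EndSlot
		}
		return fast[i].SpeedMBs > fast[j].SpeedMBs
	})
	return fast[0], true
}

func main() {
	srcs := []IncrementalSource{{"x", 100, 50}, {"y", 120, 1}, {"z", 110, 20}}
	best, ok := pickIncremental(srcs, "not-listed", 5)
	fmt.Println(best.Addr, ok)
}
```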

Retry and Cleanup Logic

  • Automatic retry with re-discovery when incremental download fails mid-way
  • Partial download cleanup on Ctrl+C or errors (no orphaned files)
  • Up to 3 retry attempts with fresh source discovery between attempts
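The retry behaviour could look roughly like the loop below. This is a sketch under assumptions, not the code in pkg/snapshot: `downloadWithRetry`, its callbacks, and `errSimulated` are hypothetical, and signal handling for Ctrl+C is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a mid-download failure in the example.
var errSimulated = errors.New("simulated mid-download failure")

// downloadWithRetry runs up to maxAttempts attempts. Each attempt discovers a
// fresh source first; a failed download triggers cleanup of partial files
// before the next attempt.
func downloadWithRetry(maxAttempts int, discover func() (string, error), download func(source string) error, cleanup func()) error {
	if maxAttempts < 1 {
		return errors.New("maxAttempts must be at least 1")
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		source, err := discover()
		if err != nil {
			lastErr = fmt.Errorf("attempt %d: discovery: %w", attempt, err)
			continue
		}
		if err := download(source); err != nil {
			cleanup() // remove the partial download so nothing is orphaned
			lastErr = fmt.Errorf("attempt %d: download from %s: %w", attempt, source, err)
			continue
		}
		return nil
	}
	return lastErr
}

func main() {
	calls := 0
	err := downloadWithRetry(3,
		func() (string, error) { return "node.example:8899", nil },
		func(string) error {
			calls++
			if calls < 3 {
				return errSimulated
			}
			return nil
		},
		func() { fmt.Println("cleaning up partial download") },
	)
	fmt.Println("result:", err)
}
```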

Configuration

New [snapshot] section in mithril.toml with comprehensive options:

[snapshot]
    # Save snapshots to disk while streaming
    save_to_disk = false
    download_path = "/data/snapshots"
    
    # Stage 1: Fast parallel triage
    stage1_warm_kib = 512
    stage1_window_kib = 512
    stage1_windows = 4
    stage1_timeout_ms = 3000
    
    # Stage 2: Sustained speed test
    stage2_top_k = 8
    stage2_warm_sec = 2
    stage2_measure_sec = 2
    stage2_min_ratio = 0.6
    
    # Node filtering
    max_rtt_ms = 200
    min_node_version = "2.2.0"
    
    # Fallback resilience
    max_snapshot_url_attempts = 3

See docs/snapshot.md for complete documentation.
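For readers who want to see how these keys might map onto Go, here is a hypothetical struct that mirrors them. `SnapshotSettings`, `exampleSettings`, and the checks in `Validate` are assumptions made for illustration; the actual SnapshotConfig struct in pkg/config/config.go may differ, and the `toml` tags are plain struct tags that no library reads in this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// SnapshotSettings is a hypothetical mirror of the [snapshot] keys.
type SnapshotSettings struct {
	SaveToDisk             bool    `toml:"save_to_disk"`
	DownloadPath           string  `toml:"download_path"`
	Stage1WarmKiB          int     `toml:"stage1_warm_kib"`
	Stage1WindowKiB        int     `toml:"stage1_window_kib"`
	Stage1Windows          int     `toml:"stage1_windows"`
	Stage1TimeoutMs        int     `toml:"stage1_timeout_ms"`
	Stage2TopK             int     `toml:"stage2_top_k"`
	Stage2WarmSec          int     `toml:"stage2_warm_sec"`
	Stage2MeasureSec       int     `toml:"stage2_measure_sec"`
	Stage2MinRatio         float64 `toml:"stage2_min_ratio"`
	MaxRTTMs               int     `toml:"max_rtt_ms"`
	MinNodeVersion         string  `toml:"min_node_version"`
	MaxSnapshotURLAttempts int     `toml:"max_snapshot_url_attempts"`
}

// exampleSettings returns the values from the example block in this description.
func exampleSettings() SnapshotSettings {
	return SnapshotSettings{
		SaveToDisk: false, DownloadPath: "/data/snapshots",
		Stage1WarmKiB: 512, Stage1WindowKiB: 512, Stage1Windows: 4, Stage1TimeoutMs: 3000,
		Stage2TopK: 8, Stage2WarmSec: 2, Stage2MeasureSec: 2, Stage2MinRatio: 0.6,
		MaxRTTMs: 200, MinNodeVersion: "2.2.0", MaxSnapshotURLAttempts: 3,
	}
}

// Validate applies a few sanity checks. These rules are assumptions, not
// checks taken from the project.
func (s SnapshotSettings) Validate() error {
	if s.SaveToDisk && s.DownloadPath == "" {
		return errors.New("download_path is required when save_to_disk is true")
	}
	if s.Stage2TopK < 1 {
		return errors.New("stage2_top_k must be at least 1")
	}
	if s.MaxSnapshotURLAttempts < 1 {
		return errors.New("max_snapshot_url_attempts must be at least 1")
	}
	return nil
}

func main() {
	fmt.Println(exampleSettings().Validate())
}
```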

Dependencies

Requires the updated solana-snapshot-finder-go library with library mode support (PR #6).

Test plan

  • Run verify-live with default config (HTTP streaming, no disk save)
  • Run with save_to_disk = true to verify parallel save works
  • Test Ctrl+C during download to verify cleanup
  • Verify Stage 1 and Stage 2 logging output
  • Test incremental retry by simulating a mid-download failure

7layermagik and others added 3 commits December 25, 2025 22:53
…urce discovery

Integrates the solana-snapshot-finder-go library to provide intelligent
snapshot source discovery and selection using a two-stage speed testing
algorithm. Adds HTTP streaming support with optional disk saving.

## Features

### Intelligent Node Discovery
- Two-stage algorithm: fast parallel triage + sustained speed testing
- Filters nodes by version, RTT, TCP connectivity, and snapshot age
- Ranks nodes by actual download speed, not just latency

### HTTP Streaming Mode
- Stream snapshots directly from HTTP to processing pipeline
- Optional disk saving while streaming (io.TeeReader)
- No disk space required when save_to_disk=false (default)

### Incremental Snapshot Selection
- Prioritizes freshness (end slot) over pure speed
- Tries same source as full snapshot first
- Falls back to cluster-wide search if needed
- Minimum speed threshold to filter slow nodes

### Retry and Cleanup Logic
- Automatic retry with re-discovery on incremental download failure
- Partial download cleanup on Ctrl+C or errors
- Configurable fallback to try multiple ranked nodes

## Configuration

New `[snapshot]` section in mithril.toml with options for:
- Stage 1/2 speed test parameters
- Node filtering (version, RTT, TCP timeout)
- Snapshot age thresholds
- Fallback resilience settings
- Save-to-disk options

## Files Changed

- pkg/snapshotdl/snapshotdl.go - Main integration layer
- pkg/snapshot/build_db_with_incr.go - Retry loop, cleanup logic
- pkg/snapshot/build_db.go - Cleanup logic for full snapshots
- pkg/snapshot/bufmonreader.go - HTTP streaming with save support
- pkg/config/config.go - SnapshotConfig struct
- mithril.example.toml - Documentation for all options
- docs/snapshot.md - Comprehensive documentation

Updates the dependency to include the two-stage node discovery,
TOML config, and library mode changes.

Snapshot cleanup fix:
- Added cleanAccountsDbDir() that removes all accountsdb artifacts before
  snapshot processing to prevent corruption from previous incomplete runs
- Prevents "integer divide by zero" panic in fastcache when leftover
  corrupted MMAP files exist from Ctrl+C or partial downloads
- Cleans: accounts/, mithril_db, mithril_db_log_shards/, bankhash_db,
  largest_file_id, bank_hash, manifest
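
A cleanup helper along these lines might look like the sketch below. Only the function name and the artifact list come from this commit message; the body, the `accountsDbArtifacts` variable, and `selfCheck` are assumptions written for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// accountsDbArtifacts lists the entries named in the commit message.
var accountsDbArtifacts = []string{
	"accounts", "mithril_db", "mithril_db_log_shards", "bankhash_db",
	"largest_file_id", "bank_hash", "manifest",
}

// cleanAccountsDbDir removes each known artifact under dir. os.RemoveAll
// returns nil for paths that do not exist, so a clean directory is not an error.
func cleanAccountsDbDir(dir string) error {
	for _, name := range accountsDbArtifacts {
		if err := os.RemoveAll(filepath.Join(dir, name)); err != nil {
			return fmt.Errorf("remove %s: %w", name, err)
		}
	}
	return nil
}

// selfCheck exercises the helper in a temporary directory.
func selfCheck() error {
	dir, err := os.MkdirTemp("", "accountsdb-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	if err := os.MkdirAll(filepath.Join(dir, "accounts", "0"), 0o755); err != nil {
		return err
	}
	for _, name := range []string{"manifest", "keep.txt"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return err
		}
	}
	if err := cleanAccountsDbDir(dir); err != nil {
		return err
	}
	for _, name := range []string{"manifest", "accounts"} {
		if _, err := os.Stat(filepath.Join(dir, name)); !os.IsNotExist(err) {
			return errors.New(name + " is still present after cleanup")
		}
	}
	if _, err := os.Stat(filepath.Join(dir, "keep.txt")); err != nil {
		return fmt.Errorf("unrelated file was removed: %w", err)
	}
	return nil
}

func main() {
	fmt.Println("self check:", selfCheck())
}
```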

Stage 2 tuning:
- Changed default warmup and measure to 3 seconds (was 2)
- 3 seconds works better on home internet, where bandwidth is more variable
- Datacenter setups can use 1-2 seconds for faster discovery
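
To show why the warm-up length matters, here is a small sketch that computes sustained throughput from byte samples while ignoring the warm-up window. The `Sample` type and `sustainedMiBs` are hypothetical, and the sample data is made up for the example.

```go
package main

import "fmt"

const mib = 1 << 20

// Sample is a hypothetical progress record from a test download.
type Sample struct {
	AtSec float64 // seconds since the transfer started
	Bytes int64   // bytes received since the previous sample
}

// sustainedMiBs ignores everything received during the warm-up window and
// averages the bytes that arrive in the following measure window.
func sustainedMiBs(samples []Sample, warmSec, measureSec float64) float64 {
	if measureSec <= 0 {
		return 0
	}
	var total int64
	for _, s := range samples {
		if s.AtSec > warmSec && s.AtSec <= warmSec+measureSec {
			total += s.Bytes
		}
	}
	return float64(total) / measureSec / float64(mib)
}

func main() {
	// A connection that ramps up slowly: 1 MiB per half second for the first
	// three seconds, then 5 MiB per half second.
	var samples []Sample
	for i := 1; i <= 12; i++ {
		at := float64(i) * 0.5
		size := int64(5 * mib)
		if at <= 3 {
			size = mib
		}
		samples = append(samples, Sample{AtSec: at, Bytes: size})
	}
	fmt.Println(sustainedMiBs(samples, 3, 3)) // warm-up covers the slow ramp
	fmt.Println(sustainedMiBs(samples, 1, 3)) // a short warm-up includes part of the ramp
}
```

With the slow ramp in this made-up data, the shorter warm-up reports a lower figure than the connection eventually sustains.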

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@7layermagik merged commit 92abfe0 into dev on Dec 27, 2025
1 check passed