Pagination beyond uint64 #417

milanatshopify · 2025-11-21T18:13:55Z

This commit adds support for binary pagination keys (UUID, VARBINARY, CHAR types) to Ghostferry, previously limited to uint64/BIGINT columns.

This is the AI summary of changes.

Key Changes

New PaginationKey Interface (pagination_key.go - 212 new lines)

Introduced PaginationKey interface with two implementations:
- Uint64Key - existing uint64 pagination (unchanged behavior)
- BinaryKey - new support for UUID, VARBINARY, CHAR columns
Provides lexicographic comparison for binary keys
Includes NumericPosition() for progress approximation (0.0-1.0 float)

State Serialization (state_tracker.go)

Fully backward compatible JSON serialization with type information:
{"type": "uint64", "value": 123}
{"type": "binary", "value": "deadbeef..."}
Can resume from old uint64-only states
New state_serialization_test.go (331 lines) validates mixed uint64/binary states

Updated Components

cursor.go: Pagination iteration handles both key types
batch_writer.go: Row batches carry proper pagination key types
inline_verifier.go/iterative_verifier.go: Hash calculation adapted for binary keys
compression_verifier.go: Decompression logic updated for binary pagination
data_iterator.go: Iterator tracks both uint64 and binary keys

Documentation (config.go)

Updated CascadingPaginationColumnConfig docs to reflect numeric/binary support
IMPORTANT: Added explicit warnings about uniqueness requirement for pagination columns

Test Coverage (+836 new test lines)

Comprehensive pagination_key_test.go (505 lines)
Integration tests for interrupt/resume with binary keys
Mixed uint64/binary key scenarios

Areas Requiring Careful Review

🔴 Critical: Uniqueness Requirement

The pagination column MUST have unique values for data integrity. The algorithm uses:
WHERE pagination_key > last_key ORDER BY pagination_key LIMIT batch_size

If duplicates exist at batch boundaries, rows will be skipped → data loss during migration.

⚠️ Code does NOT validate uniqueness (see table_schema_cache.go:260-291)
⚠️ Only validates column type (numeric/binary)
✅ Documentation added in config.go:735-744 warning about this requirement

Recommendation: Consider adding a validation step during initialization that checks for unique constraints on the pagination
column, or at minimum log a warning if no unique constraint is detected.

🟡 REST API Conversion

The /api/status endpoint converts all pagination keys to uint64 (progress.go:13-20, ferry.go:1027):

Binary keys use NumericPosition() → approximation for progress/ETA
Status API remains backward compatible
Progress percentages less precise for UUID-keyed tables (but acceptable for monitoring)

🟢 Binary Key Comparison

BinaryKey.Compare() uses bytes.Compare() for lexicographic ordering:

✅ Matches MySQL's binary collation behavior
✅ Works correctly with UUID v1/v4, VARBINARY, CHAR
⚠️ Verify this matches your database's collation for CHAR types if not using BINARY collation

🟢 State Resume Logic

State serialization is well-tested but verify:

SerializableState.MarshalJSON() / UnmarshalJSON() custom implementations (state_tracker.go:45-90)
Mixed states with both uint64 and binary keys
Empty binary keys and zero uint64 keys (edge cases tested)

Testing

Run all Go tests

./bin/gotestsum --format short-verbose ./test/go

Run interrupt/resume integration tests with binary keys

ruby -Itest test/integration/interrupt_resume_test.rb

Key test files:

test/go/pagination_key_test.go - Core pagination key logic
test/go/state_serialization_test.go - State dump/resume scenarios
test/integration/interrupt_resume_test.rb - End-to-end with binary keys

Migration Path

For existing Ghostferry users:

✅ No config changes required for existing uint64 pagination
✅ Can resume interrupted migrations (backward compatible)
✅ New tables with binary pagination keys work automatically if FallbackColumn is configured

Files changed: 28 files, +1920 insertions, -280 deletions

Copilot

Pull request overview

This PR extends Ghostferry's pagination capability beyond int64/BIGINT columns to support binary pagination keys (UUID, VARBINARY, CHAR types). The implementation introduces a PaginationKey interface with two implementations: Uint64Key (existing behavior) and BinaryKey (new support). State serialization is fully backward compatible, allowing resumption from old uint64-only states.

Key Changes:

New PaginationKey interface with Uint64Key and BinaryKey implementations supporting lexicographic comparison and progress approximation
Backward-compatible state serialization with type information ({"type": "uint64", "value": 123} / {"type": "binary", "value": "deadbeef..."})
Updated components (cursor, batch_writer, verifiers, data_iterator) to handle both key types with comprehensive test coverage (+836 test lines)

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
pagination_key.go	New file implementing PaginationKey interface with Uint64Key and BinaryKey types
state_tracker.go	Custom JSON marshaling/unmarshaling for backward-compatible state serialization
table_schema_cache.go	Updated to accept binary/string column types for pagination, modified maxPaginationKey function
cursor.go	Pagination iteration updated to handle both uint64 and binary keys with type-specific extraction
data_iterator.go	Iterator tracking updated for both key types with fingerprint generation
batch_writer.go	Row batch handling updated to extract and track pagination keys by type
inline_verifier.go	Fingerprint checking adapted for binary keys with string-based hash storage
iterative_verifier.go	Hash calculation and verification updated for mixed key types
compression_verifier.go	Decompression logic updated to handle binary pagination keys
ferry.go	Progress API converts all keys to uint64 via NumericPosition() for backward compatibility
filter.go	CopyFilter interface updated to accept PaginationKey parameter
sharding/filter.go	ShardedCopyFilter updated to use PaginationKey.SQLValue()
dml_events.go	DML event pagination key extraction returns string representation
target_verifier.go	Error message formatting updated for string pagination keys
row_batch.go	Fingerprints map changed from uint64 to string keys
config.go	Documentation updated with uniqueness requirement warnings
test/go/*.go	Comprehensive unit tests for pagination keys, state serialization, and verifiers
test/integration/*.rb	Integration tests for UUID table interrupt/resume scenarios
test/helpers/db_helper.rb	Helper functions for UUID table seeding and validation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pagination_key.go

state_tracker.go

cursor.go

batch_writer.go

table_schema_cache.go

test/helpers/db_helper.rb

driv3r · 2025-11-26T15:52:01Z

test/helpers/db_helper.rb


+  def generate_uuid_bytes
+    uuid_string = SecureRandom.uuid
+    uuid_string.gsub("-", "").scan(/../).map { |x| x.hex.chr }.join


there's a dedicated set of methods for packing and unpacking such things, this might be easier to understand

[SecureRandom.uuid].pack("H32")

doc

Removing - first...

test/helpers/db_helper.rb

grodowski

Not a finished review, but these are the two things I noticed so far

grodowski · 2025-12-09T10:32:33Z

pagination_key.go

+}
+
+// NumericPosition calculates a rough float position.
+func (k BinaryKey) NumericPosition() float64 {


BinaryKey.NumericPosition is used for progress tracking and progress deltas (where we subtract those), it seems the way it's currently implemented it will report correct progress with a UUID v7 (or anything that has a monotonic values in those first bits), but progress reporting will be unpredictable with random data?

Should this be explained in docs or comments? Do we need an alternate progress tracking mode?

grodowski · 2025-12-09T11:21:03Z

test/go/data_iterator_sorter_test.go

 	tables, _ := ghostferry.LoadTables(t.Ferry.SourceDB, tableFilter, nil, nil, nil, nil)

-	t.unsortedTables = make(map[*ghostferry.TableSchema]uint64, len(tables))
+	t.unsortedTables = make(map[*ghostferry.TableSchema]ghostferry.PaginationKey, len(tables))


Would it make sense to validate the collation to be binary on the PK column here or in LoadTables? Some quick analysis with claude showed that a non-binary collation is unlikely to cause data loss and will just fail the move but the errors could be unintuitive to debug.

ghostferry/cursor.go

Lines 140 to 145 in b01bcec

if paginationKeypos.Compare(c.lastSuccessfulPaginationKey) <= 0 {

tx.Rollback()

err = fmt.Errorf("new paginationKeypos %s <= lastSuccessfulPaginationKey %s", paginationKeypos.String(), c.lastSuccessfulPaginationKey.String())

c.logger.WithError(err).Errorf("last successful paginationKey position did not advance")

return err

}

It seems to be theoretically possible to either lose data or fail with last successful paginationKey position did not advance when go bytes.Compare differs from MySQL collation.

Validation to fail earlier and with a clear message added on a new branch: https://github.com/Shopify/ghostferry/compare/uuid-as-id...grodowski/uuid-as-id?expand=1

Ruby integration test with claude to prove the did not advance condition

def test_safety_check_catches_ordering_mismatch source_db.query("CREATE DATABASE IF NOT EXISTS #{DEFAULT_DB}") source_db.query("CREATE TABLE #{DEFAULT_FULL_TABLE_NAME} (id VARCHAR(10) COLLATE utf8mb4_unicode_ci NOT NULL PRIMARY KEY, data TEXT)") 200.times { |i| source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} VALUES ('b#{i.to_s.rjust(3, '0')}', 'data#{i}')") } 200.times { |i| source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} VALUES ('C#{i.to_s.rjust(3, '0')}', 'upper#{i}')") } source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} VALUES ('z999', 'sentinel')") target_db.query("CREATE DATABASE IF NOT EXISTS #{DEFAULT_DB}") target_db.query("CREATE TABLE #{DEFAULT_FULL_TABLE_NAME} (id VARCHAR(10) COLLATE utf8mb4_unicode_ci NOT NULL PRIMARY KEY, data TEXT)") ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY) ghostferry.run_expecting_failure error_logs = ghostferry.logrus_lines.values.flatten.select { |line| line["level"] == "error" } assert error_logs.any? { |line| line["msg"]&.include?("paginationKey position did not advance") }, "Expected safety check error, got: #{error_logs.inspect}" end

The potential data loss: when the last batch keys are all < than previous batch.

def test_silent_data_loss source_db.query("CREATE DATABASE IF NOT EXISTS #{DEFAULT_DB}") source_db.query("CREATE TABLE #{DEFAULT_FULL_TABLE_NAME} (id VARCHAR(10) COLLATE utf8mb4_unicode_ci NOT NULL PRIMARY KEY, data TEXT)") 200.times { |i| source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} VALUES ('b#{i.to_s.rjust(3, '0')}', 'data#{i}')") } 200.times { |i| source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} VALUES ('C#{i.to_s.rjust(3, '0')}', 'upper#{i}')") } target_db.query("CREATE DATABASE IF NOT EXISTS #{DEFAULT_DB}") target_db.query("CREATE TABLE #{DEFAULT_FULL_TABLE_NAME} (id VARCHAR(10) COLLATE utf8mb4_unicode_ci NOT NULL PRIMARY KEY, data TEXT)") ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY) ghostferry.run source_count = source_db.query("SELECT COUNT(*) as count FROM #{DEFAULT_FULL_TABLE_NAME}").first["count"].to_i target_count = target_db.query("SELECT COUNT(*) as count FROM #{DEFAULT_FULL_TABLE_NAME}").first["count"].to_i assert_equal 400, source_count assert_equal 200, target_count, "Expected silent data loss: 200 Cherry rows never copied" end

Please poke holes in this.

UUID as ID: validate collation

grodowski · 2025-12-16T16:53:11Z

sharding/test/trivial_integration_test.go


 func setupSingleTableDatabase(f *testhelpers.TestFerry, sourceDB, targetDB *sql.DB) {
-	testhelpers.SeedInitialData(sourceDB, "gftest", "table1", 1000)
+	testhelpers.SeedInitialData(sourceDB, "gftest", "table1", 100)


What's the reason behind lowering? Could hide some issues since it's less than default batch size

… batch size

ilikeorangutans · 2025-12-17T14:28:36Z

pagination_key.go

+	SQLValue() interface{}
+	Compare(other PaginationKey) int
+	NumericPosition() float64
+	String() string
+	MarshalJSON() ([]byte, error)
+	IsMax() bool
+}


A few comments would be helpful, it's not quite clear from the names what they refer to.

ilikeorangutans · 2025-12-17T14:34:54Z

cursor.go

+		switch c.paginationKeyColumn.Type {
+		case schema.TYPE_NUMBER, schema.TYPE_MEDIUM_INT:
+			var value uint64
+			value, err = lastRowData.GetUint64(paginationKeyIndex)
+			if err != nil {
+				logger.WithError(err).Error("failed to get uint64 paginationKey value")
+				return
+			}
+			paginationKeypos = NewUint64Key(value)
+
+		case schema.TYPE_BINARY, schema.TYPE_STRING:
+			valueInterface := lastRowData[paginationKeyIndex]
+
+			var valueBytes []byte
+			switch v := valueInterface.(type) {
+			case []byte:
+				valueBytes = v
+			case string:
+				valueBytes = []byte(v)
+			default:
+				err = fmt.Errorf("expected binary pagination key to be []byte or string, got %T", valueInterface)
+				logger.WithError(err).Error("failed to get binary paginationKey value")
+				return
+			}
+
+			paginationKeypos = NewBinaryKey(valueBytes)
+
+		default:
+			// Fallback for other integer types
+			var value uint64
+			value, err = lastRowData.GetUint64(paginationKeyIndex)
+			if err != nil {
+				logger.WithError(err).Error("failed to get uint64 paginationKey value")
+				return
+			}
+			paginationKeypos = NewUint64Key(value)


There's several similar copies of this block, it might be worth creating a function NewPaginationKeyFromValue(input interface{}) (PaginationKey, error) in pagination.go to encapsulate all this.

milanatshopify requested a review from Copilot November 21, 2025 18:14

Copilot started reviewing on behalf of milanatshopify November 21, 2025 18:14 View session

Copilot finished reviewing on behalf of milanatshopify November 21, 2025 18:16

Copilot AI reviewed Nov 21, 2025

View reviewed changes

milanatshopify changed the title ~~Pagination beyond int64~~ Pagination beyond uint64 Nov 21, 2025

milanatshopify force-pushed the uuid-as-id branch from 7a37eb4 to 92fbc0d Compare November 21, 2025 20:23

milanatshopify self-assigned this Nov 24, 2025

jameszhu-shopify mentioned this pull request Nov 25, 2025

Composite key pagination support for ghostferry #420

Draft

driv3r reviewed Nov 26, 2025

View reviewed changes

milanatshopify force-pushed the uuid-as-id branch from 92fbc0d to 1cd6fdb Compare November 28, 2025 18:20

Pagination beyond int64

2f1dff8

milanatshopify force-pushed the uuid-as-id branch from 1cd6fdb to 2f1dff8 Compare November 28, 2025 18:39

grodowski reviewed Dec 9, 2025

View reviewed changes

grodowski and others added 4 commits December 15, 2025 14:54

Add note about progress reporting with non-int pk

dd42876

Enforce binary collation on varchar columns

b01bcec

Link discussion comment with Ruby tests

9dd8db0

Merge pull request #422 from Shopify/grodowski/uuid-as-id

e30f92f

UUID as ID: validate collation

grodowski reviewed Dec 16, 2025

View reviewed changes

grodowski and others added 2 commits December 16, 2025 18:00

Revert integration test data size to test with more rows than default…

fa250d9

… batch size

Merge branch 'main' into uuid-as-id

1f096dd

ilikeorangutans reviewed Dec 17, 2025

View reviewed changes

	if paginationKeypos.Compare(c.lastSuccessfulPaginationKey) <= 0 {
	tx.Rollback()
	err = fmt.Errorf("new paginationKeypos %s <= lastSuccessfulPaginationKey %s", paginationKeypos.String(), c.lastSuccessfulPaginationKey.String())
	c.logger.WithError(err).Errorf("last successful paginationKey position did not advance")
	return err
	}

Pagination beyond uint64 #417

Are you sure you want to change the base?

Pagination beyond uint64 #417

Conversation

milanatshopify commented Nov 21, 2025

Key Changes

Areas Requiring Careful Review

Testing

Run all Go tests

Run interrupt/resume integration tests with binary keys

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

grodowski left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants