Skip to content

Conversation

@grodowski
Copy link
Member

@grodowski grodowski commented Apr 3, 2025

Closes https://github.com/Shopify/db-mobility/issues/778, relates to #381

Allows enabling the super_read_only variable for source DBs. Since this brings the cursor closer to gh-ost implementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in the IntegrationTest setup to ensure it's more realistic, which raises some questions about likelihood of erroring after retries are exhausted 🤔

Feedback welcome, do not hesitate to share ideas on how to tophat better than with unit tests.

@grodowski grodowski self-assigned this Apr 3, 2025
Why? Mostly to allow enabling the super_read_only variable for source DBs. Since this brings the cursor closer to gh-ost implementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in the IntegrationTest setup to ensure it's more realistic.
@grodowski grodowski force-pushed the grodowski/replace-row-lock-with-for-share-nowait branch from ff5b2b6 to e8ffb0f Compare April 4, 2025 09:09
panic(err)
}

time.Sleep(10 * time.Millisecond)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this to make the DataWriter more realistic, some integration tests (example) start it with NumberOfWriters: 4 and that's too much for NOWAIT to work without sometimes taking way too long to obtain a lock.

I'm wondering if this is a signal to consider just a FOR SHARE lock 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FOR SHARE is equivalent to what we're doing right now, so it's not a step down, right?

I mean, we haven't seen problems with ghostferry holding on to locks so far, AFAIK, so it wouldn't be the worst thing to keep doing it. However, if the goal is the minimize production impact than not holding on to locks would be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also start with FOR SHARE to unblock super_read_only and then investigate NOWAIT later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I agree that FOR SHARE is a safer option, but also don't have new ideas on how to de-risk FOR SHARE NOWAIT further. I think it's safe enough in a way that if the cursor fails to obtain a lock (which I doubt, the logic is equal to gh-ost's iterator now, which isn't even configurable), we can guarantee that any data will remain unaffected and ghostferry may just fail to run.

Do you have any objections against shipping with NOWAIT (or how we could test it further while shipping FOR SHARE first)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to go with NOWAIT. 👍


if c.DBReadRetries == 0 {
c.DBReadRetries = 5
c.DBReadRetries = 60
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implies 60 * 1s retry schedule. I thought it's a reasonable starting point, similarly to how gh-ost does it (60 and 1s)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And just double-checked that the config is currently @ default in core (components/shop_mover/app/utils/podding/shop_mover/ghostferry/config.rb)


if c.RowLock {
selectBuilder = selectBuilder.Suffix("FOR UPDATE")
mySqlVersion, err := c.DB.QueryMySQLVersion()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought it's best to just ask the DB 🙃

@grodowski grodowski changed the title [Do not merge] Replace cursor RowLock with FOR SHARE NOWAIT Replace cursor RowLock with FOR SHARE NOWAIT Apr 4, 2025
@grodowski grodowski marked this pull request as ready for review April 4, 2025 14:49
@grodowski grodowski merged commit 0b22b7b into main Apr 7, 2025
9 checks passed
@grodowski grodowski deleted the grodowski/replace-row-lock-with-for-share-nowait branch April 7, 2025 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants