-
Notifications
You must be signed in to change notification settings - Fork 76
Replace cursor RowLock with FOR SHARE NOWAIT #386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace cursor RowLock with FOR SHARE NOWAIT #386
Conversation
Why? Mostly to allow enabling the super_read_only variable for source DBs. Since this brings the cursor closer to gh-ost implementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in the IntegrationTest setup to ensure it's more realistic.
ff5b2b6 to
e8ffb0f
Compare
| panic(err) | ||
| } | ||
|
|
||
| time.Sleep(10 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this to make the DataWriter more realistic, some integration tests (example) start it with NumberOfWriters: 4 and that's too much for NOWAIT to work without sometimes taking way too long to obtain a lock.
I'm wondering if this is a signal to consider just a FOR SHARE lock 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FOR SHARE is equivalent to what we're doing right now, so it's not a step down, right?
I mean, we haven't seen problems with ghostferry holding on to locks so far, AFAIK, so it wouldn't be the worst thing to keep doing it. However, if the goal is the minimize production impact than not holding on to locks would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also start with FOR SHARE to unblock super_read_only and then investigate NOWAIT later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I agree that FOR SHARE is a safer option, but also don't have new ideas on how to de-risk FOR SHARE NOWAIT further. I think it's safe enough in a way that if the cursor fails to obtain a lock (which I doubt, the logic is equal to gh-ost's iterator now, which isn't even configurable), we can guarantee that any data will remain unaffected and ghostferry may just fail to run.
Do you have any objections against shipping with NOWAIT (or how we could test it further while shipping FOR SHARE first)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to go with NOWAIT. 👍
|
|
||
| if c.DBReadRetries == 0 { | ||
| c.DBReadRetries = 5 | ||
| c.DBReadRetries = 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just double-checked that the config is currently @ default in core (components/shop_mover/app/utils/podding/shop_mover/ghostferry/config.rb)
|
|
||
| if c.RowLock { | ||
| selectBuilder = selectBuilder.Suffix("FOR UPDATE") | ||
| mySqlVersion, err := c.DB.QueryMySQLVersion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought it's best to just ask the DB 🙃
Closes https://github.com/Shopify/db-mobility/issues/778, relates to #381
Allows enabling the
super_read_onlyvariable for source DBs. Since this brings the cursor closer togh-ostimplementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in theIntegrationTestsetup to ensure it's more realistic, which raises some questions about likelihood of erroring after retries are exhausted 🤔Feedback welcome, do not hesitate to share ideas on how to tophat better than with unit tests.