-
Notifications
You must be signed in to change notification settings - Fork 155
Add HIP 0001 ring buffer proposal for Hyperlight I/O #1112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| ### Risks and Mitigations | ||
|
|
||
| **Risk**: Malicious guest corrupting the queue **Mitigation**: Do not expose low level queue API to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know that not exposing it to the guest does anything. The guest is fully untrusted and so we need to do the mitigation suggested after this, that we serialize all data as known types and do assertions as suggested.
| and `len` as raw bytes providing 12 bytes of inline storage. We should asses if any of flatbuffer | ||
| schema serialized data can actually fit into small inline data. | ||
|
|
||
| **4. Descriptor Chaining - scatter gather list** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it proposed to implement this now or as future option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be implemented in the first draft.
| example `ioeventfd` for kvm). This is especially useful for streaming scenarios where the guest can | ||
| continue processing while the host consumes data asynchronously. | ||
|
|
||
| **3. Inline Descriptors** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this proposed to be implemented now or as possible improvement in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to give it a try in the first draft, but really if we're gonna keep it or not depends on two factors: 1) can we actually fit any meaningful data into inline slot, e.g. can serialized function call with single argument fit? 2) the performance improvement for such calls is substantial. Both are hard to assess without actual implementation.
| 2. Device writes up to buffer length | ||
| 3. Device sets actual written length in descriptor | ||
| 4. If `actual_length > buffer_length`, device sets a `TRUNCATED` flag, | ||
| 5. Driver can re-submit with larger buffer if needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can image this would be possible performance issue? Can we provide a metric for when this occurs and a way to improve initial buffer size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I would imagine that this only happens when you try to transport a "big" chunk of data for which the base case should be using stream rather then pushing everything at once. That being said the protocol should support full round trip but really the mitigation of this problem should be using streams.
proposals/0001-rng-buf/README.md
Outdated
| Snapshotting requires that the descriptor table has no in-flight guest-to-host requests and any | ||
| attempt to snapshot a sandbox with such pending requests will result in a snapshot failure. | ||
|
|
||
| ### Difference from spec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from what spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh from from virtio spec, let's link the mention above to here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this to mention virtio.
| ### Snapshotting | ||
|
|
||
| Snapshotting requires that the descriptor table has no in-flight guest-to-host requests and any | ||
| attempt to snapshot a sandbox with such pending requests will result in a snapshot failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ensure this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can query the queue for number of inflight messages during snapshotting and fail if it reports positive number. Because we have to track the number of free slots in the queue getting number of inflight messages is just length of the queue minus free slots.
|
|
||
| ``` | ||
|
|
||
| ### Test Plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth outlining the implemenation plan a bit? Will this come in a couple different PRs or behind feature flags, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added Implementation Plan section.
jsturtevant
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks for such a detailed write up. Left a few minor comments but otherwise this seems like it would set up the project to work well for the future
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
2318367 to
25f5d22
Compare
Rendered