-
Notifications
You must be signed in to change notification settings - Fork 84
proof of concept: unify refcounting logic #578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Tests are passing. The usage of |
|
... ah wait
... should be able to recast them all as *uintptr with unsafe.Pointer, and then store it as []*uintptr? 🤔 |
|
Tested that the derived pointers are nilled correctly by modifying the MessageReader test, definitely is working. It's also guaranteed to be safe:
|
|
In principle, doing this and then moving Refcount.(Retain|Release) into a disabled file is enough to switch off refcounting. I'd also like to see if we can move the Refcount initialization into an inline-able function call, though - that would preserve this behavior when it's enabled, but also drop the cost of initializing the refcount information when it's not needed... |
|
And, tested and confirmed: this prototype completely drops refcounting from the |
|
And, tested with MessageReader dependency on Message. Test passes, and if the explicit dependency is dropped, the test fails due to a memory leak. |
|
There's probably a way to replace the unsafe.Pointer in storage for dependencies with **Refcount, at least. I'm not sure I like this approach; it forces dependencies to have a pointer stored, which effectively forces heap escapes, which I've been trying to avoid. Might be worth splitting optional/mutable/dynamic dependencies (which need to be able to have the pointer change) and immutable/static dependencies, waiting on review before I make any more changes though |
|
|
|
I'm not a fan of this approach. I dislike the users having to call multiple functions as opposed to just having In theory, a buffer need only keep track of its parent (if it has one) and doesn't need to keep track of any other buffers which were sliced off from it. I don't understand the need/desire for adding the pointers that you're doing as opposed to just having the i.e. Why isn't it just something like: //go:build !norc
type RefCount struct {
ref atomic.Int64
cleanup func()
}
func (r *RefCount) Retain() {
...
}
func (r *RefCount) Release() {
...
}
/////////////////////
//go:build norc
type RefCount struct {}
func (*RefCount) Retain() {}
func (*RefCount) Release() {} |
| } | ||
| m.refCount.Add(1) | ||
| m.ReferenceBuffer(&m.meta, &m.body) | ||
| m.ReferenceDerived(unsafe.Pointer(&m.msg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't need to track the m.msg, the point of the refcounting is to track allocations made by the memory.Allocator, the flatbuffer message object is never allocated by the memory.Allocator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The msg pointer is derived from the meta Buffer, which is itself allocated by the Allocator. ReferenceDerived is not for refcounting; it basically means "when this object goes free, nil out the target to prevent use-after-free."
Without it, accessing the message after calling Release could potentially access other memory allocated by the Allocator - if integrating with C, this could theoretically allow access to other heap allocations. Note that the previous code for Message.Release contains this:
if msg.refCount.Add(-1) == 0 {
msg.meta.Release()
msg.body.Release()
msg.msg = nil
msg.meta = nil
msg.body = nil
}it's releasing the two subresources, but also nilling the one that depends on one. This is just to replace the msg = nil line, basically.
| type Refcount struct { | ||
| count atomic.Int64 | ||
| dependencies []unsafe.Pointer | ||
| buffers []**Buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why **Buffer instead of *Buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll explain in a dedicated comment going over the whole design
| // Must only be called once per object. Defines buffers that are referenced | ||
| // by this object. When this object is unreferenced, all such buffers will | ||
| // be deallocated immediately. | ||
| func (r *Refcount) ReferenceBuffer(b ...**Buffer) { | ||
| r.buffers = b | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean "When this object is cleaned up" or "released"? Because Go is garbage collected and we don't have destructors, then just being unreferenced won't deallocate the buffers immediately, they'll get deallocated when the GC gets around to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this is for the last call to Refcount.Release(): that call will deallocate the buffers immediately - or, rather, it will call Allocator.Free() immediately. IMO these should be seen as the same thing.
|
I'm just going to write up the design notes as one place to explain all the weirdness:
So, in trying to separate the concerns of reference counting and object management, and trying to make it all declarative and only called on object initialization - keeping the refcount graph static, even as the objects in it may be dynamic! - I settled on using two-level-pointers. The address of |
Using a cleanup function (or closure) may well be the best approach. I can play around with that instead and see if we like that more? It means not needing to manually track the graph at all, because the function is invoked on release time but defined at creation time; it can take a closure of the object itself, and check whether fields needs unreferenced without any of the typing shenanigans.. |
One worry I have: I'll need to check if the compiler can optimize out the construction of the closure, since it won't actually be used. The current approach generates zero code when disabled, I want to make sure we can maintain that property... |
|
That's a good thing to check. My opinion here is the desire to make things as simple as possible and avoid having to deal with the dependency graph like that. In addition, the closure approach would also make the transition to add cleanup much easier |
Agreed. It's likely worth it even if it does; more overhead than this approach maybe, but less code / correctness concerns, and still a savings over the current implementation :)
Yeah, that's a really good point. We'd basically need two flags - the question is, is it one for "do any tracking" and one for "refcount vs addcleanup", or is it one for refcounting and one for addcleanup, and the combination is invalid? |
A step towards the addcleanup work: we want to be able to turn refcounting off entirely. Rather than move all ~200 release+retain functions into files with build tags, IMO it makes sense to unify all refcounting logic into memory.Refcount, embed that everywhere else, and then we can just turn that off with switching to AddCleanup.
This is a proof of concept using that with just ipc/Message to obtain feedback before finishing the work.