Skip to content

Add on-disk node resolution for HNSW #347

@kevmo314

Description

@kevmo314

The goal of this feature is to tie the HNSW in-memory map of nodes to adjacencies to disk. This will need a few steps:

  • Create a HNSWAdjacencyPage struct that is of type:
type HNSWAdjacencyPage [16][8]uint32

Note that this is exactly 4kB by design.

  • Create an adjacency manager that has the following ops:
    • GetAdjacencies(x) [8]uint32
    • AddAdjacency(x, y)
    • RemoveAdjacency(x, y)
    • AddNode(x) <-- this function is new, it will do nothing in the map case but we need to wire it up.

Observe that this essentially wraps a map for an in-memory structure so it will replace the current friends lookups directly. For this item, just implement it as a thin wrapper around the existing map.

  • Update the adjacency manager to create a constructor that takes in a B+ tree and update the operations:

    • For GetAdjacencies, look up the memory pointer in the B+ tree and read the 8 * 4 bytes following to get the 8 adjacencies
    • For AddAdjacencies, look up the memory pointer and set the next zero element, writing it back to disk.
    • For RemoveAdjacencies, ...
  • Update AddNode(x) to:

    • Check if it's in the B+ tree, if it is, do nothing
    • If it's not, find the newest HNSWAdjancencyPage and add a new block of 4 bytes representing its adjacencies, then add this to the B+ tree.

On the last item, it may be beneficial to write a page manager sort of like how we have the page file, or maybe it's too much boilerplate. Up to you. :)

Metadata

Metadata

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions