Medium priority improvements: overflow, NEON, HNSW layers, allocations #49

@ruvnet

Priority: Medium

Collection of medium-priority improvements identified during code review.

1. Integer overflow risk in node ID (theoretical)

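  • Location: HNSW node allocation
  • Risk: If node IDs are stored as 32-bit integers, an index that grows past u32::MAX (about 4.29 billion vectors) would overflow the ID counter
  • Fix: Use checked arithmetic (e.g. checked_add) or widen node IDs to u64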
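A minimal sketch of the checked-arithmetic option, assuming node IDs are handed out by a simple `u32` counter. `NodeIdAllocator` and `NodeIdOverflow` are hypothetical names for illustration, not types from the codebase.

```rust
/// Hypothetical error returned when the ID space is exhausted.
#[derive(Debug, PartialEq, Eq)]
pub struct NodeIdOverflow;

/// Hypothetical allocator that hands out sequential node IDs.
pub struct NodeIdAllocator {
    next: u32,
}

impl NodeIdAllocator {
    /// Start allocating from an arbitrary ID (0 for a fresh index).
    pub fn starting_at(next: u32) -> Self {
        Self { next }
    }

    /// Return the next free ID, or an error instead of wrapping around.
    /// Note: u32::MAX itself is never handed out, so it stays available
    /// as a sentinel value if the index needs one.
    pub fn allocate(&mut self) -> Result<u32, NodeIdOverflow> {
        let id = self.next;
        // checked_add returns None on overflow in both debug and release builds.
        self.next = self.next.checked_add(1).ok_or(NodeIdOverflow)?;
        Ok(id)
    }
}
```

Returning a `Result` lets the caller report "index full" as an ordinary error. Widening to `u64` removes the limit in practice but doubles the size of every stored neighbor ID.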

2. Missing NEON Manhattan distance

  • Location: src/distance/simd.rs
  • Issue: Manhattan (L1) distance not implemented for ARM NEON
  • Impact: Falls back to scalar on ARM
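One way the missing kernel could look, as a sketch only. The function names are hypothetical and the existing layout of src/distance/simd.rs is not known here. The intrinsics come from `std::arch::aarch64`; the NEON path is compiled only on aarch64, and other targets use the scalar version.

```rust
/// Scalar reference implementation of Manhattan (L1) distance.
pub fn manhattan_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// NEON implementation: processes four f32 lanes per iteration.
#[cfg(target_arch = "aarch64")]
pub fn manhattan_neon(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::aarch64::{vabdq_f32, vaddq_f32, vaddvq_f32, vdupq_n_f32, vld1q_f32};

    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let chunks = a.len() / 4;

    // SAFETY: NEON is part of the aarch64 baseline, and every load reads
    // four floats starting at index i * 4 with i < chunks, so it stays
    // inside both slices.
    let mut sum = unsafe {
        let mut acc = vdupq_n_f32(0.0);
        for i in 0..chunks {
            let va = vld1q_f32(a.as_ptr().add(i * 4));
            let vb = vld1q_f32(b.as_ptr().add(i * 4));
            // vabdq_f32 computes |va - vb| lane-wise.
            acc = vaddq_f32(acc, vabdq_f32(va, vb));
        }
        // Horizontal add of the four accumulator lanes.
        vaddvq_f32(acc)
    };

    // Scalar tail for the remaining 0..=3 elements.
    for i in chunks * 4..a.len() {
        sum += (a[i] - b[i]).abs();
    }
    sum
}

/// Dispatch: NEON on aarch64, scalar everywhere else.
pub fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    #[cfg(target_arch = "aarch64")]
    {
        manhattan_neon(a, b)
    }
    #[cfg(not(target_arch = "aarch64"))]
    {
        manhattan_scalar(a, b)
    }
}
```

Because the SIMD path sums in a different order than the scalar loop, results can differ in the last bits for long vectors, so tests comparing the two should use a tolerance.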

3. Hardcoded 32-layer max in HNSW

  • Location: HNSW index implementation
  • Issue: Maximum 32 layers hardcoded
  • Impact: May limit index depth for very large datasets
  • Fix: Make configurable or use dynamic allocation
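A sketch of the configurable variant, assuming the index draws levels with the standard HNSW rule `floor(-ln(u) * mL)` where `mL = 1 / ln(M)`. `HnswConfig` and its fields are hypothetical names, not the project's actual types.

```rust
/// Hypothetical index configuration with the layer cap as a field
/// rather than a compile-time constant.
pub struct HnswConfig {
    /// Connectivity parameter M (assumed >= 2).
    pub m: usize,
    /// Maximum number of layers; 32 would reproduce the current behavior.
    pub max_layers: usize,
}

impl HnswConfig {
    /// Map a uniform sample in (0, 1] to a layer index, capped at
    /// `max_layers - 1`. The caller supplies the random sample, which
    /// keeps this function deterministic and easy to test.
    pub fn level_for(&self, uniform: f64) -> usize {
        let ml = 1.0 / (self.m as f64).ln();
        // Float-to-int `as` casts saturate, so an infinite value
        // (uniform == 0.0) becomes usize::MAX and is then capped.
        let level = (-uniform.ln() * ml).floor() as usize;
        level.min(self.max_layers.saturating_sub(1))
    }
}
```

Under this rule the chance of drawing level 32 or higher is about M^-32, so the cap only matters in practice when M is very small. A configurable value mainly removes the magic number and lets per-node layer storage be sized from the config.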

4. Memory allocation in search hot path

  • Location: HNSW/IVFFlat search functions
  • Issue: .clone() calls during search allocate memory on every query
  • Impact: Allocator pressure, cache misses
  • Fix: Borrow instead of cloning where possible; otherwise use an arena allocator or pre-allocated buffers
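The pre-allocated buffer idea can be sketched as follows. It uses a flat scan rather than the real HNSW/IVFFlat traversal, and `SearchScratch` and `top_k` are hypothetical names. The point is that vectors are borrowed rather than cloned and that the candidate buffer is reused across queries.

```rust
/// Hypothetical per-thread scratch space reused across searches.
pub struct SearchScratch {
    candidates: Vec<(f32, u32)>,
}

impl SearchScratch {
    pub fn with_capacity(n: usize) -> Self {
        Self { candidates: Vec::with_capacity(n) }
    }

    pub fn capacity(&self) -> usize {
        self.candidates.capacity()
    }
}

/// Return the k nearest (squared L2 distance, id) pairs.
/// Vectors are only borrowed; the result lives in the scratch buffer.
pub fn top_k<'s>(
    query: &[f32],
    vectors: &[Vec<f32>],
    k: usize,
    scratch: &'s mut SearchScratch,
) -> &'s [(f32, u32)] {
    // clear() drops the old contents but keeps the allocation.
    scratch.candidates.clear();
    for (id, v) in vectors.iter().enumerate() {
        let d: f32 = query.iter().zip(v).map(|(x, y)| (x - y) * (x - y)).sum();
        scratch.candidates.push((d, id as u32));
    }
    scratch.candidates.sort_unstable_by(|a, b| a.0.total_cmp(&b.0));
    scratch.candidates.truncate(k);
    &scratch.candidates
}
```

After the first query has grown the buffer to its working size, later queries of the same shape allocate nothing in this function.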

5. Fragile typmod array parsing

  • Location: Vector type parsing
  • Issue: Manual parsing of dimension from typmod
  • Impact: Could fail on edge cases
  • Fix: Use proper PostgreSQL typmod utilities
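On the C side, PostgreSQL provides `ArrayGetIntegerTypmods` for extracting the integer modifiers from the typmod input array. Whichever helper does the extraction, the values still need validating. The sketch below assumes the integers are already available as a slice. `MAX_DIMS` and all function and type names are hypothetical, and the assumption that the typmod stores the dimension directly is not confirmed by this issue.

```rust
/// Hypothetical upper bound on vector dimensions.
pub const MAX_DIMS: i32 = 16_000;

#[derive(Debug, PartialEq, Eq)]
pub enum TypmodError {
    /// The type modifier list did not contain exactly one value.
    WrongArity(usize),
    /// The dimension was outside 1..=MAX_DIMS.
    OutOfRange(i32),
}

/// Validate the modifier list of a declaration such as `vector(384)`.
pub fn dims_from_typmod_args(args: &[i32]) -> Result<i32, TypmodError> {
    match args {
        [d] if (1..=MAX_DIMS).contains(d) => Ok(*d),
        [d] => Err(TypmodError::OutOfRange(*d)),
        _ => Err(TypmodError::WrongArity(args.len())),
    }
}

/// Decode a stored typmod. PostgreSQL uses -1 for "no modifier",
/// which maps to None here (dimension unconstrained).
pub fn dims_from_typmod(typmod: i32) -> Option<usize> {
    if typmod < 1 {
        None
    } else {
        Some(typmod as usize)
    }
}
```

Keeping validation in one function gives the edge cases (empty list, extra values, zero or negative dimensions) a single place to be tested.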

Labels: enhancement (New feature or request)
