fix(postgres): extract HNSW dimensions from column typmod instead of hardcoding 128 #68
Summary
- Fix hardcoded `dimensions = 128` in HNSW index build functions
- When a column is declared as `ruvector(384)`, the index correctly uses 384 dimensions
Problem
The `hnsw_build()` function at `crates/ruvector-postgres/src/index/hnsw_am.rs:272` hardcoded `dimensions = 128`. This caused all HNSW indexes to be treated as 128-dimensional, regardless of actual vector size.
Solution
Extract dimensions from PostgreSQL's type modifier system:
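As a rough sketch of that lookup (not the PR's actual code), the conversion from a type modifier to a dimension count could look like the following. The helper name `dimensions_from_typmod` and the error wording are hypothetical. It assumes the typmod holds the declared dimension directly, and relies on PostgreSQL's convention of reporting `-1` when a column has no type modifier.

```rust
/// Hypothetical helper: turn a column's type modifier into a dimension count.
/// Assumes the typmod holds the declared dimension directly (as with `ruvector(384)`)
/// and that PostgreSQL reports -1 when no modifier was given.
pub fn dimensions_from_typmod(typmod: i32) -> Result<usize, String> {
    if typmod > 0 {
        Ok(typmod as usize)
    } else {
        // Illustrative wording only; the PR's actual error message may differ.
        Err(format!(
            "vector column has no declared dimensions (typmod = {typmod}); \
             declare it as ruvector(N), e.g. ruvector(384)"
        ))
    }
}
```

In an index build function, the value passed in would come from the indexed column's `atttypmod`; returning a `Result` lets the caller raise a PostgreSQL error instead of silently falling back to a default.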
When users declare `ruvector(384)`, PostgreSQL stores `384` in `atttypmod`.
Changes
- `hnsw_build()`: Extract dimensions from typmod, error if not specified
- `hnsw_buildempty()`: Also extract dimensions for empty index creation
- Fix test compilation errors (`max_layers` field, lifetime errors, JsonB type mismatches)
Error Handling
If the column is declared without explicit dimensions (e.g., `ruvector` instead of `ruvector(384)`), users get a helpful error with guidance.
Test Plan
- Compilation check (`cargo check --lib`)
- HNSW index on a `ruvector(384)` column - should use 384 dimensions
- HNSW index on a `ruvector` column - should error with guidance
CI Failures (Pre-existing Issues)
The CI failures are not caused by this PR. They are pre-existing bugs in the CI workflow:
- Non-existent features: The workflow references `hybrid-search` and `filtered-search` features that don't exist in `Cargo.toml`. There's even a comment in the crate: `# Note: hybrid-search and filtered-search are planned for future releases`
- Missing Docker image: `ruvector-postgres:test` is referenced but never built
- Benchmark permissions: Fork PRs can't comment on upstream PRs due to GitHub Actions permissions
These issues exist on the `main` branch and should be addressed in a separate PR.
Architectural Recommendation: Test Structure
While fixing test compilation errors, I noticed a systemic pattern where `#[pg_test]` tests pass `Vec<T>` directly to functions that expect `JsonB`. This makes tests awkward and couples test code to PostgreSQL serialization.
Recommendation: Refactor PostgreSQL operator functions to have two layers:
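A minimal sketch of the two layers, written so it compiles without pgrx. The operator `l2_norm`, the stand-in type `JsonArrayText`, and the tiny parser are all hypothetical; in the real crate the outer function would presumably take pgrx's `JsonB` and be exported to SQL, while the `_impl` function would stay pure Rust.

```rust
/// Stand-in for a JSONB argument so this sketch needs no dependencies.
/// Assumption: the real wrapper would accept pgrx's `JsonB` instead.
pub struct JsonArrayText(pub String);

/// Layer 1: pure logic on native Rust types. Unit tests call this directly.
pub fn l2_norm_impl(values: &[f32]) -> f32 {
    values.iter().map(|v| v * v).sum::<f32>().sqrt()
}

/// Layer 2: thin wrapper that only deserializes its input and delegates.
/// In the extension this is the function that would be exposed to SQL.
pub fn l2_norm(input: JsonArrayText) -> Result<f32, String> {
    let values = parse_flat_array(&input.0)?;
    Ok(l2_norm_impl(&values))
}

/// Deliberately small parser for a flat JSON array of numbers, e.g. "[3, 4]".
fn parse_flat_array(text: &str) -> Result<Vec<f32>, String> {
    let inner = text
        .trim()
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .ok_or_else(|| format!("expected a JSON array, got: {text}"))?;
    if inner.trim().is_empty() {
        return Ok(Vec::new());
    }
    inner
        .split(',')
        .map(|item| {
            item.trim()
                .parse::<f32>()
                .map_err(|e| format!("bad number {item:?}: {e}"))
        })
        .collect()
}
```

With this split, a plain `#[test]` can exercise `l2_norm_impl(&[3.0, 4.0])` with no serialization involved, and only the wrapper needs a database-backed test.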
Benefits:
- Unit tests can call `_impl()` functions directly with native Rust types - no JsonB wrapping
- Integration tests (`#[pg_test]`) test the full PostgreSQL interface through SQL
This pattern should be applied to `gnn/operators.rs`, `attention/operators.rs`, `learning/operators.rs`, and similar files. This would be a good follow-up refactoring task.