Add incoming edges debug API for HNSW #918

Open
meiravgri wants to merge 1 commit into main from meiravg_hnsw_inc_edges_debug_v2

@meiravgri
Collaborator

@meiravgri meiravgri commented Mar 17, 2026

Add debug APIs to inspect and optimize incoming unidirectional edges in HNSW graphs.

Changes

  • shrinkIncomingEdges() - Reclaim unused capacity in incoming edge vectors
  • getHNSWElementIncomingEdges() - Get incoming edge counts per level for a label
  • shrinkAllIncomingEdges() - Shrink all incoming edge vectors in the index
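
The per-element operations listed above can be illustrated with a minimal sketch. It assumes incoming edges are stored in a plain `std::vector` per level; `LevelDataSketch`, `ElementSketch`, and `incomingEdgeCounts` are hypothetical names for illustration and not the types or signatures added by this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one level of one element; not the VecSim type.
struct LevelDataSketch {
    std::vector<std::uint32_t> incomingEdges;

    // Ask the vector to drop spare capacity and report the bytes given back.
    // shrink_to_fit is a non-binding request, so the result may be 0.
    std::size_t shrinkIncomingEdges() {
        const std::size_t before = incomingEdges.capacity();
        incomingEdges.shrink_to_fit();
        const std::size_t after = incomingEdges.capacity();
        assert(after <= before); // shrink_to_fit never increases capacity
        return (before - after) * sizeof(std::uint32_t);
    }
};

// Hypothetical element: one LevelDataSketch per graph level, level 0 first.
struct ElementSketch {
    std::vector<LevelDataSketch> levels;

    // Incoming edge count per level, similar in spirit to what
    // getHNSWElementIncomingEdges() is described as returning.
    std::vector<std::size_t> incomingEdgeCounts() const {
        std::vector<std::size_t> counts;
        counts.reserve(levels.size());
        for (const auto &lvl : levels) {
            counts.push_back(lvl.incomingEdges.size());
        }
        return counts;
    }
};
```

Shrinking changes only capacity, so the per-level counts are the same before and after the call.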

C API Exports

  • VecSimDebug_GetElementIncomingEdgesInHNSWGraph
  • VecSimDebug_ReleaseElementIncomingEdgesInHNSWGraph
  • VecSimDebug_ShrinkIncomingEdgesInHNSWGraph
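
The Get/Release pair suggests the usual C ownership pattern: the getter allocates a result and a matching releaser frees it. The real signatures are not shown in this description, so the sketch below uses hypothetical `demo_*` names and a hypothetical result layout purely to illustrate the pattern.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

extern "C" {

// Hypothetical result layout: counts[i] is the incoming edge count at level i.
typedef struct {
    size_t *counts;
    size_t num_levels;
} demo_incoming_edge_counts;

// Hypothetical getter: copies per-level counts into memory owned by the result.
// Returns NULL if an allocation fails.
demo_incoming_edge_counts *demo_get_incoming_edge_counts(const size_t *src, size_t num_levels) {
    assert(num_levels == 0 || src != nullptr);
    auto *res = static_cast<demo_incoming_edge_counts *>(malloc(sizeof(demo_incoming_edge_counts)));
    if (res == nullptr) {
        return nullptr;
    }
    res->num_levels = num_levels;
    res->counts = nullptr;
    if (num_levels > 0) {
        res->counts = static_cast<size_t *>(malloc(num_levels * sizeof(size_t)));
        if (res->counts == nullptr) {
            free(res);
            return nullptr;
        }
        memcpy(res->counts, src, num_levels * sizeof(size_t));
    }
    return res;
}

// Hypothetical releaser: frees everything the getter allocated. NULL is a no-op.
void demo_release_incoming_edge_counts(demo_incoming_edge_counts *res) {
    if (res == nullptr) {
        return;
    }
    free(res->counts);
    free(res);
}

} // extern "C"
```

Pairing every successful get with exactly one release keeps allocation and deallocation on the same side of the library boundary.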

Includes tiered index support and unit test.


Pull Request opened by Augment Code with guidance from the PR author


Note

Medium Risk
Adds new debug/maintenance APIs that traverse and mutate HNSW graph edge-storage (shrink_to_fit) under locks; risk is mainly around memory/perf characteristics and potential contention, not query correctness.

Overview
Adds a new HNSW debug capability to inspect incoming unidirectional edges by returning per-level counts for a given label (getHNSWElementIncomingEdges), with tiered-index support and C API exports (VecSimDebug_GetElementIncomingEdgesInHNSWGraph + releaser).

Introduces a memory reclamation path for incoming-edge storage: ElementLevelData::shrinkIncomingEdges() and index-wide shrinkAllIncomingEdges() (plus tiered wrapper and C API VecSimDebug_ShrinkIncomingEdgesInHNSWGraph) to reclaim unused vector capacity that can accumulate after deletion repairs, with a unit test validating allocation decreases and idempotence.
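
To make the "capacity accumulates after deletion repairs" point concrete, here is a rough model of the index-wide path and of the idempotence property the unit test is said to check. It assumes plain `std::vector` storage; `IncomingEdgesSketch`, `shrinkAllIncomingEdgesSketch`, and `simulateRepairChurn` are hypothetical names, and the real index uses its own containers and allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: incoming[i][l] holds the incoming edges of element i at level l.
using IncomingEdgesSketch = std::vector<std::vector<std::vector<std::uint32_t>>>;

// Shrink every incoming-edge vector and return the total bytes reclaimed.
std::size_t shrinkAllIncomingEdgesSketch(IncomingEdgesSketch &incoming) {
    std::size_t total_saved = 0;
    for (auto &element : incoming) {
        for (auto &edges : element) {
            const std::size_t before = edges.capacity();
            edges.shrink_to_fit(); // non-binding request; may leave capacity as-is
            assert(edges.capacity() <= before);
            total_saved += (before - edges.capacity()) * sizeof(std::uint32_t);
        }
    }
    return total_saved;
}

// Simulate repair churn: edges are added and most are later removed, which
// lowers size() but leaves capacity() where it was.
void simulateRepairChurn(std::vector<std::uint32_t> &edges, std::size_t added, std::size_t kept) {
    assert(kept <= added);
    for (std::size_t i = 0; i < added; ++i) {
        edges.push_back(static_cast<std::uint32_t>(i));
    }
    edges.resize(kept);
}
```

A second call right after the first should report zero bytes saved, because no spare capacity is left to release.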

Written by Cursor Bugbot for commit 7dc2b4b. This will update automatically on new commits.

- Add shrinkIncomingEdges() to ElementLevelData to reclaim unused capacity
- Add getHNSWElementIncomingEdges() to get incoming edge counts per level
- Add shrinkAllIncomingEdges() to shrink all incoming edge vectors in index
- Add C API exports: VecSimDebug_GetElementIncomingEdgesInHNSWGraph,
  VecSimDebug_ReleaseElementIncomingEdgesInHNSWGraph,
  VecSimDebug_ShrinkIncomingEdgesInHNSWGraph
- Add tiered index wrappers
- Add unit test for shrinkIncomingEdges
@jit-ci

jit-ci bot commented Mar 17, 2026

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR


Security scan by Jit

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

}
}
return total_saved;
}

Missing locks in shrinkAllIncomingEdges causes data races

High Severity

shrinkAllIncomingEdges reads curElementCount, iterates over all elements, and mutates incoming edge vectors via shrink_to_fit (which reallocates memory) without holding indexDataGuard or per-node locks. In contrast, the read-only getHNSWElementIncomingEdges and getHNSWElementNeighbors both acquire a shared lock on indexDataGuard and call lockNodeLinks. A concurrent insert/delete could modify curElementCount or the same incoming-edges vector, causing undefined behavior (e.g., use-after-free). The tiered wrapper takes only a shared lock on mainIndexGuard and still skips the inner indexDataGuard.
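
One possible shape for a fix is sketched below. It is an assumption-laden model, not the project's code: `IndexSketch` and `NodeSketch` are hypothetical, `std::shared_mutex` and `std::mutex` stand in for indexDataGuard and the per-node link locks, and whether a shared or an exclusive index-wide lock is appropriate depends on how writers use the guard, which is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical node: a per-node lock plus its incoming edges (single level).
struct NodeSketch {
    std::mutex linksGuard;
    std::vector<std::uint32_t> incomingEdges;
};

// Hypothetical index: an index-wide guard plus the nodes it protects.
struct IndexSketch {
    std::shared_mutex indexDataGuard;
    std::vector<NodeSketch> nodes;

    explicit IndexSketch(std::size_t n) : nodes(n) {}

    std::size_t shrinkAllIncomingEdges() {
        // Hold the index-wide guard for the whole traversal so that threads
        // taking it exclusively cannot resize the index underneath the loop.
        std::shared_lock<std::shared_mutex> indexLock(indexDataGuard);
        std::size_t total_saved = 0;
        for (auto &node : nodes) {
            // Serialize with any other thread that locks this node's links,
            // since shrink_to_fit may reallocate the vector's buffer.
            std::lock_guard<std::mutex> nodeLock(node.linksGuard);
            const std::size_t before = node.incomingEdges.capacity();
            node.incomingEdges.shrink_to_fit();
            assert(node.incomingEdges.capacity() <= before);
            total_saved += (before - node.incomingEdges.capacity()) * sizeof(std::uint32_t);
        }
        return total_saved;
    }
};
```

Taking the per-node lock inside the loop keeps each critical section short, at the cost of one lock acquisition per element.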

@codecov

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 18.18182% with 108 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.76%. Comparing base (8d37cf1) to head (7dc2b4b).

Files with missing lines                    Patch %   Lines
src/VecSim/vec_sim_debug.cpp                11.95%    81 Missing ⚠️
src/VecSim/algorithms/hnsw/hnsw.h           32.00%    17 Missing ⚠️
src/VecSim/algorithms/hnsw/hnsw_tiered.h     0.00%    10 Missing ⚠️

❌ Your project check has failed because the head coverage (95.76%) is below the adjusted base coverage (96.14%). You can increase the head coverage or adjust the Removed Code Behavior.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #918      +/-   ##
==========================================
- Coverage   97.14%   95.76%   -1.39%     
==========================================
  Files         129      129              
  Lines        7557     7689     +132     
==========================================
+ Hits         7341     7363      +22     
- Misses        216      326     +110     
