Fix race condition in VecSimTieredIndex::debugInfoIterator during background indexing #919

Open
JoanFM wants to merge 2 commits into main from joan-fix-flaky-test
JoanFM commented Mar 19, 2026

Describe the changes in the pull request

Fixes a race condition that causes a crash when debugInfoIterator() is called on a tiered SVS index while background indexing is running.

Root Cause Analysis

The crash occurs because VecSimTieredIndex::debugInfoIterator() calls backendIndex->debugInfoIterator() without holding any locks. During async background indexing, the backend SVS index's internal state (impl_) can be modified by the background thread while the main thread is accessing it through debugInfo() → indexLabelCount() → impl_->size().

Stack trace from the crash:

SVSIndex::indexLabelCount
  ← SVSIndex::debugInfo
  ← SVSIndex::debugInfoIterator  
  ← VecSimTieredIndex::debugInfoIterator (NO LOCK HELD)
  ← TieredSVSIndex::debugInfoIterator
  ← VecsimInfo

The interleaved log messages showed that background indexing was running (rounds 2-12 of 40) when VecsimInfo was called, confirming the race.

Fix

Acquire shared locks (flatIndexGuard and mainIndexGuard) before calling frontendIndex->debugInfoIterator() and backendIndex->debugInfoIterator(). This follows the same lock ordering pattern used elsewhere in the codebase (e.g., in debugInfo() and indexLabelCount()).
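
To illustrate the locking pattern described above, here is a minimal, self-contained sketch. It is not the VecSim code: `SketchTieredIndex`, `SketchSubIndex`, `SketchDebugField`, `addToFrontend`, and `moveOneToBackend` are hypothetical names, and the guards are assumed to be `std::shared_mutex` (C++17). Only the guard names `flatIndexGuard` and `mainIndexGuard` are taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a sub-index (flat buffer or backend).
struct SketchSubIndex {
    std::vector<int> labels;
    std::size_t indexLabelCount() const { return labels.size(); }
};

// Hypothetical debug field; the real iterator carries richer data.
struct SketchDebugField {
    std::string name;
    std::size_t value;
};

class SketchTieredIndex {
public:
    // Writer: append a label to the flat buffer under an exclusive lock.
    void addToFrontend(int label) {
        std::unique_lock<std::shared_mutex> lock(flatIndexGuard);
        frontend.labels.push_back(label);
    }

    // Writer, as a background indexing job might behave: move one label
    // from the flat buffer to the backend. Lock order: flat, then main.
    void moveOneToBackend() {
        std::unique_lock<std::shared_mutex> flatLock(flatIndexGuard);
        std::unique_lock<std::shared_mutex> mainLock(mainIndexGuard);
        if (frontend.labels.empty()) return;
        backend.labels.push_back(frontend.labels.back());
        frontend.labels.pop_back();
    }

    // Reader: take a shared lock around each sub-index read, in the same
    // flat-then-main order, so a writer cannot mutate state mid-read.
    std::vector<SketchDebugField> debugInfo() const {
        std::vector<SketchDebugField> out;
        {
            std::shared_lock<std::shared_mutex> lock(flatIndexGuard);
            out.push_back({"frontend_label_count", frontend.indexLabelCount()});
        }
        {
            std::shared_lock<std::shared_mutex> lock(mainIndexGuard);
            out.push_back({"backend_label_count", backend.indexLabelCount()});
        }
        return out;
    }

private:
    mutable std::shared_mutex flatIndexGuard;
    mutable std::shared_mutex mainIndexGuard;
    SketchSubIndex frontend;
    SketchSubIndex backend;
};
```

In this sketch the reader never holds both guards at once, and the writer always takes them in the same flat-then-main order, so the two paths cannot deadlock against each other.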

Main objects this PR modified

  1. VecSimTieredIndex::debugInfoIterator() in vec_sim_tiered_index.h

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Note

Medium Risk
Touches concurrency/locking around tiered index introspection; while scoped to debug info, incorrect locking could still introduce contention or deadlocks under load.

Overview
Fixes a crash/race in VecSimTieredIndex::debugInfoIterator() by acquiring shared_locks on flatIndexGuard and mainIndexGuard when generating the frontend/backend sub-index debug iterators.

Locks are taken per sub-index (and released immediately after) to keep the iterator construction thread-safe during background indexing while minimizing lock contention.
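
The effect of the narrow lock scope can be checked with a small probe. The sketch below is an assumption-laden illustration, not code from this PR: `writerCouldLock`, `ScopeProbe`, and `probeSharedLockScope` are hypothetical helpers, and the guard is assumed to be a `std::shared_mutex` (C++17). It shows that an exclusive (writer) lock is refused while a shared lock is held and becomes available as soon as the reader's scope ends.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <shared_mutex>

// Try to take an exclusive lock from a separate thread. A separate thread
// is used because calling try_lock on a thread that already owns the mutex
// (in any mode) is undefined behavior.
inline bool writerCouldLock(std::shared_mutex &guard) {
    return std::async(std::launch::async, [&guard] {
        std::unique_lock<std::shared_mutex> lock(guard, std::try_to_lock);
        return lock.owns_lock();
    }).get();
}

struct ScopeProbe {
    bool writerBlockedWhileReading;
    bool writerFreeAfterReading;
};

inline ScopeProbe probeSharedLockScope() {
    std::shared_mutex mainIndexGuard; // name borrowed from the PR
    ScopeProbe result{};
    {
        // Reader scope: equivalent to the window in which a sub-index
        // debug iterator would be built.
        std::shared_lock<std::shared_mutex> readLock(mainIndexGuard);
        result.writerBlockedWhileReading = !writerCouldLock(mainIndexGuard);
    } // shared lock released here
    // This thread no longer owns the mutex, so a blocking exclusive lock
    // is safe and succeeds immediately.
    std::unique_lock<std::shared_mutex> writeLock(mainIndexGuard);
    result.writerFreeAfterReading = writeLock.owns_lock();
    return result;
}
```

Calling `probeSharedLockScope()` is expected to report the writer as blocked inside the reader scope and free afterwards, which is the trade-off the per-sub-index locking aims for: writers wait only for the duration of one sub-index read.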

Written by Cursor Bugbot for commit 409ffff. This will update automatically on new commits.

JoanFM changed the title from "proposal to fix flaky test" to "Fix race condition in VecSimTieredIndex::debugInfoIterator during background indexing" Mar 19, 2026
jit-ci bot commented Mar 19, 2026

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR


Security scan by Jit

@JoanFM JoanFM requested review from GuyAv46 and meiravgri and removed request for GuyAv46 March 19, 2026 16:33
codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.98%. Comparing base (3dd2dc2) to head (409ffff).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #919   +/-   ##
=======================================
  Coverage   96.98%   96.98%           
=======================================
  Files         129      129           
  Lines        7567     7569    +2     
=======================================
+ Hits         7339     7341    +2     
  Misses        228      228           
