Skip to content

Race condition in make_pending_calls in free-threaded build #122201

@colesbury

Description

@colesbury

Bug report

make_pending_calls uses a mutex and the the handling_thread field to ensure that only one thread per-interpreter is handling pending calls at a time:

cpython/Python/ceval_gil.c

Lines 911 to 928 in 41a91bd

/* Only one thread (per interpreter) may run the pending calls
at once. In the same way, we don't do recursive pending calls. */
PyMutex_Lock(&pending->mutex);
if (pending->handling_thread != NULL) {
/* A pending call was added after another thread was already
handling the pending calls (and had already "unsignaled").
Once that thread is done, it may have taken care of all the
pending calls, or there might be some still waiting.
To avoid all threads constantly stopping on the eval breaker,
we clear the bit for this thread and make sure it is set
for the thread currently handling the pending call. */
_Py_set_eval_breaker_bit(pending->handling_thread, _PY_CALLS_TO_DO_BIT);
_Py_unset_eval_breaker_bit(tstate, _PY_CALLS_TO_DO_BIT);
PyMutex_Unlock(&pending->mutex);
return 0;
}
pending->handling_thread = tstate;
PyMutex_Unlock(&pending->mutex);

However, the clearing of handling_thread is done outside of the mutex:

cpython/Python/ceval_gil.c

Lines 959 to 960 in 41a91bd

pending->handling_thread = NULL;
return 0;

There are two problems with this (for the free-threaded build):

  • It's a data race because there's a read of handling_thread (in the mutex) concurrently with a write (outside the mutex)
  • The logic in that sets _PY_CALLS_TO_DO_BIT on pending->handling_thread is subject to a time-of-check to time-of-use hazard: pending->handling_thread may be non-NULL when evaluating the if-statement, but then cleared before setting the eval breaker bit.

Relevant unit test

TSan catches this race when running test_subthreads_can_handle_pending_calls from test_capi.test_misc.

Suggested Fix

We should set pending->handling_thread = NULL; while holding pending->mutex, at least in the free-threaded build

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    3.13bugs and security fixes3.14bugs and security fixestopic-free-threadingtype-bugAn unexpected behavior, bug, or error

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions