Skip to content

Fix virtual-thread pinning in StatsDAggregator#295

Open
jbachorik wants to merge 5 commits intomasterfrom
jb/pinning_fix
Open

Fix virtual-thread pinning in StatsDAggregator#295
jbachorik wants to merge 5 commits intomasterfrom
jb/pinning_fix

Conversation

@jbachorik
Copy link
Contributor

@jbachorik jbachorik commented Feb 25, 2026

What does this PR do?

Replaces synchronized(map) blocks in StatsDAggregator with explicit per-shard ReentrantLock, eliminating the virtual-thread carrier-thread pinning source. Also removes the now-uncontended synchronized(map) from the aggregation sharding test.

synchronized on a regular object monitor pins a virtual thread to its carrier thread for the duration of the lock (JEP 444). ReentrantLock (backed by AbstractQueuedSynchronizer) parks the virtual thread instead, freeing the carrier. Note: JDK 24+ resolves this at the JVM level via JEP 491, but this fix benefits all supported JDK versions.

Motivation

Virtual threads pin to their carrier thread when blocked inside a synchronized block. With the aggregator under load from virtual-thread callers this caused unnecessary carrier-thread starvation.

Benchmarks

JMH VirtualThreadAggregatorBenchmark run against master (baseline) and this branch shows the fix eliminates the pinning source with no meaningful throughput regression:

  • Single-thread path improves up to +14%
  • Mid-to-high concurrency (4–16 threads) is neutral to slightly positive
  • No regressions outside measurement noise

Replaces per-shard synchronized(map) blocks with explicit
ReentrantReadWriteLock[], eliminating the virtual-thread pinning
source. Also switches aggregateMetrics from ArrayList to a plain
array to avoid indirection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik jbachorik added the AI Generated with AI assistance label Feb 25, 2026
jbachorik and others added 3 commits February 25, 2026 19:53
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ReadWriteLock was overkill since only write paths exist.
ReentrantLock is simpler and has lower AQS overhead.
Also removed the now-uncontended synchronized(map) block from
the test that was locking on an object no production code uses.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik jbachorik marked this pull request as ready for review February 25, 2026 20:22
@jbachorik jbachorik requested a review from a team as a code owner February 25, 2026 20:22
public static int DEFAULT_SHARDS = 4; // 4 partitions to reduce contention.

protected final String AGGREGATOR_THREAD_NAME = "statsd-aggregator-thread";
protected final ArrayList<Map<Message, Message>> aggregateMetrics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to keep this PR about the lock only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@vickenty
Copy link
Contributor

Also, please add link to the JEP where synchronized issue and solutions are explained.

Keep the PR focused on the lock change only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Contributor Author

@vickenty Added relevant info sources to PR description

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Generated with AI assistance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants