fix performance regression in the nvexec maxwell examples #1699

Merged
ericniebler merged 1 commit into NVIDIA:main from ericniebler:fix-cuda-stream-scheduler-performance-regression on Dec 4, 2025

@ericniebler (Collaborator)

#1683 introduced a performance regression in the nvexec maxwell_gpu* examples. This PR restores performance to its previous level.

$ ./examples/nvexec/maxwell_gpu_s --iterations=1000 --N=512 --run-cuda --run-stdpar --run-stream-scheduler
                  method, elapsed [s],   BW [GB/s]
              GPU (cuda),       0.016,     752.458
   GPU (snr cuda stream),       0.016,     725.169
            GPU (stdpar),       0.034,     342.540
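To compare the rows above at a glance, here is a minimal sketch that parses the benchmark table and reports each method's bandwidth relative to the raw CUDA run. The helper names `parse_results` and `relative_bandwidth` are hypothetical and not part of stdexec or nvexec; the numbers are copied verbatim from the output above. By this measure the stream scheduler reaches roughly 96% of the raw CUDA bandwidth in this run.

```python
# Benchmark table copied verbatim from the run above.
RAW = """\
                  method, elapsed [s],   BW [GB/s]
              GPU (cuda),       0.016,     752.458
   GPU (snr cuda stream),       0.016,     725.169
            GPU (stdpar),       0.034,     342.540
"""


def parse_results(text):
    """Return {method: (elapsed_s, bandwidth_gb_per_s)} from the comma-separated table."""
    results = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        method, elapsed, bandwidth = (field.strip() for field in line.split(","))
        results[method] = (float(elapsed), float(bandwidth))
    return results


def relative_bandwidth(results, baseline="GPU (cuda)"):
    """Return each method's bandwidth as a fraction of the baseline method's bandwidth."""
    base_bw = results[baseline][1]
    return {method: bw / base_bw for method, (_, bw) in results.items()}


if __name__ == "__main__":
    for method, ratio in relative_bandwidth(parse_results(RAW)).items():
        print(f"{method}: {ratio:.1%} of baseline bandwidth")
```

A single run like this is only a spot check; repeating the benchmark and comparing against the pre-#1683 numbers would give a firmer picture.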

@ericniebler ericniebler merged commit 57a0317 into NVIDIA:main Dec 4, 2025
34 of 35 checks passed