Conversation
Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.

/ok to test

Looking at the post-action cleanup log, we don't seem to be populating the cache at all. I suspect that on Linux this is because cibuildwheel launches a manylinux container, and if so we need to mount sccache into the container. But that doesn't explain the Windows situation, which doesn't use a container; there we build wheels on the bare VM.
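For context, a rough sketch of what mounting sccache into the cibuildwheel container could look like; the mount path and the set of forwarded variables are assumptions on my part, not necessarily what this PR does:

```yaml
# Hypothetical sketch: expose the host's sccache binary and the GitHub
# Actions cache credentials to the manylinux container that cibuildwheel
# launches on Linux.
env:
  # Bind-mount the sccache binary installed on the runner into the container.
  CIBW_CONTAINER_ENGINE: "docker; create_args: --volume /usr/local/bin/sccache:/usr/local/bin/sccache"
  # Forward the variables sccache's GHA backend reads into the container.
  CIBW_ENVIRONMENT_PASS_LINUX: "SCCACHE_GHA_ENABLED ACTIONS_CACHE_URL ACTIONS_RUNTIME_TOKEN"
  # Point the build at the wrapper inside the container.
  CIBW_ENVIRONMENT: "CMAKE_C_COMPILER_LAUNCHER=sccache CMAKE_CXX_COMPILER_LAUNCHER=sccache"
```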
Do we need to convince setuptools to prepend `sccache` to the compiler command?

Yeah, it looks like this action (unlike the ccache-based one) doesn't install the symlinks. The docs have examples setting the compiler-launcher environment variables.
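As a point of reference, a hedged sketch of that wiring, assuming the mozilla-actions/sccache-action and a CMake-driven extension build (the action version and the setuptools variant in the comments are assumptions):

```yaml
# Sketch only: sccache-action installs the binary but, unlike ccache-action,
# creates no compiler symlinks, so the build must invoke sccache explicitly.
- name: Install sccache
  uses: mozilla-actions/sccache-action@v0.0.9  # version is an assumption
- name: Build wheels
  env:
    SCCACHE_GHA_ENABLED: "true"
    # CMake-based builds pick up the launcher variables directly:
    CMAKE_C_COMPILER_LAUNCHER: sccache
    CMAKE_CXX_COMPILER_LAUNCHER: sccache
    # A plain setuptools build would instead prepend the wrapper:
    # CC: "sccache gcc"
    # CXX: "sccache g++"
  run: pipx run cibuildwheel
```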
/ok to test

/ok to test

/ok to test

/ok to test

/ok to test

/ok to test

/ok to test

/ok to test
cpcloud left a comment:

My only question is about a possibly-unused variable, but that's not blocking.

/ok to test 6be8df7

/ok to test 1eb6a1b
I am thinking we should purge all the unsuccessful Windows attempts and get this merged.

I agree.

/ok to test

@mdboom, there was an error processing your request. See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/
/ok to test 6cfe70d

This is working with the Windows stuff removed. However, I'm not sure how to confirm it's actually working. The stats at the end of the log are (I guess) from the host, so they're basically irrelevant. But I'm also not seeing any speed improvement. For example:

I think the difference is within the noise, so we aren't seeing a notable speedup. If it is really caching, maybe the overhead of remote-caching large files is erasing any benefits?
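One way to get numbers we can trust would be to dump sccache's counters from inside the environment where the compilation actually happens, rather than on the host; a sketch, where the choice of cibuildwheel hooks is my assumption:

```yaml
# Sketch: read sccache's hit/miss counters inside the build environment
# (the manylinux container on Linux), not on the host VM.
env:
  # Reset counters right before each wheel build starts.
  CIBW_BEFORE_BUILD: "sccache --zero-stats"
  # Print counters once the build finishes, before tests run.
  CIBW_BEFORE_TEST: "sccache --show-stats"
```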
@mdboom Something caused a cache miss in the most recent CI run. I dunno what happened, but I see many new caches generated in the last hour or so. Let me kick off another run to ensure we get cache hits.

/ok to test 6cfe70d

Cool. Maybe the cache is somehow keyed off the contents of the workflow file?

Sounds crazy to me... I thought it was only sensitive to compiler versions, flags, arguments, and .cpp file contents. If the cache can be invalidated that easily by workflow changes, it's a bit annoying. GitHub cache space is only 10 GB and we've been managing it carefully so far; it's a scarce resource. Anyway, cache hits are back: https://github.com/NVIDIA/cuda-python/actions/runs/18755592039/job/53506633614?pr=1156#step:17:1334
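If workflow edits really can invalidate entries, one possible mitigation (my assumption, not something this PR does) is to pin sccache's GHA cache namespace so that only a deliberate bump starts a fresh cache:

```yaml
# Sketch: fix the key prefix sccache uses for the GitHub Actions cache so
# unrelated workflow churn doesn't spawn fresh entries and eat into the
# shared 10 GB quota; bump the value to invalidate everything on purpose.
env:
  SCCACHE_GHA_ENABLED: "true"
  SCCACHE_GHA_VERSION: "cuda-python-v1"  # hypothetical namespace key
```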
Great. And it looks like a cuda_bindings build is now about 55s vs. 220s. So that's roughly 3 minutes saved, plus the time saved on the two cuda_core builds. Not bad.

I tinkered with sccache locally, and the remaining 55s is mostly spent in the linker, which is a whole other can of worms :)

This is an alternative to #1154. We're experimenting here; we should pick only one of the two.