Skip to content

Conversation

@leo-pony
Copy link
Contributor

@leo-pony leo-pony commented Oct 24, 2025

Add Ascend NPU CANN backend support

In my local host, i test with my build command cann.Dockerfile could been succesfully builded, and started, as following see:
Test running successfully,

  1. Testing Commands:
    Build and start modelrunner
docker build -t docker/docker-model-backend-llamacpp:latest-cann -f llamacpp/native/cann.Dockerfile .
make docker-run LLAMA_SERVER_VERSION=latest BASE_IMAGE=ascendai/cann:8.2.rc2-910b-ubuntu22.04-py3.11  LLAMA_SERVER_VARIANT=cann

On another terminal run LLM model and inference job:

MODEL_RUNNER_HOST=http://0.0.0.0:8080  docker model run ai/qwen3:0.6B-F16 hi
  1. test result snapshot:
    Start model_runner successfully:
617a8ea7-a5ab-4934-82cd-1cf30a071eba 2d40d3eb-fa0e-4ba0-ae26-4c5e5a9c8de2

Inference successfully with Ascend NPU:
c46ff371-a67a-4664-b959-73f1beaa11af

Ascend NPU resource using:
09ef9e96-28f4-4fc5-8627-0f949140777c

  1. build log:
#18 55.53 -- Set runtime path of "install/bin/com.docker.llama-server" to ""
#18 DONE 55.6s

#19 [builder 12/15] RUN rm install/bin/*.py
#19 DONE 0.5s

#20 [builder 13/15] RUN rm -r install/lib/cmake
#20 DONE 0.6s

#21 [builder 14/15] RUN rm -r install/lib/pkgconfig
#21 DONE 0.6s

#22 [builder 15/15] RUN rm -r install/include
#22 DONE 0.6s

#23 [final 2/2] COPY --from=builder /llama-server/install /com.docker.llama-server.native.linux.cann8.2.rc2-910b.aarch64
#23 CACHED

#24 exporting to image
#24 exporting layers done
#24 writing image sha256:a81d08a1d02cd6e8ca95e75d657ea9496992446013296e83052807da84f21f04 done
#24 naming to docker.io/docker/model-runner-cann:latest done
#24 DONE 0.0s

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @leo-pony, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates comprehensive support for the Ascend NPU CANN backend, allowing the application to utilize Huawei's Ascend processors for accelerated AI model execution. The changes encompass updates to the command-line interface for GPU selection, enhanced GPU runtime detection, dynamic mounting of necessary Ascend driver paths within Docker containers, and modifications to the build process for CANN-specific controller images. This significantly broadens the application's hardware compatibility for AI workloads.

Highlights

  • CLI Command Extension: The --gpu flag in install-runner, reinstall-runner, restart-runner, and start-runner commands now accepts 'cann' as a valid option, enabling users to specify Ascend NPU support.
  • CANN Runtime Detection: Introduced GPUSupportCANN and a new HasCANNRuntime function within the gpu package to accurately detect the presence and availability of the Ascend NPU CANN runtime on the host system.
  • Dynamic Driver Mounting: Implemented a mechanism to dynamically identify and bind essential Ascend NPU driver paths (e.g., '/usr/local/dcmi', '/usr/local/bin/npu-smi') from the host to the Docker container, ensuring proper hardware access.
  • Docker Runtime Configuration: The Docker container creation process has been updated to automatically set the Docker runtime to 'cann' when Ascend NPU support is selected, optimizing container execution for Ascend hardware.
  • Controller Image Variant: The logic for selecting the appropriate controller image has been extended to include a 'cann' variant, ensuring that the correct, optimized image is utilized for Ascend NPU environments.
  • Dockerfile Updates: The llamacpp/native/cann.Dockerfile has been updated to leverage a newer CANN version (8.2.rc2-910b), switch to an ubuntu22.04 base image, and incorporate specific environment setup steps required for the Ascend toolkit during the build process.
  • Documentation Updates: All relevant CLI command documentation files (both YAML and Markdown formats) have been revised to clearly reflect the newly added 'cann' GPU support option, providing up-to-date user guidance.
  • Test Log Artifact: A test.log file was added, containing output from Go test runs. This appears to be an incidental artifact from development or testing.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/release.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • In tryGetBindAscendMounts, you’re calling fmt.Errorf but not using the returned error—switch to log.Printf or handle the error so the failure is actually reported.
  • Currently only libraries and version.info are mounted; you may also need to detect and bind relevant Ascend device nodes (e.g. /dev/ascend*, sockets, etc.) so the container can access the hardware.
  • In the Dockerfile’s build stage, make sure uname -m is defined before use and remove trailing whitespace on line-continuation backslashes to avoid unexpected shell parsing issues.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In tryGetBindAscendMounts, you’re calling fmt.Errorf but not using the returned error—switch to log.Printf or handle the error so the failure is actually reported.
- Currently only libraries and version.info are mounted; you may also need to detect and bind relevant Ascend device nodes (e.g. /dev/ascend*, sockets, etc.) so the container can access the hardware.
- In the Dockerfile’s build stage, make sure uname -m is defined before use and remove trailing whitespace on line-continuation backslashes to avoid unexpected shell parsing issues.

## Individual Comments

### Comment 1
<location> `llamacpp/native/cann.Dockerfile:43-51` </location>
<code_context>
+RUN --mount=type=cache,target=/root/.ccache \
+    cann_in_sys_path=/usr/local/Ascend/ascend-toolkit; \
+    cann_in_user_path=$HOME/Ascend/ascend-toolkit; \
+    if [ -f "${cann_in_sys_path}/set_env.sh" ]; then \
+        source ${cann_in_sys_path}/set_env.sh; \
+        export LD_LIBRARY_PATH=${cann_in_sys_path}/latest/lib64:${cann_in_sys_path}/latest/${uname_m}-linux/devlib:${LD_LIBRARY_PATH} ; \
+    elif [ -f "${cann_in_user_path}/set_env.sh" ]; then \
+        source "$HOME/Ascend/ascend-toolkit/set_env.sh"; \
+        export LD_LIBRARY_PATH=${cann_in_user_path}/latest/lib64:${cann_in_user_path}/latest/${uname_m}-linux/devlib:${LD_LIBRARY_PATH}; \ 
+    else \
+        echo "No Ascend Toolkit found"; \
+        exit 1; \
+    fi && \
+    cmake --build build --config Release && \
</code_context>

<issue_to_address>
**issue (bug_risk):** Potential issue with undefined variable 'uname_m'.

Please define 'uname_m' before using it in LD_LIBRARY_PATH to prevent runtime errors.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@leo-pony leo-pony force-pushed the ascend_cann_support branch from d75798c to a997f19 Compare October 24, 2025 09:44
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for Ascend NPU CANN backend. The changes are comprehensive, touching CLI commands, documentation, GPU detection, and container setup. Overall, the implementation is solid, but I've identified a few issues. There are inconsistencies in the updated documentation, a bug in the new Dockerfile that would cause the build to fail, an unhandled error, and a test log file that appears to have been committed by mistake. My review includes specific suggestions to address these points.

@leo-pony leo-pony force-pushed the ascend_cann_support branch 5 times, most recently from 9f029ad to a70a6c2 Compare October 25, 2025 07:58
@leo-pony
Copy link
Contributor Author

@ericcurtin I tested it on my local Ascend NPU host and everything seems to be working fine. Could you please help review the code?

Copy link
Contributor

@ericcurtin ericcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@leo-pony
Copy link
Contributor Author

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Thank you for your review! However, as I need to attend meetings all day these days, I'll address the CI issue this Thursday or Friday.

@leo-pony
Copy link
Contributor Author

leo-pony commented Oct 30, 2025

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin
Could you please help me re-trigger CI?

  1. I rebaseed the latest code.
  2. Fixed the issue where CI failed due to a failure to perform a consistency check using docsgen when 'make docs' after document modifications.

Others:
The reason why docker_model_version.yaml and cmd/cli/docs/reference/model.md were also modified is that docsgen optimized these two documents and generated the modified content.

@ericcurtin
Copy link
Contributor

There are conflicts and a build to fix.
But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Could you please help me re-trigger CI?

  1. I rebaseed the latest code.
  2. Fixed the issue where CI failed due to a failure to perform a consistency check using docsgen when 'make docs' after document modifications.

Others: The reason why docker_model_version.yaml and cmd/cli/docs/reference/model.md were also modified is that docsgen optimized these two documents and generated the modified content.

It looks great, hopefully CI goes green

@leo-pony leo-pony force-pushed the ascend_cann_support branch from 60f0954 to 0a66d2f Compare October 30, 2025 12:36
@ericcurtin
Copy link
Contributor

@leo-pony there's a minor conflict (again 😢 ), but once we fix that, I think it's merge time

Signed-off-by: leo-pony <nengjunma@outlook.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
@leo-pony leo-pony force-pushed the ascend_cann_support branch from 0a66d2f to b10f108 Compare October 30, 2025 12:57
@leo-pony
Copy link
Contributor Author

There are conflicts and a build to fix.
But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Thank you for your review! However, as I need to attend meetings all day these days, I'll address the CI issue this Thursday or Friday.

@ericcurtin Thanks for you quickly review! Please help trigger the CI once more.
I have handled the code conflicts, and It seems online CI and my local docsgen running result is different, I reverted the modify of docker_model.yaml and model.md

04201a44-d6e3-4a8a-b62f-b9131114e14e

@ericcurtin ericcurtin merged commit a431268 into docker:main Oct 30, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants