Add Ascend NPU CANN backend support #307

leo-pony · 2025-10-24T09:42:56Z

Add Ascend NPU CANN backend support

In my local host, i test with my build command cann.Dockerfile could been succesfully builded, and started, as following see:
Test running successfully,

Testing Commands:
Build and start modelrunner

docker build -t docker/docker-model-backend-llamacpp:latest-cann -f llamacpp/native/cann.Dockerfile .
make docker-run LLAMA_SERVER_VERSION=latest BASE_IMAGE=ascendai/cann:8.2.rc2-910b-ubuntu22.04-py3.11  LLAMA_SERVER_VARIANT=cann

On another terminal run LLM model and inference job:

MODEL_RUNNER_HOST=http://0.0.0.0:8080  docker model run ai/qwen3:0.6B-F16 hi

test result snapshot:
Start model_runner successfully:

Inference successfully with Ascend NPU:

Ascend NPU resource using:

build log:

#18 55.53 -- Set runtime path of "install/bin/com.docker.llama-server" to ""
#18 DONE 55.6s

#19 [builder 12/15] RUN rm install/bin/*.py
#19 DONE 0.5s

#20 [builder 13/15] RUN rm -r install/lib/cmake
#20 DONE 0.6s

#21 [builder 14/15] RUN rm -r install/lib/pkgconfig
#21 DONE 0.6s

#22 [builder 15/15] RUN rm -r install/include
#22 DONE 0.6s

#23 [final 2/2] COPY --from=builder /llama-server/install /com.docker.llama-server.native.linux.cann8.2.rc2-910b.aarch64
#23 CACHED

#24 exporting to image
#24 exporting layers done
#24 writing image sha256:a81d08a1d02cd6e8ca95e75d657ea9496992446013296e83052807da84f21f04 done
#24 naming to docker.io/docker/model-runner-cann:latest done
#24 DONE 0.0s

gemini-code-assist · 2025-10-24T09:43:33Z

Summary of Changes

Hello @leo-pony, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates comprehensive support for the Ascend NPU CANN backend, allowing the application to utilize Huawei's Ascend processors for accelerated AI model execution. The changes encompass updates to the command-line interface for GPU selection, enhanced GPU runtime detection, dynamic mounting of necessary Ascend driver paths within Docker containers, and modifications to the build process for CANN-specific controller images. This significantly broadens the application's hardware compatibility for AI workloads.

Highlights

CLI Command Extension: The --gpu flag in install-runner, reinstall-runner, restart-runner, and start-runner commands now accepts 'cann' as a valid option, enabling users to specify Ascend NPU support.
CANN Runtime Detection: Introduced GPUSupportCANN and a new HasCANNRuntime function within the gpu package to accurately detect the presence and availability of the Ascend NPU CANN runtime on the host system.
Dynamic Driver Mounting: Implemented a mechanism to dynamically identify and bind essential Ascend NPU driver paths (e.g., '/usr/local/dcmi', '/usr/local/bin/npu-smi') from the host to the Docker container, ensuring proper hardware access.
Docker Runtime Configuration: The Docker container creation process has been updated to automatically set the Docker runtime to 'cann' when Ascend NPU support is selected, optimizing container execution for Ascend hardware.
Controller Image Variant: The logic for selecting the appropriate controller image has been extended to include a 'cann' variant, ensuring that the correct, optimized image is utilized for Ascend NPU environments.
Dockerfile Updates: The llamacpp/native/cann.Dockerfile has been updated to leverage a newer CANN version (8.2.rc2-910b), switch to an ubuntu22.04 base image, and incorporate specific environment setup steps required for the Ascend toolkit during the build process.
Documentation Updates: All relevant CLI command documentation files (both YAML and Markdown formats) have been revised to clearly reflect the newly added 'cann' GPU support option, providing up-to-date user guidance.
Test Log Artifact: A test.log file was added, containing output from Go test runs. This appears to be an incidental artifact from development or testing.

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/release.yml

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

In tryGetBindAscendMounts, you’re calling fmt.Errorf but not using the returned error—switch to log.Printf or handle the error so the failure is actually reported.
Currently only libraries and version.info are mounted; you may also need to detect and bind relevant Ascend device nodes (e.g. /dev/ascend*, sockets, etc.) so the container can access the hardware.
In the Dockerfile’s build stage, make sure uname -m is defined before use and remove trailing whitespace on line-continuation backslashes to avoid unexpected shell parsing issues.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In tryGetBindAscendMounts, you’re calling fmt.Errorf but not using the returned error—switch to log.Printf or handle the error so the failure is actually reported.
- Currently only libraries and version.info are mounted; you may also need to detect and bind relevant Ascend device nodes (e.g. /dev/ascend*, sockets, etc.) so the container can access the hardware.
- In the Dockerfile’s build stage, make sure uname -m is defined before use and remove trailing whitespace on line-continuation backslashes to avoid unexpected shell parsing issues.

## Individual Comments

### Comment 1
<location> `llamacpp/native/cann.Dockerfile:43-51` </location>
<code_context>
+RUN --mount=type=cache,target=/root/.ccache \
+    cann_in_sys_path=/usr/local/Ascend/ascend-toolkit; \
+    cann_in_user_path=$HOME/Ascend/ascend-toolkit; \
+    if [ -f "${cann_in_sys_path}/set_env.sh" ]; then \
+        source ${cann_in_sys_path}/set_env.sh; \
+        export LD_LIBRARY_PATH=${cann_in_sys_path}/latest/lib64:${cann_in_sys_path}/latest/${uname_m}-linux/devlib:${LD_LIBRARY_PATH} ; \
+    elif [ -f "${cann_in_user_path}/set_env.sh" ]; then \
+        source "$HOME/Ascend/ascend-toolkit/set_env.sh"; \
+        export LD_LIBRARY_PATH=${cann_in_user_path}/latest/lib64:${cann_in_user_path}/latest/${uname_m}-linux/devlib:${LD_LIBRARY_PATH}; \ 
+    else \
+        echo "No Ascend Toolkit found"; \
+        exit 1; \
+    fi && \
+    cmake --build build --config Release && \
</code_context>

<issue_to_address>
**issue (bug_risk):** Potential issue with undefined variable 'uname_m'.

Please define 'uname_m' before using it in LD_LIBRARY_PATH to prevent runtime errors.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

llamacpp/native/cann.Dockerfile

gemini-code-assist

Code Review

This pull request adds support for Ascend NPU CANN backend. The changes are comprehensive, touching CLI commands, documentation, GPU detection, and container setup. Overall, the implementation is solid, but I've identified a few issues. There are inconsistencies in the updated documentation, a bug in the new Dockerfile that would cause the build to fail, an unhandled error, and a test log file that appears to have been committed by mistake. My review includes specific suggestions to address these points.

leo-pony · 2025-10-25T08:02:28Z

@ericcurtin I tested it on my local Ascend NPU host and everything seems to be working fine. Could you please help review the code?

ericcurtin

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

cmd/cli/pkg/standalone/containers.go

leo-pony · 2025-10-27T13:43:49Z

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Thank you for your review! However, as I need to attend meetings all day these days, I'll address the CI issue this Thursday or Friday.

leo-pony · 2025-10-30T08:53:16Z

There are conflicts and a build to fix.

But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin
Could you please help me re-trigger CI?

I rebaseed the latest code.
Fixed the issue where CI failed due to a failure to perform a consistency check using docsgen when 'make docs' after document modifications.

Others:
The reason why docker_model_version.yaml and cmd/cli/docs/reference/model.md were also modified is that docsgen optimized these two documents and generated the modified content.

ericcurtin · 2025-10-30T11:05:07Z

There are conflicts and a build to fix.
But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Could you please help me re-trigger CI?

I rebaseed the latest code.

Fixed the issue where CI failed due to a failure to perform a consistency check using docsgen when 'make docs' after document modifications.

Others: The reason why docker_model_version.yaml and cmd/cli/docs/reference/model.md were also modified is that docsgen optimized these two documents and generated the modified content.

It looks great, hopefully CI goes green

ericcurtin · 2025-10-30T12:41:30Z

@leo-pony there's a minor conflict (again 😢 ), but once we fix that, I think it's merge time

Signed-off-by: leo-pony <nengjunma@outlook.com>

leo-pony · 2025-10-30T13:00:39Z

There are conflicts and a build to fix.
But looks great to me! Worth comparing against the recent MUSA PR to see if we forgot anything

@ericcurtin Thank you for your review! However, as I need to attend meetings all day these days, I'll address the CI issue this Thursday or Friday.

@ericcurtin Thanks for you quickly review! Please help trigger the CI once more.
I have handled the code conflicts, and It seems online CI and my local docsgen running result is different, I reverted the modify of docker_model.yaml and model.md

sourcery-ai bot reviewed Oct 24, 2025

View reviewed changes

llamacpp/native/cann.Dockerfile Show resolved Hide resolved

leo-pony force-pushed the ascend_cann_support branch from d75798c to a997f19 Compare October 24, 2025 09:44

gemini-code-assist bot reviewed Oct 24, 2025

View reviewed changes

leo-pony force-pushed the ascend_cann_support branch 5 times, most recently from 9f029ad to a70a6c2 Compare October 25, 2025 07:58

ericcurtin approved these changes Oct 26, 2025

View reviewed changes

yeahdongcn reviewed Oct 27, 2025

View reviewed changes

cmd/cli/pkg/standalone/containers.go Show resolved Hide resolved

This was referenced Oct 27, 2025

fix(llamacpp): correct Docker ARG variable syntax in cann.Dockerfile #323

Draft

Add CANN support #295

Closed

leo-pony force-pushed the ascend_cann_support branch from a70a6c2 to 7cc66ca Compare October 30, 2025 03:07

ericcurtin approved these changes Oct 30, 2025

View reviewed changes

leo-pony force-pushed the ascend_cann_support branch from 60f0954 to 0a66d2f Compare October 30, 2025 12:36

leo-pony added 2 commits October 30, 2025 20:46

Add CANN backend support

15ef76f

Signed-off-by: leo-pony <nengjunma@outlook.com>

fix the CI doc diff error

b10f108

Signed-off-by: leo-pony <nengjunma@outlook.com>

leo-pony force-pushed the ascend_cann_support branch from 0a66d2f to b10f108 Compare October 30, 2025 12:57

ericcurtin merged commit a431268 into docker:main Oct 30, 2025
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Ascend NPU CANN backend support #307

Add Ascend NPU CANN backend support #307

Uh oh!

leo-pony commented Oct 24, 2025 •

edited

Loading

Uh oh!

gemini-code-assist bot commented Oct 24, 2025

Uh oh!

sourcery-ai bot left a comment

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

leo-pony commented Oct 25, 2025

Uh oh!

ericcurtin left a comment

Uh oh!

Uh oh!

leo-pony commented Oct 27, 2025

Uh oh!

leo-pony commented Oct 30, 2025 •

edited

Loading

Uh oh!

ericcurtin commented Oct 30, 2025

Uh oh!

ericcurtin commented Oct 30, 2025

Uh oh!

leo-pony commented Oct 30, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add Ascend NPU CANN backend support #307

Add Ascend NPU CANN backend support #307

Uh oh!

Conversation

leo-pony commented Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Oct 24, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

leo-pony commented Oct 25, 2025

Uh oh!

ericcurtin left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

leo-pony commented Oct 27, 2025

Uh oh!

leo-pony commented Oct 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericcurtin commented Oct 30, 2025

Uh oh!

ericcurtin commented Oct 30, 2025

Uh oh!

leo-pony commented Oct 30, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

leo-pony commented Oct 24, 2025 •

edited

Loading

leo-pony commented Oct 30, 2025 •

edited

Loading