[ET Device Support] MethodMeta: expose per-buffer device placement API#18465
[ET Device Support] MethodMeta: expose per-buffer device placement API#18465Gasoonjia wants to merge 1 commit intogh/gasoonjia/148/basefrom
Conversation
Add memory_planned_buffer_device(index) to MethodMeta, returning the
Device (type + index) for each planned memory buffer. This reads from
the non_const_buffer_device field in the serialized ExecutionPlan.
For CPU-only programs (or legacy PTE files without non_const_buffer_device),
all buffers default to Device{CPU, 0}. The sparse list only stores entries
for non-CPU buffers, so the lookup scans for a matching buffer_idx.
This API enables Module::load_method() to query each buffer's target device
and allocate accordingly (malloc for CPU, DeviceAllocator for CUDA, etc.).
Differential Revision: [D97850708](https://our.internmc.facebook.com/intern/diff/D97850708/)
[ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18465
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 Awaiting Approval, 112 New Failures, 1 Unrelated FailureAs of commit 31171d0 with merge base 45a9717 ( AWAITING APPROVAL - The following workflows need approval before CI can run:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
Stack from ghstack (oldest at bottom):
Add memory_planned_buffer_device(index) to MethodMeta, returning the
Device (type + index) for each planned memory buffer. This reads from
the non_const_buffer_device field in the serialized ExecutionPlan.
For CPU-only programs (or legacy PTE files without non_const_buffer_device),
all buffers default to Device{CPU, 0}. The sparse list only stores entries
for non-CPU buffers, so the lookup scans for a matching buffer_idx.
This API enables Module::load_method() to query each buffer's target device
and allocate accordingly (malloc for CPU, DeviceAllocator for CUDA, etc.).
Differential Revision: D97850708