Eliminate redundant NCHW↔NHWC permute_copy and NHWC-safe view_copy transposes in ToTosaMemoryFormatPass (#18314)
…ansposes in ToTosaMemoryFormatPass (#18314) Summary: Two optimizations in ToTosaMemoryFormatPass to reduce TOSA TRANSPOSE nodes: 1. **NHWC-safe reshape detection:** When a 4D→4D view_copy has monotonic shape_indices on the raw shapes and preserves both the batch dim (index 0) and the last dimension (NHWC channel) alone in their output groups, skip inserting input/output transposes. The view_copy can operate directly on NHWC data. 2. **Redundant permute_copy elimination:** Model-level permute_copy ops whose permutation matches channels_last_order (NCHW→NHWC) or its inverse (NHWC→NCHW) AND whose input already has NHWC tosa_dim_order are redundant with the tosa_dim_order annotation. Replace them with view_copy (identity reshape) to avoid generating TOSA TRANSPOSE nodes. Standalone permute models (NCHW input from placeholder) are not affected. Differential Revision: D97266678
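The redundancy condition from item 2 of the summary can be sketched in plain Python. This is a minimal illustration, not the code in this PR: `CHANNELS_LAST_ORDER`, `inverse_permutation`, and `is_redundant_permute` are hypothetical names, and representing tosa_dim_order as a tuple of ints is an assumption made for the example.

```python
from typing import Sequence, Tuple

# Assumption: the rank-4 NCHW -> NHWC permutation, written as a tuple.
CHANNELS_LAST_ORDER: Tuple[int, ...] = (0, 2, 3, 1)


def inverse_permutation(perm: Sequence[int]) -> Tuple[int, ...]:
    """Return the permutation that undoes `perm`."""
    inv = [0] * len(perm)
    for dst, src in enumerate(perm):
        inv[src] = dst
    return tuple(inv)


def is_redundant_permute(perm: Sequence[int], input_dim_order: Sequence[int]) -> bool:
    """Hypothetical check mirroring the summary's two conditions.

    The permute is treated as redundant only if
      1. its permutation is NCHW->NHWC or NHWC->NCHW, AND
      2. its input is already annotated with an NHWC dim order.
    """
    perm = tuple(perm)
    layout_perms = (CHANNELS_LAST_ORDER, inverse_permutation(CHANNELS_LAST_ORDER))
    input_is_nhwc = tuple(input_dim_order) == CHANNELS_LAST_ORDER
    return input_is_nhwc and perm in layout_perms
```

With these assumptions, a permute fed directly by an NCHW placeholder (dim order `(0, 1, 2, 3)`) fails the second condition and is left alone, which corresponds to the "standalone permute models are not affected" note.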
e505b3e to 7ebe074
Hi, thanks for the PR! This is a complex topic to get right in all cases, and FYI we are also planning to improve this internally, so it is very nice to get some help with that. I see there are some errors in our unit tests, so it looks like there are a few edge cases to iron out before a proper review. Let us know if you have any questions about the current logic to help with this. In the meantime, I have two comments:
hey @AdrianLundell thanks for the feedback
(also fixing failing unit tests)
7ebe074 to 1e0b323
1e0b323 to 92fa406
3be2196 to 8b742ba
4f89292 to ba24f69
ba24f69 to 7b1eff5
7b1eff5 to e02287c
7722299 to 8c9cd27
8c9cd27 to d37271b
38ab0f4 to 524f633
d9a8827 to c05fece
digantdesai left a comment:
Review automatically exported from Phabricator review in Meta.
checking Gemma3n tests...
c05fece to 0ea15d1
0ea15d1 to 288227c
288227c to 9c5dbf1
44f225f to fda5066
fda5066 to 5e1384b
5e1384b to 751766b
Summary:
Two optimizations in ToTosaMemoryFormatPass to reduce TOSA TRANSPOSE nodes:
1. **NHWC-safe reshape detection:** When a 4D→4D view_copy has monotonic
   shape_indices on the raw shapes and preserves both the batch dim (index 0)
   and the last dimension (NHWC channel) alone in their output groups, skip
   inserting input/output transposes. The view_copy can operate directly on
   NHWC data.
2. **Redundant permute_copy elimination:** Model-level permute_copy ops whose
   permutation matches channels_last_order (NCHW→NHWC) or its inverse
   (NHWC→NCHW) AND whose input already has NHWC tosa_dim_order are redundant
   with the tosa_dim_order annotation. Replace them with view_copy (identity
   reshape) to avoid generating TOSA TRANSPOSE nodes. Standalone permute
   models (NCHW input from placeholder) are not affected.
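
As a way to sanity-check what "NHWC-safe" means in item 1, the sketch below is a brute-force reference and not the shape_indices-based check this PR implements. It assumes a rank-4 tensor whose logical shape is NCHW but whose data is stored in NHWC order, and it calls a reshape safe when the stored buffer is already in the order the reshaped NHWC output expects, so no transposes are needed around it. The names `nhwc_buffer` and `reshape_is_nhwc_safe` are hypothetical.

```python
from itertools import product
from math import prod
from typing import List, Sequence


def nhwc_buffer(shape: Sequence[int]) -> List[int]:
    """List each element's logical NCHW linear index in NHWC storage order."""
    n, c, h, w = shape
    # product() varies its last range fastest, so the channel index runs
    # fastest here, which matches row-major NHWC storage.
    return [
        ((i_n * c + i_c) * h + i_h) * w + i_w
        for i_n, i_h, i_w, i_c in product(range(n), range(h), range(w), range(c))
    ]


def reshape_is_nhwc_safe(in_shape: Sequence[int], out_shape: Sequence[int]) -> bool:
    """Brute-force reference: True if a 4D->4D reshape needs no transposes.

    A reshape keeps every element's logical linear index. If the NHWC buffer
    of the input already lists those indices in the order required by the
    NHWC buffer of the output, the reshape can act on the NHWC data directly.
    """
    if len(in_shape) != 4 or len(out_shape) != 4:
        return False
    if prod(in_shape) != prod(out_shape):
        return False
    return nhwc_buffer(in_shape) == nhwc_buffer(out_shape)
```

For example, `reshape_is_nhwc_safe((1, 4, 2, 3), (1, 4, 3, 2))` holds because batch and channel are untouched and only H and W are regrouped, while `(1, 4, 2, 3)` to `(1, 2, 4, 3)` is rejected because the channel dimension changes. A reference like this is only practical for small shapes, but it can be used to cross-check a faster structural test.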
bypass-github-export-checks
bypass-github-pytorch-ci-checks
bypass-github-executorch-ci-checks
Reviewed By: digantdesai
Differential Revision: D97266678