
Add DecomposeRnnPass for ARM backend #17139

Open
apullin wants to merge 2 commits into pytorch:main from apullin:export-D92059152

Conversation

@apullin
Contributor

@apullin apullin commented Feb 3, 2026

Summary:

Adds a decomposition pass that transforms aten.rnn_tanh.input and
aten.rnn_relu.input into elementary ops supported by TOSA.

RNN cell equation per timestep:
h_t = activation(x_t @ W_ih.T + b_ih + h_{t-1} @ W_hh.T + b_hh)

where activation is tanh (rnn_tanh) or relu (rnn_relu).

Features:

  • Multi-layer RNN support
  • Bidirectional RNN support
  • With/without bias
  • batch_first support
  • Both tanh and relu nonlinearities

Differential Revision: D92059152
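As a reading aid, the per-timestep recurrence above can be sketched in plain NumPy. This is an illustrative sketch only: the function names (`rnn_cell`, `rnn_layer`) are hypothetical, and the actual pass emits the equivalent elementary ops (mm, add, tanh/relu) into the graph rather than calling helpers like these.

```python
import numpy as np

def rnn_cell(x_t, h_prev, W_ih, W_hh, b_ih=None, b_hh=None, act=np.tanh):
    """One timestep: h_t = act(x_t @ W_ih.T + b_ih + h_prev @ W_hh.T + b_hh)."""
    pre = x_t @ W_ih.T + h_prev @ W_hh.T
    if b_ih is not None:
        pre = pre + b_ih
    if b_hh is not None:
        pre = pre + b_hh
    return act(pre)

def rnn_layer(x, h0, W_ih, W_hh, b_ih=None, b_hh=None, act=np.tanh):
    """Unroll one layer over x of shape (seq_len, batch, input_size),
    mirroring how the pass expands the RNN op into per-timestep ops."""
    h, outs = h0, []
    for t in range(x.shape[0]):
        h = rnn_cell(x[t], h, W_ih, W_hh, b_ih, b_hh, act)
        outs.append(h)
    return np.stack(outs), h  # (seq_len, batch, hidden), final hidden state
```

For the rnn_relu variant, `act` would be a ReLU such as `lambda v: np.maximum(v, 0.0)` instead of `np.tanh`.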

@apullin apullin requested a review from digantdesai as a code owner February 3, 2026 07:33
@pytorch-bot

pytorch-bot bot commented Feb 3, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17139

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit 973c777 with merge base 7c79395:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Feb 3, 2026
@github-actions

github-actions bot commented Feb 3, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
@apullin apullin force-pushed the export-D92059152 branch 2 times, most recently from 738855e to 8003c8a Compare February 3, 2026 23:22
apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
@apullin apullin force-pushed the export-D92059152 branch 2 times, most recently from 8003c8a to 466c9ab Compare February 3, 2026 23:33
@meta-codesync
Contributor

meta-codesync bot commented Feb 3, 2026

@apullin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D92059152.

apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Feb 3, 2026
pytorch-bot bot pushed a commit that referenced this pull request Feb 6, 2026
@zingo added the "partner: arm" label (for backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm) Feb 6, 2026
@zingo zingo changed the title Add DecomposeRnnPass for ARM backend Arm backend: Add DecomposeRnnPass Feb 6, 2026
@apullin
Contributor Author

apullin commented Feb 6, 2026

@pytorchbot label "release notes: feature"

apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
@apullin apullin force-pushed the export-D92059152 branch 2 times, most recently from 8356482 to 991e144 Compare March 24, 2026 17:59
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
@apullin apullin force-pushed the export-D92059152 branch 2 times, most recently from 3591ca5 to 2ba998d Compare March 24, 2026 18:08
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 24, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 25, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 25, 2026
@apullin apullin force-pushed the export-D92059152 branch 2 times, most recently from ae79c65 to dc4781b Compare March 25, 2026 16:04
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 25, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 25, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 25, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 26, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 26, 2026
apullin pushed a commit to apullin/executorch that referenced this pull request Mar 26, 2026
Andrew Pullin and others added 2 commits March 27, 2026 08:50
Summary:
Adds a decomposition pass that transforms aten.gru.input into elementary
ops supported by TOSA (matmul, sigmoid, tanh, mul, add, slice, cat).

GRU cell equations per timestep:
    r_t = sigmoid(x_t @ W_ir.T + b_ir + h_{t-1} @ W_hr.T + b_hr)
    z_t = sigmoid(x_t @ W_iz.T + b_iz + h_{t-1} @ W_hz.T + b_hz)
    n_t = tanh(x_t @ W_in.T + b_in + r_t * (h_{t-1} @ W_hn.T + b_hn))
    h_t = n_t + z_t * (h_{t-1} - n_t)

Features:
- Multi-layer GRU support
- Bidirectional GRU support
- With/without bias
- batch_first support
- Batched gate computation (2 mm ops per timestep instead of 6)

Differential Revision: D92058313
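The gate equations and the batched-gate trick (2 mm ops per timestep instead of 6) can be sketched in NumPy, assuming PyTorch's stacked weight layout where W_ih is [W_ir; W_iz; W_in] and W_hh is [W_hr; W_hz; W_hn]. The function name `gru_cell` is illustrative; the pass itself emits the equivalent mm/sigmoid/tanh/mul/add/slice ops into the graph.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_cell(x_t, h_prev, W_ih, W_hh, b_ih, b_hh):
    """One timestep. W_ih: (3H, I) stacking [W_ir; W_iz; W_in];
    W_hh: (3H, H) stacking [W_hr; W_hz; W_hn]."""
    H = h_prev.shape[-1]
    gi = x_t @ W_ih.T + b_ih      # one mm covers the r, z, n input terms
    gh = h_prev @ W_hh.T + b_hh   # one mm covers the r, z, n hidden terms
    i_r, i_z, i_n = gi[:, :H], gi[:, H:2 * H], gi[:, 2 * H:]
    h_r, h_z, h_n = gh[:, :H], gh[:, H:2 * H], gh[:, 2 * H:]
    r = sigmoid(i_r + h_r)        # reset gate
    z = sigmoid(i_z + h_z)        # update gate
    n = np.tanh(i_n + r * h_n)    # candidate state
    return n + z * (h_prev - n)   # h_t
```

Note that only the hidden-side candidate term `h_n` is gated by `r` before the tanh, which is why the batched `gh` matmul must be sliced before the gates are combined.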
Summary:
Pull Request resolved: pytorch#17139

Differential Revision: D92059152

Labels

- CLA Signed (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
- fb-exported
- meta-exported
- partner: arm (for backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm)


3 participants