@ndgrigorian (Collaborator) commented Aug 27, 2024

This PR adds dedicated kernels to the pow dtype matrix for some cases of a floating-point base and integer exponent.

This greatly improves performance for those cases by removing the need to copy and up-cast both arrays.
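As a rough illustration of why an integer exponent allows a dedicated code path, the sketch below raises a float to an integer power by repeated squaring, so the exponent is never converted to a floating-point value. This is not the kernel from this PR (a later comment indicates the PR relied on `sycl::pown` overloads); the function name `pow_int` and the pure-Python scalar form are assumptions made for illustration only.

```python
def pow_int(base: float, exponent: int) -> float:
    """Raise a float to an integer power by repeated squaring.

    Hypothetical helper for illustration; it is not part of dpctl.
    """
    if exponent < 0:
        # x**(-n) == 1 / x**n. A zero base raises ZeroDivisionError here,
        # whereas a real kernel would follow IEEE rules instead.
        return 1.0 / pow_int(base, -exponent)
    result = 1.0
    while exponent > 0:
        if exponent & 1:
            # The current bit of the exponent is set: fold in this power.
            result *= base
        base *= base      # base, base**2, base**4, ...
        exponent >>= 1    # move on to the next bit of the exponent
    return result
```

The exponent is only inspected bit by bit, so the computation stays in the base's floating-point type throughout.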

Also refactors in-place binary operation type support: rather than relying on the out-of-place type support matrix and table, binary in-place operations now have their own dedicated tables. This means that in-place operations can more easily support type combinations which the out-of-place functions don't.
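The idea of separate tables can be sketched in a few lines. The dtype names, table contents, and function names below are hypothetical and do not reflect dpctl's actual C++ tables; the point is only that an in-place lookup with its own table can accept a combination that the out-of-place table does not list.

```python
# Hypothetical out-of-place table: (lhs dtype, rhs dtype) -> result dtype.
OUT_OF_PLACE_POW = {
    ("float32", "float32"): "float32",
    ("float64", "float64"): "float64",
}

# Hypothetical in-place table: the result dtype is always the lhs dtype,
# so only the set of accepted (lhs, rhs) pairs needs to be stored.
IN_PLACE_POW = {
    ("float32", "float32"),
    ("float64", "float64"),
    ("float32", "int32"),  # combination listed only in the in-place table
    ("float64", "int32"),
}


def out_of_place_result(lhs: str, rhs: str):
    """Return the result dtype, or None if there is no dedicated kernel."""
    return OUT_OF_PLACE_POW.get((lhs, rhs))


def in_place_supported(lhs: str, rhs: str) -> bool:
    """Return True if `lhs **= rhs` has a dedicated in-place kernel."""
    return (lhs, rhs) in IN_PLACE_POW
```

With a single shared table, supporting `("float32", "int32")` in place would require adding it to the out-of-place table as well; with two tables the in-place set can grow independently.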

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

Improves performance for specific edge cases where the base array is of a floating-point data type and the exponent is a 32-bit integer
This makes the code easier to understand
Improves readability of in-place code

github-actions bot commented Aug 27, 2024

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@github-actions

Array API standard conformance tests for dpctl=0.18.0dev0=py310hdf72452_375 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@ndgrigorian
Collaborator Author

Superseded by #1815, which contains the same commits refactoring the in-place type support tables but excludes the sycl::pown overloads because of precision issues on AMD CPUs revealed in the CI.

@ndgrigorian ndgrigorian deleted the improve-inplace-binary-ops-new-pow-type-support branch September 10, 2024 02:52