Use experimental complex extension for all complex arithmetic #2069

ndgrigorian · 2025-05-01T01:14:21Z

With the current nightly compiler, dpctl fails to link when built for the HIP backend due to undefined symbols (i.e., __muldc3).

Since the compiler does not support arithmetic with std::complex on the HIP backend, this PR refactors complex arithmetic throughout the tensor library:

libtensor/include/kernels/elementwise_functions/sycl_complex.hpp is moved to libtensor/include/utils/sycl_complex.hpp and refactored to define sycl_complex_t<T>, an alias for the complex type defined in the extension.
sycl_complex.hpp still indirectly sets SYCL_EXT_ONEAPI_COMPLEX and includes the header for the experimental extension, but no longer defines the exprm_ns namespace alias. That is now left to individual files.
All uses of std::real, std::imag, etc. have been removed.
sycl_utils.hpp now defines new custom functors, Plus and Multiplies, a la Maximum and Minimum. These structs are used by accumulation operations (sum, cumulative_sum, GEMM, etc.). They cast std::complex inputs to the SYCL equivalent, perform the operation, and return the result as std::complex.
Several element-wise functions are updated to properly perform operations in the SYCL complex type.
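
The cast, operate, return pattern described for the new functors can be sketched roughly as follows. This is only an illustration, not the code in this PR: `device_complex` is a hypothetical stand-in for `sycl::ext::oneapi::experimental::complex<T>` so that the snippet builds without a SYCL toolchain, and `MultipliesSketch` is an illustrative name.

```cpp
#include <complex>

// Hypothetical stand-in for sycl::ext::oneapi::experimental::complex<T>,
// used only so this sketch builds without a SYCL compiler.
template <typename T> struct device_complex
{
    T re;
    T im;

    explicit device_complex(const std::complex<T> &z)
        : re(z.real()), im(z.imag())
    {
    }
    device_complex(T r, T i) : re(r), im(i) {}

    // Plain component-wise product written out explicitly.
    friend device_complex operator*(const device_complex &a,
                                    const device_complex &b)
    {
        return device_complex(a.re * b.re - a.im * b.im,
                              a.re * b.im + a.im * b.re);
    }

    explicit operator std::complex<T>() const
    {
        return std::complex<T>(re, im);
    }
};

// Illustrative functor: accepts and returns std::complex, but performs the
// arithmetic in the stand-in device type.
template <typename T> struct MultipliesSketch
{
    std::complex<T> operator()(const std::complex<T> &x,
                               const std::complex<T> &y) const
    {
        const device_complex<T> z1(x);
        const device_complex<T> z2(y);
        return static_cast<std::complex<T>>(z1 * z2);
    }
};
```

Callers keep working with std::complex on both sides of the call, so storage types and external library signatures are unaffected by the conversion.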

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
Have you added documentation for your changes, if necessary?
Have you added your changes to the changelog?
If this PR is a work in progress, are you opening the PR as a draft?

github-actions · 2025-05-01T01:50:39Z

View rendered docs @ https://intelpython.github.io/dpctl/pulls/2069/index.html

github-actions · 2025-05-01T02:20:26Z

Array API standard conformance tests for dpctl=0.20.0dev0=py310h93fe807_183 ran successfully.
Passed: 1108
Failed: 4
Skipped: 119

* Move sycl_complex.hpp to utils
* No longer use exprm_ns defined by header, define on per-file basis
* Include alias to type sycl_complex_t<T> under sycl_utils namespace
* Use identical include macro where inclusion of sycl_complex would be impossible

github-actions · 2025-05-01T03:22:55Z

Array API standard conformance tests for dpctl=0.20.0dev0=py310h93fe807_183 ran successfully.
Passed: 1109
Failed: 3
Skipped: 119

coveralls · 2025-05-01T03:23:19Z

coverage: 86.419%. remained the same
when pulling 52bb73e on use-sycl-complex-conversions
into 29eeac7 on master.

antonwolfy · 2025-05-02T09:20:45Z

dpctl/tensor/libtensor/include/utils/math_utils.hpp

+#ifndef SYCL_EXT_ONEAPI_COMPLEX
+#define SYCL_EXT_ONEAPI_COMPLEX 1
+#endif
+#if __has_include(<sycl/ext/oneapi/experimental/sycl_complex.hpp>)
+#include <sycl/ext/oneapi/experimental/sycl_complex.hpp>
+#else
+#include <sycl/ext/oneapi/experimental/complex/complex.hpp>
+#endif


Should we use #include "sycl_complex.hpp" here instead, or was this intentional?

Good point. It was originally intentional, because instead of sycl_complex.hpp, I had made the include part of sycl_utils.hpp, but decided it was too complicated. I'll change it.

Also, I don't think sycl_utils.hpp is actually using sycl_complex.hpp anymore, so it should be removed there.

antonwolfy · 2025-05-02T09:45:40Z

dpctl/tensor/libtensor/include/utils/math_utils.hpp

@@ -133,6 +161,20 @@ template <typename T> T logaddexp(T x, T y)
    }
 }

+template <typename T> T plus_complex(const T &x1, const T &x2)


Do we need here something like below?

Suggested change

template <typename T> T plus_complex(const T &x1, const T &x2)

template <typename T, typename = std::enable_if_t<is_complex_v<T>>> T plus_complex(const T &x1, const T &x2)

This seems applicable to many declarations here that expect only a complex template type.

this would be a good idea, good suggestion
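
A constraint of the kind suggested in this thread could look like the sketch below. `is_complex_v` is defined locally as a stand-in for the trait in dpctl (its exact definition is not shown here and is assumed), and the body simply adds std::complex values instead of going through the SYCL complex type. `can_plus_complex` is a hypothetical helper that only demonstrates the effect of the constraint.

```cpp
#include <complex>
#include <type_traits>
#include <utility>

// Local stand-in for the is_complex_v trait (assumed shape).
template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};
template <typename T> inline constexpr bool is_complex_v = is_complex<T>::value;

// Participates in overload resolution only when T is std::complex<...>.
template <typename T, typename = std::enable_if_t<is_complex_v<T>>>
T plus_complex(const T &x1, const T &x2)
{
    return x1 + x2;
}

// Detection helper: true when plus_complex(T, T) is a valid call.
template <typename T, typename = void>
struct can_plus_complex : std::false_type {};
template <typename T>
struct can_plus_complex<
    T, std::void_t<decltype(plus_complex(std::declval<const T &>(),
                                         std::declval<const T &>()))>>
    : std::true_type {};

static_assert(can_plus_complex<std::complex<float>>::value);
static_assert(!can_plus_complex<double>::value);
```

With the constraint in place, passing a non-complex type is rejected at overload resolution rather than failing somewhere inside the function body.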

antonwolfy · 2025-05-02T09:50:17Z

dpctl/tensor/libtensor/include/utils/sycl_utils.hpp

+{
+    T operator()(const T &x, const T &y) const
+    {
+        if constexpr (detail::IsComplex<T>::value) {


We can reuse is_complex_v here:

Suggested change

if constexpr (detail::IsComplex<T>::value) {

if constexpr (dpctl::tensor::type_utils::is_complex_v<T>) {

type_utils.hpp isn't included in this header. Maybe it could be though.
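
For reference, a functor that branches on a variable-template trait with `if constexpr` might be shaped like this. It is a sketch under assumptions: the trait is a local stand-in for `dpctl::tensor::type_utils::is_complex_v`, `PlusSketch` is a hypothetical name, and the complex branch adds components directly where the real code would convert to the SYCL complex type.

```cpp
#include <complex>
#include <type_traits>

// Local stand-in for dpctl::tensor::type_utils::is_complex_v (assumed shape).
template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};
template <typename T> inline constexpr bool is_complex_v = is_complex<T>::value;

// Hypothetical functor: one operator() for real and complex element types.
template <typename T> struct PlusSketch
{
    T operator()(const T &x, const T &y) const
    {
        if constexpr (is_complex_v<T>) {
            // The real code would convert to the SYCL complex type here;
            // this sketch adds the components directly.
            return T(x.real() + y.real(), x.imag() + y.imag());
        }
        else {
            return x + y;
        }
    }
};
```

Only the selected branch is instantiated, so `x.real()` is never compiled for non-complex `T`.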

antonwolfy · 2025-05-02T09:57:05Z

dpctl/tensor/libtensor/include/utils/math_utils.hpp

@@ -133,6 +161,20 @@ template <typename T> T logaddexp(T x, T y)
    }
 }

+template <typename T> T plus_complex(const T &x1, const T &x2)


Do we need to use inline here explicitly?
As far as I can see, the compiler's $CONDA_PREFIX/include/sycl/ext/oneapi/experimental/complex/detail/complex.hpp header actively uses inline __attribute__((__visibility__("hidden"), __always_inline__)), and I wonder if we should do the same or something similar. Or will it be well optimized and inlined implicitly anyway?

We could, but I'm not sure it would do much good: the function is only really used in sycl_utils.hpp for the Plus struct, which is just our own implementation of sycl::plus.

It's arguable whether this function is needed at all. It only really avoids (directly) including sycl_complex.hpp in sycl_utils.hpp.

antonwolfy · 2025-05-02T10:22:54Z

dpctl/tensor/libtensor/include/utils/math_utils.hpp

+    using sycl_complexT = exprm_ns::complex<realT>;
+    sycl_complexT z1 = sycl_complexT(x1);
+    sycl_complexT z2 = sycl_complexT(x2);
+    realT real1 = exprm_ns::real(z1);


In general it can be simplified by:

Suggested change

realT real1 = exprm_ns::real(z1);

realT real1 = z1.real();

It is probably not a bad idea to implement wrappers for the real and imag functions as well, since they would be declared constexpr anyway:

template<typename Tp> constexpr Tp real_complex(const std::complex<Tp> &z) { return sycl_complex_t<Tp>(z).real(); }

Alternatively, we can stop using function calls to real and imag and just use the methods everywhere, which I think is pretty sensible. The methods also don't seem to trip up HIP compiler.

Then again, the std::real and std::imag functions actually didn't either, I just changed them for consistency. I vote to just use methods everywhere, they look cleaner anyway.

I do see your point though, the wrapper would be good anywhere we don't actually want to use the result of sycl_complex_t<Tp>
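
The wrapper idea from this thread can be sketched without a SYCL toolchain as below. Note the assumption: the proposed version routes through `sycl_complex_t<Tp>`, whereas this sketch calls the std::complex member functions directly; `imag_complex` is a hypothetical companion to the `real_complex` wrapper mentioned above.

```cpp
#include <complex>

// Illustrative wrappers; the proposal above would route through
// sycl_complex_t<Tp>, which this sketch replaces with std::complex members.
template <typename Tp> constexpr Tp real_complex(const std::complex<Tp> &z)
{
    return z.real();
}

template <typename Tp> constexpr Tp imag_complex(const std::complex<Tp> &z)
{
    return z.imag();
}

// Both wrappers are usable in constant expressions.
static_assert(real_complex(std::complex<double>(1.5, -2.0)) == 1.5);
static_assert(imag_complex(std::complex<double>(1.5, -2.0)) == -2.0);
```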

antonwolfy · 2025-05-02T10:32:02Z

dpctl/tensor/libtensor/include/utils/sycl_complex.hpp

+namespace sycl_utils
+{
+
+template <typename T>


Suggested change

template <typename T>

template <typename T, typename = std::enable_if_t<std::is_floating_point_v<T>>>
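
A constrained alias template along these lines could be sketched as follows. The names are hypothetical, and std::complex stands in for the SYCL complex type so the snippet compiles without a SYCL toolchain; `has_complex_sketch` exists only to show that non-floating-point arguments are rejected.

```cpp
#include <complex>
#include <type_traits>

// Stand-in alias: std::complex takes the place of the SYCL complex type.
template <typename T,
          typename = std::enable_if_t<std::is_floating_point_v<T>>>
using complex_sketch_t = std::complex<T>;

// Detection helper: true when complex_sketch_t<T> names a valid type.
template <typename T, typename = void>
struct has_complex_sketch : std::false_type {};
template <typename T>
struct has_complex_sketch<T, std::void_t<complex_sketch_t<T>>>
    : std::true_type {};

static_assert(has_complex_sketch<float>::value);
static_assert(!has_complex_sketch<int>::value);
```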

antonwolfy · 2025-05-02T11:18:52Z

dpctl/tensor/libtensor/include/utils/sycl_complex.hpp

+{
+
+template <typename T>
+using sycl_complex_t = sycl::ext::oneapi::experimental::complex<T>;


It sounds like we have to use the sycl_complex_t type everywhere to have it working with the HIP backend.
In that case, would it be a worse idea to cast usm_ndarray to the sycl_complex_t type from the beginning (or at least when building with HIP support enabled)? It might be that sycl_complex_t = std::complex<T> by default.

It would affect the whole type matrix, where we would need to replace std::complex with sycl_complex_t and probably update some helper functions and aliases. But that would help us avoid casting to sycl_complex_t every time we need to do some math operation.

I considered this, but the flaw is especially with MKL and oneMath (for dpnp): they expect std::complex inputs. oneMath actually casts std::complex inputs to the SYCL complex type internally, and never expects the SYCL type in its signatures.

(see here: https://github.com/uxlfoundation/oneMath/blob/4ad4dfb5db834117248ad5f8fbded5cfc1097005/src/blas/backends/generic/generic_level3.cxx#L35)

this would mean that in calls to MKL/oneMath, dpnp would have to copy into a fresh allocation of type std::complex, and then if using AMD/CUDA backend, it would have to be copied again by oneMath.

ndgrigorian force-pushed the use-sycl-complex-conversions branch 2 times, most recently from cb75d56 to 2dd0a72 Compare May 1, 2025 01:36

ndgrigorian added 9 commits April 30, 2025 19:38

Use sycl complex extension throughout element-wise and utils

ae75273

Update binary functions multiply and subtract to use experimental SYC…

88e64c1

…L complex type

Use experimental SYCL complex in dot product

01f22d3

Use experimental namespace in sequential dot product

34168bd

Use experimental complex namespace in gemm

0fd49ae

Use specialized functor for multiplying or adding complex inputs

e3be74e

converts to experimental sycl complex values, then performs math operations

Use custom plus operator in gemm and dot product tree reduction kernels

6fba5f2

Refactor operators when dispatching to tree reductions

da15d0e

ndgrigorian force-pushed the use-sycl-complex-conversions branch from 2dd0a72 to 52bb73e Compare May 1, 2025 02:38

antonwolfy reviewed May 2, 2025
