-
Notifications
You must be signed in to change notification settings - Fork 29
Use experimental complex extension for all complex arithmetic #2069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
cb75d56
to
2dd0a72
Compare
View rendered docs @ https://intelpython.github.io/dpctl/pulls/2069/index.html |
Array API standard conformance tests for dpctl=0.20.0dev0=py310h93fe807_183 ran successfully. |
converts to experimental sycl complex values, then performs math operations
* Move sycl_complex.hpp to utils * No longer use exprm_ns defined by header, define on per-file basis * Include alias to type sycl_complex_t<T> under sycl_utils namespace * Use identical include macro where inclusion of sycl_complex would be impossible
2dd0a72
to
52bb73e
Compare
Array API standard conformance tests for dpctl=0.20.0dev0=py310h93fe807_183 ran successfully. |
#ifndef SYCL_EXT_ONEAPI_COMPLEX | ||
#define SYCL_EXT_ONEAPI_COMPLEX 1 | ||
#endif | ||
#if __has_include(<sycl/ext/oneapi/experimental/sycl_complex.hpp>) | ||
#include <sycl/ext/oneapi/experimental/sycl_complex.hpp> | ||
#else | ||
#include <sycl/ext/oneapi/experimental/complex/complex.hpp> | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to include #include "sycl_complex.hpp"
here instead or it was intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. It was originally intentional, because instead of sycl_complex.hpp
, I had made the include part of sycl_utils.hpp
, but decided it was too complicated. I'll change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't think sycl_utils.hpp
is actually using sycl_complex.hpp
anymore, so it should be removed there.
@@ -133,6 +161,20 @@ template <typename T> T logaddexp(T x, T y) | |||
} | |||
} | |||
|
|||
template <typename T> T plus_complex(const T &x1, const T &x2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need here something like below?
template <typename T> T plus_complex(const T &x1, const T &x2) | |
template <typename T, typename = std::enable_if_t<is_complex_v<T>> T plus_complex(const T &x1, const T &x2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems applicable to many declaration here expecting complex template type only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be a good idea, good suggestion
{ | ||
T operator()(const T &x, const T &y) const | ||
{ | ||
if constexpr (detail::IsComplex<T>::value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reuse is_complex_v
here:
if constexpr (detail::IsComplex<T>::value) { | |
if constexpr (dpctl::tensor::type_utils::is_complex_v<T>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_utils.hpp
isn't included in this header. Maybe it could be though.
@@ -133,6 +161,20 @@ template <typename T> T logaddexp(T x, T y) | |||
} | |||
} | |||
|
|||
template <typename T> T plus_complex(const T &x1, const T &x2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to use inline
here explicitly?
As I can see compiler is actively uses inline __attribute__((__visibility__("hidden"), __always_inline__))
in $CONDA_PREFIX/include/sycl/ext/oneapi/experimental/complex/detail/complex.hpp
header and I wonder if we should do the same or similar. Or it will be well optimized and inlined implicitly anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I'm not sure it would do much good: function is only really used in sycl_utils.hpp
for Plus
struct, which is just our own implementation of sycl::plus
.
It's arguable whether this function is needed at all. It only really avoids (directly) including sycl_complex.hpp
in sycl_utils.hpp
.
using sycl_complexT = exprm_ns::complex<realT>; | ||
sycl_complexT z1 = sycl_complexT(x1); | ||
sycl_complexT z2 = sycl_complexT(x2); | ||
realT real1 = exprm_ns::real(z1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general it can be simplified by:
realT real1 = exprm_ns::real(z1); | |
realT real1 = z1.real(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably it is not a bad idea to implement the wrappers for real
and imag
functions also, since anyway it will be declared with theconstexpr
:
template<typename Tp>
constexpr Tp real_complex(const std::complex<Tp> &z) {
return sycl_complex_t<Tp>(z).real();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can stop using function calls to real
and imag
and just use the methods everywhere, which I think is pretty sensible. The methods also don't seem to trip up HIP compiler.
Then again, the std::real
and std::imag
functions actually didn't either, I just changed them for consistency. I vote to just use methods everywhere, they look cleaner anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see your point though, the wrapper would be good anywhere we don't want actually use the result of sycl_complex_t<Tp>
namespace sycl_utils | ||
{ | ||
|
||
template <typename T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
template <typename T> | |
template <typename T, typename std::enable_if_t<std::is_floating_point_v<T>> |
{ | ||
|
||
template <typename T> | ||
using sycl_complex_t = sycl::ext::oneapi::experimental::complex<T>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like we have to use sycl_complex_t
type everywhere to have it working with HIP backend.
In that case would it be worse idea to cast usm_ndarray
to sycl_complex_t
type from the beginning (or at least in case of building with HIP support enabled)? It might be that sycl_complex_t = std::complex<T>
by default.
It would impact all type matrix where we need to replace std::complex
with sycl_complex_t
and probably to update some helper functions and aliases. But that would help us avoid casting to sycl_complex_t
every time we need to do some math operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this, but the flaw is especially with MKL and oneMath
(for dpnp): they expect std::complex
inputs. oneMath
actually casts std::complex
inputs to the SYCL complex type internally, and never expects the SYCL type in its signatures.
this would mean that in calls to MKL/oneMath, dpnp would have to copy into a fresh allocation of type std::complex
, and then if using AMD/CUDA backend, it would have to be copied again by oneMath.
With current nightly compiler, dpctl will fail to link when built for HIP backend due to undefined symbols (i.e.,
__muldc3
).Since compiler does not support arithmetic with
std::complex
on HIP backend, this PR takes the steps to refactor complex arithmetic throughout tensor librarylibtensor/include/kernels/elementwise_functions/sycl_complex.hpp
is now moved tolibtensor/include/utils/sycl_complex.hpp
and refactored to definedsycl_complex_t<T>
, aliasing the complex type defined in the extensionsycl_complex.hpp
still indirectly setsSYCL_EXT_ONEAPI_COMPLEX
and includes the header for experimental extension, but no longer definesexprm_ns
namespace alias. This is now left to individual files.std::real
,std::imag
, etc. have been removedsycl_utils.hpp
now defines new custom functors a laMaximum
andMinimum
,Plus
andMultiplies
. These structs are used by accumulation operations (sum
,cumulative_sum
, GEMM, etc.). They perform casting ofstd::complex
inputs to SYCL equivalent, perform operations, and then return asstd::complex