-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Half matrix and components #1708
Conversation
8f3a17d
to
b7d4a15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good. There are still some places where the new half-enabled types are missing.
using ComplexAndPODTypes = merge_type_list_t<ComplexValueTypes, PODTypes>; | ||
|
||
using ComplexAndPODTypesWithHalf = | ||
merge_type_list_t<ComplexValueTypesWithHalf, PODTypes>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still instances, where the type list without half is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not get it. Could you elaborate it more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you search for ComplexValueTypes
you fill find it still in use in places where the version with half should be used instead, e.g. the reference array tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PermuteIterator
test in core/test/base/iterator_factory.cpp
still uses ComplexAndPODTypes
without half. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good Catch. I also change all ComplexAndPODTypes with half now.
omp/components/atomic.hpp
Outdated
static_assert(sizeof(ValueType) == sizeof(ResultType), | ||
"The type to reinterpret to must be of the same size as the " | ||
"original type."); | ||
return reinterpret_cast<ResultType&>(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just use memcpy
here directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also prefer if you used memcpy
instead (because it is defined behavior):
return reinterpret_cast<ResultType&>(val); | |
ResultType res; | |
using std::memcpy; | |
memcpy(&res, &val, sizeof(ValueType)); | |
return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still think this should be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change it but I do not think it help anything but we still need the same undefined behavior outside to make it work.
ee4be45
to
3037d52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part 1 / 2 of my review. So far, I only have small comments.
option(GINKGO_ENABLE_HALF "Enable the use of half precision" ON) | ||
# We do not support MSVC. SYCL will come later | ||
if(MSVC OR GINKGO_BUILD_SYCL) | ||
message(STATUS "HALF is not supported in MSVC, and later support in SYCL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be rephrased since I really don't know what you mean by "and later support in SYCL".
Do you mean that SYCL does support half-precision in a later version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we will enable the support from #1710
As the half is trivial copy again now, we might not need the device_type mapping though.
common/cuda_hip/base/math.hpp
Outdated
// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host | ||
// compiler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get the meaning, maybe:
// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host | |
// compiler. | |
// It is required by NVHPC 23.3, `isnan` is undefined when NVHPC is only used as a host | |
// compiler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly,
I think cuda will go through the code twice. one for device and the other for the rest.
NVCC does not complain anything, but NVHPC will complain isnan is not defined.
TBH, I forgot whether I put __device__
or not when I encounter this issue.
I will check again
exec, | ||
[] GKO_KERNEL(auto idx, auto array) { | ||
if constexpr (std::is_same_v<remove_complex<ValueType>, half>) { | ||
// __half can not be from int64_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by that?
That half can't be converted to int64_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, __half
can not be converted from int64_t
.
cuda only writes the conversion from short
, int
, long long
and the corresponding unsigned
version.
Unfortuntately, it does not accepts int64_t even if long long
and int64_t
are the same technically.
3b05fc9
to
b5afcac
Compare
2aa59e2
to
70a8f26
Compare
…fallback to a working version if it is the case for matrix
Error: PR already merged! |
This PR adds the matrix and components (like arrary/device_matrix_data) with half precision support
Also, to avoid touch the files, which are not related to this PR, I add several type list with half additionally.
For example, RealValueTypes -> RealValueTypesWithHalf and next_precision -> next_precision_with_half
(We will add bfloat16 in the future, so maybe do not use Half)
for the friend and corresponding function
If we only use next in friend and function
Moreover, the second one does not work when we fallback the next_precision_with_half to next_precision by disabling half because next<next<value_type>> is value_type without half now. However, the first one always work.
TODO:
add as_device_type to sycl for gko::half <-> sycl::halfmove to Sycl Half #1710