Half matrix and components #1708

yhmtsai · 2024-10-24T15:30:15Z

This PR adds half-precision support to the matrix classes and components (such as array and device_matrix_data).

Also, to avoid touching files that are unrelated to this PR, I add several additional type lists that include half.
For example, RealValueTypes -> RealValueTypesWithHalf and next_precision -> next_precision_with_half
(We will add bfloat16 in the future, so maybe the names should not use Half.)

For the friend declaration and the corresponding function, I use

friend class <prev<value_type>> // the class with the previous precision can access this class
function(class <next<value_type>>) // this class can access the class with the next precision, because it is the previous-precision class of that class

If we only used next in both the friend declaration and the function, it would be

friend class <next<next<value_type>>>
function(class <next<value_type>>)

However, the second variant does not work when next_precision_with_half falls back to next_precision because half is disabled: without half, next<next<value_type>> is value_type itself. The first variant always works.
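
The friend pattern described above can be sketched as follows. This is a minimal illustration, not Ginkgo's code: `half_t`, `enable_half`, `next_impl`, `prev_impl` and `Vec` are hypothetical names, and the precision cycle shown is an assumption. The point is that `prev<next<T>>` is `T` in both configurations, so declaring the previous-precision class as a friend is enough for a function that takes the next-precision class.

```cpp
#include <type_traits>

// Hypothetical stand-in for a half type (placeholder storage, not a real
// 16-bit encoding and not gko::half).
struct half_t {
    float stored;
    constexpr half_t(double v = 0.0) : stored(static_cast<float>(v)) {}
    constexpr operator double() const { return stored; }
};

// Hypothetical switch: set to false to emulate a build without half.
constexpr bool enable_half = true;

// Assumed cycle with half: half -> float -> double -> half.
// Without half it collapses to float <-> double.
template <typename T> struct next_impl;
template <> struct next_impl<half_t> { using type = float; };
template <> struct next_impl<float> { using type = double; };
template <> struct next_impl<double> {
    using type = std::conditional_t<enable_half, half_t, float>;
};
template <typename T>
using next_precision = typename next_impl<T>::type;

template <typename T> struct prev_impl;
template <> struct prev_impl<half_t> { using type = double; };
template <> struct prev_impl<float> {
    using type = std::conditional_t<enable_half, half_t, double>;
};
template <> struct prev_impl<double> { using type = float; };
template <typename T>
using previous_precision = typename prev_impl<T>::type;

// prev<next<T>> is T whether the cycle has two or three members.
static_assert(
    std::is_same<previous_precision<next_precision<float>>, float>::value,
    "prev(next(float)) must be float");
static_assert(
    std::is_same<previous_precision<next_precision<double>>, double>::value,
    "prev(next(double)) must be double");

template <typename T>
class Vec {
    // The class with the previous precision may touch our private members.
    friend class Vec<previous_precision<T>>;

public:
    constexpr explicit Vec(T v) : value_(v) {}

    // Writes into the next-precision class; allowed because that class
    // lists Vec<prev<next<T>>> == Vec<T> as a friend.
    constexpr void convert_to(Vec<next_precision<T>>& result) const
    {
        result.value_ = static_cast<next_precision<T>>(value_);
    }

    constexpr T get() const { return value_; }

private:
    T value_;
};

// Small driver: convert a float vector entry to double.
constexpr double convert_float_to_double(float x)
{
    Vec<float> a{x};
    Vec<double> b{0.0};
    a.convert_to(b);
    return b.get();
}
```

With a `friend class Vec<next_precision<next_precision<T>>>` declaration instead, the two-member cycle would make each class a friend of itself only, and `convert_to` would no longer compile.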

TODO:

merge Half base type #1706
~~add as_device_type to sycl for gko::half <-> sycl::half~~ move to Sycl Half #1710
ensure we do not call 16-bit fake atomic

MarcelKoch

Mostly looks good. There are still some places where the new half-enabled types are missing.

include/ginkgo/core/base/segmented_array.hpp

include/ginkgo/core/base/precision_dispatch.hpp

include/ginkgo/core/matrix/row_gatherer.hpp

MarcelKoch · 2024-10-28T08:06:22Z

core/test/utils.hpp

 using ComplexAndPODTypes = merge_type_list_t<ComplexValueTypes, PODTypes>;

+using ComplexAndPODTypesWithHalf =
+    merge_type_list_t<ComplexValueTypesWithHalf, PODTypes>;


There are still instances where the type list without half is used.

I do not get it. Could you elaborate?

If you search for ComplexValueTypes, you will find it still in use in places where the version with half should be used instead, e.g. the reference array tests.

The PermuteIterator test in core/test/base/iterator_factory.cpp still uses ComplexAndPODTypes without half. Is that intended?

Good catch. I have now changed all uses of ComplexAndPODTypes to the version with half as well.
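
To illustrate why every test has to name the half-enabled list explicitly, here is a minimal sketch of merging type lists. Only the alias names `merge_type_list_t`, `PODTypes`, `ComplexValueTypes`, `ComplexValueTypesWithHalf`, `ComplexAndPODTypes` and `ComplexAndPODTypesWithHalf` come from the diff above; `type_list`, `contains`, `complex_half_t` and the list contents are hypothetical and are not the definitions used in core/test/utils.hpp.

```cpp
#include <complex>
#include <type_traits>

// Hypothetical placeholder for the complex half type.
struct complex_half_t {};

// Hypothetical type list; the real test utilities may use another container.
template <typename... Ts>
struct type_list {};

// Concatenate two type lists.
template <typename A, typename B>
struct merge_type_list;

template <typename... As, typename... Bs>
struct merge_type_list<type_list<As...>, type_list<Bs...>> {
    using type = type_list<As..., Bs...>;
};

template <typename A, typename B>
using merge_type_list_t = typename merge_type_list<A, B>::type;

// Membership check, used only to inspect the merged lists.
template <typename T, typename List>
struct contains : std::false_type {};

template <typename T, typename Head, typename... Tail>
struct contains<T, type_list<Head, Tail...>>
    : std::conditional_t<std::is_same<T, Head>::value, std::true_type,
                         contains<T, type_list<Tail...>>> {};

// Assumed list contents, chosen for illustration only.
using PODTypes = type_list<int, long>;
using ComplexValueTypes =
    type_list<std::complex<float>, std::complex<double>>;
using ComplexValueTypesWithHalf =
    type_list<complex_half_t, std::complex<float>, std::complex<double>>;

using ComplexAndPODTypes = merge_type_list_t<ComplexValueTypes, PODTypes>;
using ComplexAndPODTypesWithHalf =
    merge_type_list_t<ComplexValueTypesWithHalf, PODTypes>;
```

Because the two merged lists are separate aliases, a test that is still typed over `ComplexAndPODTypes` never instantiates the half case, which is what the review comments point out.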

core/test/utils.hpp

hip/components/cooperative_groups.hip.hpp

MarcelKoch · 2024-10-28T10:07:29Z

omp/components/atomic.hpp

+    static_assert(sizeof(ValueType) == sizeof(ResultType),
+                  "The type to reinterpret to must be of the same size as the "
+                  "original type.");
+    return reinterpret_cast<ResultType&>(val);


Maybe just use memcpy here directly.

I would also prefer if you used memcpy instead (because it is defined behavior):

Suggested change

-    return reinterpret_cast<ResultType&>(val);
+    ResultType res;
+    using std::memcpy;
+    memcpy(&res, &val, sizeof(ValueType));
+    return res;

I still think this should be changed.

I will change it, but I do not think it helps, because we still need the same undefined behavior outside this function to make it work.
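
For reference, a self-contained version of the memcpy-based conversion might look like the sketch below. The name `copy_reinterpret` and the helper `check_round_trip` are hypothetical, and the extra trivially-copyable check is an addition that is not part of the suggested change.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies the object representation of val into a value of ResultType.
// Unlike reinterpret_cast<ResultType&>(val), this does not access val
// through an lvalue of an unrelated type.
template <typename ResultType, typename ValueType>
ResultType copy_reinterpret(const ValueType& val)
{
    static_assert(sizeof(ValueType) == sizeof(ResultType),
                  "The type to reinterpret to must be of the same size as "
                  "the original type.");
    static_assert(std::is_trivially_copyable<ValueType>::value &&
                      std::is_trivially_copyable<ResultType>::value,
                  "memcpy requires trivially copyable types.");
    ResultType res;
    std::memcpy(&res, &val, sizeof(ValueType));
    return res;
}

// Round trip: float -> 32-bit integer representation -> float.
inline void check_round_trip(float x)
{
    const auto bits = copy_reinterpret<std::uint32_t>(x);
    assert(copy_reinterpret<float>(bits) == x);
}
```

This version returns a copy rather than a reference, so it only covers reading the bits of a value. Code that needs to operate on the original memory location, as an atomic update does, is not addressed by it.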

omp/components/atomic.hpp

reference/matrix/ell_kernels.cpp

include/ginkgo/core/base/math.hpp

CMakeLists.txt

thoasm

Part 1 / 2 of my review. So far, I only have small comments.

thoasm · 2024-11-05T10:07:20Z

CMakeLists.txt

+option(GINKGO_ENABLE_HALF "Enable the use of half precision" ON)
+# We do not support MSVC. SYCL will come later
+if(MSVC OR GINKGO_BUILD_SYCL)
+    message(STATUS "HALF is not supported in MSVC, and later support in SYCL")


This needs to be rephrased since I really don't know what you mean by "and later support in SYCL".
Do you mean that SYCL does support half-precision in a later version?

Yes, SYCL support will be enabled in #1710.
Since half is trivially copyable again, we might not need the device_type mapping, though.

common/cuda_hip/base/math.hpp

thoasm · 2024-11-05T10:53:37Z

common/cuda_hip/base/math.hpp

+// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host
+// compiler.


I don't quite get the meaning, maybe:

Suggested change

-// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host
-// compiler.
+// It is required by NVHPC 23.3, `isnan` is undefined when NVHPC is only used as a host
+// compiler.

If I recall correctly, CUDA goes through the code twice: once for the device and once for the rest.
NVCC does not complain about anything, but NVHPC complains that isnan is not defined.
TBH, I forgot whether I had put __device__ there when I encountered this issue.
I will check again.

common/cuda_hip/base/types.hpp

thoasm · 2024-11-05T12:41:21Z

common/unified/components/fill_array_kernels.cpp

+        exec,
+        [] GKO_KERNEL(auto idx, auto array) {
+            if constexpr (std::is_same_v<remove_complex<ValueType>, half>) {
+                // __half can not be from int64_t


What do you mean by that?
That half can't be converted to int64_t?

No, __half cannot be constructed from int64_t.
CUDA only provides the conversions from short, int, long long, and the corresponding unsigned types.
Unfortunately, it does not accept int64_t, even though long long and int64_t are technically the same.
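
One plausible reading of this limitation can be reproduced in plain C++. The sketch assumes a platform where int64_t is an alias for long (common on 64-bit Linux). `narrow_float` and `from_index` are hypothetical; `narrow_float` is not CUDA's __half and only mimics a type that offers converting constructors for the integer types listed above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in with a fixed set of integer constructors.
struct narrow_float {
    float value;
    constexpr narrow_float(short v) : value(static_cast<float>(v)) {}
    constexpr narrow_float(unsigned short v) : value(static_cast<float>(v)) {}
    constexpr narrow_float(int v) : value(static_cast<float>(v)) {}
    constexpr narrow_float(unsigned int v) : value(static_cast<float>(v)) {}
    constexpr narrow_float(long long v) : value(static_cast<float>(v)) {}
    constexpr narrow_float(unsigned long long v)
        : value(static_cast<float>(v))
    {}
};

// long matches none of the constructors exactly, and every candidate needs
// an integral conversion of equal rank, so overload resolution is ambiguous.
static_assert(!std::is_constructible<narrow_float, long>::value,
              "construction from long is ambiguous");

// Workaround: cast the index to a type that has an exact-match constructor.
template <typename IndexType>
constexpr narrow_float from_index(IndexType idx)
{
    return narrow_float(static_cast<long long>(idx));
}
```

On platforms where int64_t is long long the direct construction compiles, so going through an explicit cast keeps the code portable either way.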


ginkgo-bot · 2024-12-03T01:21:52Z

Error: PR already merged!

yhmtsai requested review from a team October 24, 2024 15:30

yhmtsai self-assigned this Oct 24, 2024

yhmtsai mentioned this pull request Oct 24, 2024

Half precision support #1257

Closed

12 tasks

yhmtsai force-pushed the half_matrix branch 2 times, most recently from 8f3a17d to b7d4a15 Compare October 25, 2024 08:20

yhmtsai added this to the Ginkgo 1.9.0 milestone Oct 25, 2024

yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Oct 25, 2024

yhmtsai force-pushed the half_matrix branch from b7d4a15 to dc5a7e6 Compare October 25, 2024 15:29

yhmtsai force-pushed the half_type branch from 07dcc75 to fac512f Compare October 25, 2024 15:29

MarcelKoch self-requested a review October 25, 2024 15:54

MarcelKoch requested changes Oct 28, 2024


yhmtsai force-pushed the half_matrix branch from dc5a7e6 to f53438a Compare October 28, 2024 16:12

yhmtsai force-pushed the half_type branch from fac512f to bfbe44b Compare October 28, 2024 16:12

yhmtsai force-pushed the half_matrix branch from f53438a to 5b22346 Compare October 28, 2024 17:19

yhmtsai force-pushed the half_type branch from bfbe44b to 0c24e81 Compare October 28, 2024 17:19

yhmtsai force-pushed the half_matrix branch 2 times, most recently from ee4be45 to 3037d52 Compare October 29, 2024 10:51

MarcelKoch requested a review from thoasm October 30, 2024 14:07

MarcelKoch reviewed Oct 31, 2024


yhmtsai force-pushed the half_type branch from 0c24e81 to 7eaf547 Compare November 4, 2024 09:43

yhmtsai force-pushed the half_matrix branch from 3037d52 to 39b79d5 Compare November 4, 2024 14:24

thoasm reviewed Nov 5, 2024


yhmtsai force-pushed the half_type branch 2 times, most recently from 3b05fc9 to b5afcac Compare November 30, 2024 01:30

yhmtsai force-pushed the half_matrix branch 2 times, most recently from 2aa59e2 to 70a8f26 Compare November 30, 2024 18:36

yhmtsai requested a review from MarcelKoch December 1, 2024 00:41

MarcelKoch approved these changes Dec 1, 2024


yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. 1:ST:skip-full-test and removed 1:ST:ready-for-review This PR is ready for review labels Dec 3, 2024

Base automatically changed from half_type to develop December 3, 2024 01:08

yhmtsai added 16 commits December 3, 2024 02:09

instantiation/testing/next/prev/stub type definition

502df54

half option

c53ab57

device type mapping

6f14b13

consider custom namespace for thrust::complex<__half> and benchmark

476cf28

atomic and cooperative_groups

7cc8c6f

fix math and device_numeric_limit

a85f462

array operation in half

7b4829c

matrix with half

561f173

device_matrix_data and mtx_io

036485a

components such as array/iterator/segmented_array test with half

b29a8f6

matrix test with half

2be0042

base such as composition/combination with half and corr. test

8d3e4b5

test_utils test

b2fa55a

constexpr restriction for nvc++

8910f83

cuda with CC<70 and hip do not support 16 bit atomic. throw error or …

0421615

…fallback to a working version if it is the case for matrix

implement half shuffle via 32 bit impl

8190bf6

yhmtsai force-pushed the half_matrix branch from 70a8f26 to 8190bf6 Compare December 3, 2024 01:09

yhmtsai merged commit 76ef161 into develop Dec 3, 2024
9 of 11 checks passed

yhmtsai deleted the half_matrix branch December 3, 2024 01:11
