Extract merge sort kernels to NVRTC compilable header #3438

Open
wants to merge 15 commits into
base: main

NaderAlAwar
Contributor

Description

Closes #3386

Similar to #2231 and #3334, this PR extracts the DeviceMergeSortBlockSortKernel, DeviceMergeSortPartitionKernel, and DeviceMergeSortMergeKernel kernels into an NVRTC-compilable header.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

NaderAlAwar requested review from a team as code owners on January 17, 2025, 15:06

🟨 CI finished in 1h 54m: Pass: 96%/78 | Total: 2d 02h | Avg: 39m 03s | Max: 1h 13m | Hits: 189%/10972
  • 🟨 thrust: Pass: 91%/37 | Total: 22h 55m | Avg: 37m 09s | Max: 1h 13m | Hits: 121%/7408

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  91%/35  | Total: 21h 49m | Avg: 37m 24s | Max:  1h 13m | Hits: 121%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 50s | Max: 34m 24s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 30m | Avg: 42m 02s | Max:  1h 07m | Hits: 108%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m
      🔍 12.6               Pass:  90%/30  | Total: 17h 09m | Avg: 34m 18s | Max:  1h 13m | Hits: 125%/5556  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 06m | Avg: 33m 04s | Max: 34m 04s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 30m | Avg: 42m 02s | Max:  1h 07m | Hits: 108%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m
      🔍 nvcc12.6           Pass:  89%/28  | Total: 16h 03m | Avg: 34m 24s | Max:  1h 13m | Hits: 125%/5556  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 06m | Avg: 33m 04s | Max: 34m 04s
      🔍 nvcc               Pass:  91%/35  | Total: 21h 48m | Avg: 37m 23s | Max:  1h 13m | Hits: 121%/7408  
    🚨 jobs: TestCPU 🚨
      🟩 Build              Pass: 100%/31  | Total: 21h 23m | Avg: 41m 23s | Max:  1h 13m | Hits: 121%/7408  
      🔥 TestCPU            Pass:   0%/3   | Total: 52m 58s | Avg: 17m 39s | Max: 36m 53s
      🟩 TestGPU            Pass: 100%/3   | Total: 39m 02s | Avg: 13m 00s | Max: 14m 45s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/14  | Total: 10h 35m | Avg: 45m 25s | Max:  1h 11m | Hits: 125%/5556  
      🔍 20                 Pass:  85%/21  | Total: 11h 38m | Avg: 33m 15s | Max:  1h 13m | Hits: 108%/1852  
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 20m | Avg: 35m 02s | Max: 37m 30s
      🟩 Clang15            Pass: 100%/1   | Total: 35m 52s | Avg: 35m 52s | Max: 35m 52s
      🟩 Clang16            Pass: 100%/1   | Total: 37m 42s | Avg: 37m 42s | Max: 37m 42s
      🟩 Clang17            Pass: 100%/1   | Total: 35m 29s | Avg: 35m 29s | Max: 35m 29s
      🟨 Clang18            Pass:  85%/7   | Total:  3h 12m | Avg: 27m 29s | Max: 36m 19s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 27s | Max: 36m 46s
      🟩 GCC8               Pass: 100%/1   | Total: 36m 19s | Avg: 36m 19s | Max: 36m 19s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 17s | Max: 36m 21s
      🟩 GCC10              Pass: 100%/1   | Total: 37m 57s | Avg: 37m 57s | Max: 37m 57s
      🟩 GCC11              Pass: 100%/1   | Total: 35m 27s | Avg: 35m 27s | Max: 35m 27s
      🟩 GCC12              Pass: 100%/1   | Total: 37m 40s | Avg: 37m 40s | Max: 37m 40s
      🟨 GCC13              Pass:  87%/8   | Total:  3h 08m | Avg: 23m 34s | Max: 37m 41s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m | Hits: 117%/3704  
      🟨 MSVC14.39          Pass:  66%/3   | Total:  3h 01m | Avg:  1h 00m | Max:  1h 13m | Hits: 125%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m
    🟨 cxx_family
      🟨 Clang              Pass:  92%/14  | Total:  7h 21m | Avg: 31m 32s | Max: 37m 42s
      🟨 GCC                Pass:  93%/16  | Total:  8h 01m | Avg: 30m 05s | Max: 37m 57s
      🟨 MSVC               Pass:  80%/5   | Total:  5h 16m | Avg:  1h 03m | Max:  1h 13m | Hits: 121%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m
    🟨 gpu
      🟨 v100               Pass:  91%/37  | Total: 22h 55m | Avg: 37m 09s | Max:  1h 13m | Hits: 121%/7408  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 40m 44s | Avg: 20m 22s | Max: 28m 41s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 24m 00s | Avg: 24m 00s | Max: 24m 00s
    
  • 🟩 cub: Pass: 100%/38 | Total: 1d 03h | Avg: 42m 58s | Max: 1h 10m | Hits: 330%/3564

    🟩 cpu
      🟩 amd64              Pass: 100%/36  | Total:  1d 01h | Avg: 42m 41s | Max:  1h 10m | Hits: 330%/3564  
      🟩 arm64              Pass: 100%/2   | Total:  1h 36m | Avg: 48m 19s | Max: 49m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 02m | Avg: 48m 33s | Max:  1h 03m | Hits: 329%/891   
      🟩 12.5               Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m
      🟩 12.6               Pass: 100%/31  | Total: 21h 00m | Avg: 40m 38s | Max:  1h 10m | Hits: 330%/2673  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 55m | Avg: 57m 30s | Max: 59m 55s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 02m | Avg: 48m 33s | Max:  1h 03m | Hits: 329%/891   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m
      🟩 nvcc12.6           Pass: 100%/29  | Total: 19h 05m | Avg: 39m 29s | Max:  1h 10m | Hits: 330%/2673  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 30s | Max: 59m 55s
      🟩 nvcc               Pass: 100%/36  | Total:  1d 01h | Avg: 42m 10s | Max:  1h 10m | Hits: 330%/3564  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 55m | Avg: 43m 57s | Max: 45m 57s
      🟩 Clang15            Pass: 100%/1   | Total: 47m 15s | Avg: 47m 15s | Max: 47m 15s
      🟩 Clang16            Pass: 100%/1   | Total: 42m 45s | Avg: 42m 45s | Max: 42m 45s
      🟩 Clang17            Pass: 100%/1   | Total: 42m 10s | Avg: 42m 10s | Max: 42m 10s
      🟩 Clang18            Pass: 100%/7   | Total:  4h 54m | Avg: 42m 00s | Max: 59m 55s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 30m | Avg: 45m 14s | Max: 47m 01s
      🟩 GCC8               Pass: 100%/1   | Total: 45m 38s | Avg: 45m 38s | Max: 45m 38s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 28m | Avg: 44m 03s | Max: 44m 57s
      🟩 GCC10              Pass: 100%/1   | Total: 46m 40s | Avg: 46m 40s | Max: 46m 40s
      🟩 GCC11              Pass: 100%/1   | Total: 42m 50s | Avg: 42m 50s | Max: 42m 50s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 28m | Avg: 29m 20s | Max: 47m 57s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 54m | Avg: 29m 21s | Max: 47m 32s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 06m | Hits: 331%/1782  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m | Hits: 329%/1782  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total: 10h 02m | Avg: 43m 00s | Max: 59m 55s
      🟩 GCC                Pass: 100%/18  | Total: 10h 36m | Avg: 35m 22s | Max: 47m 57s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 24m | Avg:  1h 06m | Max:  1h 10m | Hits: 330%/3564  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 40m 05s | Avg: 20m 02s | Max: 20m 32s
      🟩 v100               Pass: 100%/36  | Total:  1d 02h | Avg: 44m 15s | Max:  1h 10m | Hits: 330%/3564  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  1d 00h | Avg: 48m 13s | Max:  1h 10m | Hits: 330%/3564  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 17m 35s | Avg: 17m 35s | Max: 17m 35s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 36s | Avg: 14m 36s | Max: 14m 36s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 00m | Avg: 20m 15s | Max: 20m 53s
      🟩 TestGPU            Pass: 100%/2   | Total: 45m 35s | Avg: 22m 47s | Max: 23m 27s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 40m 05s | Avg: 20m 02s | Max: 20m 32s
      🟩 90a                Pass: 100%/1   | Total: 20m 42s | Avg: 20m 42s | Max: 20m 42s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 11h 58m | Avg: 51m 21s | Max:  1h 06m | Hits: 331%/2673  
      🟩 20                 Pass: 100%/24  | Total: 15h 14m | Avg: 38m 05s | Max:  1h 10m | Hits: 328%/891   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 18s | Avg: 5m 09s | Max: 8m 08s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 18s | Avg:  5m 09s | Max:  8m 08s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 10s | Avg:  2m 10s | Max:  2m 10s
      🟩 Test               Pass: 100%/1   | Total:  8m 08s | Avg:  8m 08s | Max:  8m 08s
    
  • 🟩 python: Pass: 100%/1 | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 28m 04s | Avg: 28m 04s | Max: 28m 04s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing


🟩 CI finished in 2h 10m: Pass: 100%/78 | Total: 2d 08h | Avg: 43m 17s | Max: 1h 17m | Hits: 195%/12824
  • 🟩 cub: Pass: 100%/38 | Total: 1d 07h | Avg: 50m 25s | Max: 1h 09m | Hits: 238%/3564

    🟩 cpu
      🟩 amd64              Pass: 100%/36  | Total:  1d 06h | Avg: 50m 04s | Max:  1h 09m | Hits: 238%/3564  
      🟩 arm64              Pass: 100%/2   | Total:  1h 53m | Avg: 56m 35s | Max: 56m 49s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 54m | Avg: 58m 49s | Max:  1h 02m | Hits: 238%/891   
      🟩 12.5               Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m
      🟩 12.6               Pass: 100%/31  | Total:  1d 00h | Avg: 48m 02s | Max:  1h 09m | Hits: 237%/2673  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 57m | Avg: 58m 35s | Max: 59m 39s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 54m | Avg: 58m 49s | Max:  1h 02m | Hits: 238%/891   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m
      🟩 nvcc12.6           Pass: 100%/29  | Total: 22h 52m | Avg: 47m 19s | Max:  1h 09m | Hits: 237%/2673  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 35s | Max: 59m 39s
      🟩 nvcc               Pass: 100%/36  | Total:  1d 05h | Avg: 49m 57s | Max:  1h 09m | Hits: 238%/3564  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 47m | Avg: 56m 47s | Max: 59m 59s
      🟩 Clang15            Pass: 100%/1   | Total: 56m 42s | Avg: 56m 42s | Max: 56m 42s
      🟩 Clang16            Pass: 100%/1   | Total: 54m 33s | Avg: 54m 33s | Max: 54m 33s
      🟩 Clang17            Pass: 100%/1   | Total: 59m 39s | Avg: 59m 39s | Max: 59m 39s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 35m | Avg: 47m 54s | Max: 59m 39s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 24s | Max: 58m 54s
      🟩 GCC8               Pass: 100%/1   | Total: 57m 35s | Avg: 57m 35s | Max: 57m 35s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 24s | Max: 55m 59s
      🟩 GCC10              Pass: 100%/1   | Total: 54m 33s | Avg: 54m 33s | Max: 54m 33s
      🟩 GCC11              Pass: 100%/1   | Total: 56m 04s | Avg: 56m 04s | Max: 56m 04s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 40m | Avg: 33m 22s | Max: 54m 20s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 51m | Avg: 36m 27s | Max:  1h 01m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 05m | Hits: 238%/1782  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 09m | Hits: 237%/1782  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total: 12h 13m | Avg: 52m 23s | Max: 59m 59s
      🟩 GCC                Pass: 100%/18  | Total: 13h 03m | Avg: 43m 31s | Max:  1h 01m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 26m | Avg:  1h 06m | Max:  1h 09m | Hits: 238%/3564  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 45m 46s | Avg: 22m 53s | Max: 26m 22s
      🟩 v100               Pass: 100%/36  | Total:  1d 07h | Avg: 51m 56s | Max:  1h 09m | Hits: 238%/3564  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  1d 05h | Avg: 56m 32s | Max:  1h 09m | Hits: 238%/3564  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 45s | Avg: 19m 45s | Max: 19m 45s
      🟩 GraphCapture       Pass: 100%/1   | Total: 21m 43s | Avg: 21m 43s | Max: 21m 43s
      🟩 HostLaunch         Pass: 100%/3   | Total: 59m 24s | Avg: 19m 48s | Max: 20m 22s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 06s | Max: 31m 13s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 46s | Avg: 22m 53s | Max: 26m 22s
      🟩 90a                Pass: 100%/1   | Total: 25m 21s | Avg: 25m 21s | Max: 25m 21s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 13h 47m | Avg: 59m 06s | Max:  1h 08m | Hits: 238%/2673  
      🟩 20                 Pass: 100%/24  | Total: 18h 08m | Avg: 45m 20s | Max:  1h 09m | Hits: 235%/891   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 23h 24m | Avg: 37m 57s | Max: 1h 17m | Hits: 178%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 57m 01s | Avg: 28m 30s | Max: 33m 52s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total: 22h 17m | Avg: 38m 13s | Max:  1h 17m | Hits: 178%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 16s | Max: 34m 58s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 29m | Avg: 41m 54s | Max:  1h 07m | Hits: 142%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 12m
      🟩 12.6               Pass: 100%/30  | Total: 17h 30m | Avg: 35m 00s | Max:  1h 17m | Hits: 187%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 30m 56s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 29m | Avg: 41m 54s | Max:  1h 07m | Hits: 142%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 12m
      🟩 nvcc12.6           Pass: 100%/28  | Total: 16h 28m | Avg: 35m 18s | Max:  1h 17m | Hits: 187%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 30m 56s
      🟩 nvcc               Pass: 100%/35  | Total: 22h 22m | Avg: 38m 21s | Max:  1h 17m | Hits: 178%/9260  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 21m | Avg: 35m 23s | Max: 35m 37s
      🟩 Clang15            Pass: 100%/1   | Total: 34m 37s | Avg: 34m 37s | Max: 34m 37s
      🟩 Clang16            Pass: 100%/1   | Total: 36m 32s | Avg: 36m 32s | Max: 36m 32s
      🟩 Clang17            Pass: 100%/1   | Total: 37m 30s | Avg: 37m 30s | Max: 37m 30s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 06m | Avg: 26m 36s | Max: 36m 26s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 24s | Max: 34m 25s
      🟩 GCC8               Pass: 100%/1   | Total: 37m 35s | Avg: 37m 35s | Max: 37m 35s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 14m | Avg: 37m 14s | Max: 37m 42s
      🟩 GCC10              Pass: 100%/1   | Total: 36m 24s | Avg: 36m 24s | Max: 36m 24s
      🟩 GCC11              Pass: 100%/1   | Total: 35m 41s | Avg: 35m 41s | Max: 35m 41s
      🟩 GCC12              Pass: 100%/1   | Total: 39m 23s | Avg: 39m 23s | Max: 39m 23s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 28m | Avg: 26m 02s | Max: 39m 48s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m | Hits: 135%/3704  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 07m | Avg:  1h 02m | Max:  1h 17m | Hits: 207%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 12m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  7h 16m | Avg: 31m 10s | Max: 37m 30s
      🟩 GCC                Pass: 100%/16  | Total:  8h 20m | Avg: 31m 17s | Max: 39m 48s
      🟩 MSVC               Pass: 100%/5   | Total:  5h 22m | Avg:  1h 04m | Max:  1h 17m | Hits: 178%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 12m
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 23h 24m | Avg: 37m 57s | Max:  1h 17m | Hits: 178%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total: 21h 40m | Avg: 41m 57s | Max:  1h 17m | Hits: 132%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 52m 21s | Avg: 17m 27s | Max: 36m 54s | Hits: 365%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 51m 24s | Avg: 17m 08s | Max: 23m 09s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 22m 15s | Avg: 22m 15s | Max: 22m 15s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 10h 35m | Avg: 45m 24s | Max:  1h 12m | Hits: 133%/5556  
      🟩 20                 Pass: 100%/21  | Total: 11h 51m | Avg: 33m 52s | Max:  1h 17m | Hits: 247%/3704  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 11m 08s | Avg: 5m 34s | Max: 8m 53s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 11m 08s | Avg:  5m 34s | Max:  8m 53s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
      🟩 Test               Pass: 100%/1   | Total:  8m 53s | Avg:  8m 53s | Max:  8m 53s
    
  • 🟩 python: Pass: 100%/1 | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 45m 13s | Avg: 45m 13s | Max: 45m 13s
    


Projects
Status: In Review
Development

Successfully merging this pull request may close these issues.

Extract merge sort kernels to NVRTC compilable header