Releases: NVIDIA/cutlass
CUTLASS 1.1
CUTLASS 1.1.0 release adds:
- Documentation
- Examples
- Turing Features
- Batched Strided GEMM
- Threadblock rasterization strategies
- Extended CUTLASS Core components
- Enhanced CUTLASS utilities
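
For readers unfamiliar with the "Batched Strided GEMM" item, the following is a minimal host-side sketch of what such an operation computes: a sequence of independent GEMMs whose operands sit at a fixed stride from one another in memory. The function name `batched_strided_gemm_ref`, its signature, and the column-major layout are assumptions made for illustration. This is not the CUTLASS API and not its device kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (CPU) batched strided GEMM, illustrative only:
//   C[b] = alpha * A[b] * B[b] + beta * C[b]   for b in [0, batch_count)
// Every matrix is column-major. Matrix b of each operand starts at
// offset b * stride_x in its buffer. Name and signature are hypothetical.
void batched_strided_gemm_ref(
    int m, int n, int k, int batch_count,
    float alpha,
    const std::vector<float>& A, int lda, std::size_t stride_a,
    const std::vector<float>& B, int ldb, std::size_t stride_b,
    float beta,
    std::vector<float>& C, int ldc, std::size_t stride_c) {
  // Leading dimensions must cover the rows of each column-major matrix.
  assert(lda >= m && ldb >= k && ldc >= m);

  for (int b = 0; b < batch_count; ++b) {
    // Locate the b-th matrix of each operand via its batch stride.
    const float* a_mat = A.data() + b * stride_a;
    const float* b_mat = B.data() + b * stride_b;
    float* c_mat = C.data() + b * stride_c;

    for (int j = 0; j < n; ++j) {
      for (int i = 0; i < m; ++i) {
        float acc = 0.0f;
        for (int p = 0; p < k; ++p) {
          acc += a_mat[i + p * lda] * b_mat[p + j * ldb];
        }
        c_mat[i + j * ldc] = alpha * acc + beta * c_mat[i + j * ldc];
      }
    }
  }
}
```

A single call replaces `batch_count` separate GEMM calls. The strides let all problems share one contiguous allocation per operand.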
CUTLASS 1.0.1
CUTLASS 1.0.1 release changes:
- Intra-threadblock reduction added for small threadblock tile sizes
  - sgemm_64x128x16, sgemm_128x128x16, sgemm_128x64x16, sgemm_128x32x16, sgemm_64x64x16, sgemm_64x32x16
  - igemm_32x32x128
- GEMM K residue handled during the prologue, prior to the mainloop
- Replaced the Google Test copy with a submodule. Use git submodule init
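
The "K residue handled during prologue" item can be pictured with a scalar analogy. When the K dimension is not a multiple of the tile depth, the leftover partial tile is consumed first, so the main loop only ever sees full tiles and needs no bounds checks. The sketch below applies that idea to a dot product. The name `dot_residue_first` and the tile depth `kTileK = 16` are assumptions chosen for illustration (16 matches the last dimension in the sgemm kernel names above). It is not the CUTLASS kernel code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile depth along K, chosen for illustration.
constexpr std::size_t kTileK = 16;

// Dot product that handles the K residue up front (illustrative only).
float dot_residue_first(const std::vector<float>& a,
                        const std::vector<float>& b) {
  assert(a.size() == b.size());
  const std::size_t k = a.size();
  const std::size_t residue = k % kTileK;  // size of the partial tile
  float acc = 0.0f;

  // Prologue: consume the partial tile (residue elements) first.
  for (std::size_t p = 0; p < residue; ++p) {
    acc += a[p] * b[p];
  }

  // Mainloop: the remaining length is an exact multiple of kTileK,
  // so every iteration processes a full tile without bounds checks.
  for (std::size_t base = residue; base < k; base += kTileK) {
    for (std::size_t p = 0; p < kTileK; ++p) {
      acc += a[base + p] * b[base + p];
    }
  }
  return acc;
}
```

Moving the irregular work to the front keeps the hot loop uniform, which is the usual motivation for this arrangement.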
CUTLASS 1.0.0
CUTLASS v1.0.0
CUTLASS 0.1.1
Final patch of CUTLASS v0.1.
CUTLASS 0.1.0
CUTLASS initial release.