Added CK-gemm runner #674
base: main_perf
@zhanglx13 @giuseros @sjw36, could you please review the PR? |
Let me give it a try; I'll review the details later today. |
I'm getting the following error
Is there some dependency that we need to install before installing CK? |
Hi @zhanglx13, we usually have the ROCm software stack installed on all of our dev machines (under |
Yes, my base docker is rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 |
I tried to reproduce the error with your Docker image but could not. Are you sure that you followed the documentation?
CK_INSTALL_DIR=$(realpath .)/usr/ck
mkdir -p ${CK_INSTALL_DIR}
git clone https://github.com/ROCm/composable_kernel.git
cd composable_kernel
mkdir -p build && cd build
# specify your target here
TARGET="gfx942"
# Edit `./profiler/src/CMakeLists.txt` to speed up compilation (see below)
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake .. \
-DCMAKE_LINKER_TYPE=LLD \
-DCMAKE_PREFIX_PATH=/opt/rocm \
-DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
-DCMAKE_BUILD_TYPE=Release \
-DGPU_TARGETS="${TARGET}" \
-DGPU_ARCHS="${TARGET}" \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_INSTALL_PREFIX="${CK_INSTALL_DIR}" \
-DBUILD_TESTING=OFF \
-DCK_PARALLEL_COMPILE_JOBS=20 \
-DCK_PARALLEL_LINK_JOBS=20
make -j<num_proc>
make install |
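Before running the configure step above, it may help to confirm that the tool paths it refers to actually exist in the container. The following is a minimal sketch, assuming the default `/opt/rocm` layout used in those commands; the helper name `check_rocm_tools` is hypothetical and not part of this PR or of CK.

```shell
# Hypothetical helper: report which of the given paths are missing.
# Returns 0 when every path exists and is executable, 1 otherwise.
check_rocm_tools() {
    missing=0
    for tool in "$@"; do
        if [ -x "$tool" ]; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, using the paths from the cmake invocation above:
# check_rocm_tools /opt/rocm/llvm/bin/clang /opt/rocm/llvm/bin/clang++ /opt/rocm/bin/hipcc
```

A non-zero return would suggest that the ROCm installation is incomplete or lives under a different prefix, in which case `-DCMAKE_PREFIX_PATH` and the compiler paths would need to be adjusted accordingly.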
What is the purpose of this PR? Is the intent to compare performance with CK GEMMs in our performance regression testing? |
CK has |
Thanks. I understand what the code is doing. I want to understand why we need this in the kernels repository. Is this something that we think we will use frequently across multiple users? Do we have resources committed to maintaining this piece of code? |
I remember sharing this code with one or two developers (I cannot remember their names). To me, that is always a sign that such a tool belongs in a shared remote repository. P.S. I remember that we did performance comparisons between rocMLIR and CK/rocBLAS kernels in the rocMLIR project. |
This is not maintenance burden. Maintenance burden is the effort needed to ensure that this code is functional and keeps running for eternity. That includes ensuring we keep up with any changes on the CK side to the APIs that are being called here. The kernels team cannot commit to this effort. IMO this is something better suited for a Confluence page. However, if the compiler team can commit to maintaining this, I'm fine with checking it in here. CC: @sjw36 |
Added CK-gemm runner. The user can replace DeviceGemmInstance in kernel.cpp. You will find the amdgcn assembly of the kernel in the build directory after compilation.