Dockerfile should use the correct nvidia channel to install libcu* #423
Signed-off-by: Praateek <[email protected]>
@@ -29,14 +29,15 @@ LABEL "nemo.library"=${IMAGE_LABEL}
 WORKDIR /opt

 # Install the minimal libcu* libraries needed by NeMo Curator
-RUN conda create -y --name curator -c conda-forge -c nvidia \
+RUN conda create -y --name curator -c nvidia/label/cuda-${CUDA_VER} -c conda-forge \
   python=3.10 \
Looks like gpuCI is currently throwing an error here.
Apparently build args can't be used in RUN, so I had to create an env var, which is ugly.
Oh lol, interesting. LGTM!
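For readers following the build-arg workaround from the thread above, here is a minimal sketch of the ARG-to-ENV pattern. It is not the exact change in this PR. The base image and the default value of `CUDA_VER` are assumptions chosen for illustration. The thread does not say why the build arg failed in RUN. One common cause is an `ARG` declared before `FROM`, which is out of scope inside the build stage unless it is redeclared there.

```dockerfile
# Assumption: a stand-in base image with conda on PATH, not the image this PR uses.
FROM condaforge/miniforge3

# ARG is build-time only. Declaring it after FROM puts it in scope for this stage.
# The default value here is illustrative.
ARG CUDA_VER=12.5.1

# Copy the build arg into an environment variable, as described in the thread.
ENV CUDA_VER=${CUDA_VER}

# The shell running this command expands ${CUDA_VER} from the environment.
RUN conda create -y --name curator \
      -c nvidia/label/cuda-${CUDA_VER} -c conda-forge \
      python=3.10
```

Building with `docker build --build-arg CUDA_VER=<version> .` would override the default.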
Description
Realised that while we install cuda-12.5.1 in the CI image, the conda installs are inconsistent: some packages are being picked up from conda-forge rather than nvidia. (IIUC, with -c A -c B, A takes precedence over B, and in our case conda-forge might have older versions of the packages.)
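A rough way to check which channel each package actually came from is to list the environment with channel URLs right after creating it. The fragment below is only a sketch. The base image and the `CUDA_VER` default are assumptions, and the `conda list` step is added for illustration and is not part of this PR.

```dockerfile
# Assumption: a stand-in base image with conda on PATH.
FROM condaforge/miniforge3

# Illustrative default. The CI image described here installs cuda-12.5.1.
ARG CUDA_VER=12.5.1
ENV CUDA_VER=${CUDA_VER}

# Channels given with -c are prioritised in the order listed, so the
# versioned nvidia label is consulted before conda-forge.
RUN conda create -y --name curator \
      -c nvidia/label/cuda-${CUDA_VER} -c conda-forge \
      python=3.10

# Print every installed package together with its source channel, so
# packages resolved from conda-forge are easy to spot in the build log.
RUN conda list --name curator --show-channel-urls
```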
Secondly, if we wish to use UDFs on dask_cudf we need NVVM. The place where I needed it was a hack for #417.
Checklist