Dockerfile should use the correct nvidia channel to install libcu* #423

Merged: 4 commits, Dec 12, 2024

Conversation

@praateekmahajan (Collaborator) commented Dec 11, 2024

Description

Realised that while we're installing cuda-12.5.1 in the CI image, the conda installations are all over the place: some packages are being picked up from conda-forge rather than nvidia. (IIUC, with -c A -c B, A takes precedence over B, and in our case conda-forge might have older versions of some packages.)

Secondly, if we wish to use UDFs on dask_cudf, we need NVVM. The place where I needed it was the hack for #417.
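
A minimal sketch of the channel-ordering point above, not the actual change in this PR. The base image (`condaforge/miniforge3`) and the `libcublas` package are assumptions picked for illustration, and the CUDA version is hardcoded to keep the example short. Listing installed packages with their channel URLs is one way to check where each package was resolved from.

```dockerfile
# Sketch only: the base image and the libcublas package are illustrative assumptions.
FROM condaforge/miniforge3

# Channels listed first take precedence, so the versioned nvidia label
# comes before conda-forge. The final command prints every installed
# package together with the channel it came from.
RUN conda create -y --name curator \
      -c nvidia/label/cuda-12.5.1 \
      -c conda-forge \
      python=3.10 \
      libcublas \
    && conda list --name curator --show-channel-urls
```

If a libcu* package still shows a conda-forge URL in that listing, the channel order or the label is probably not what was intended.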

Checklist

  • I am familiar with the Contributing Guide.
  • New or Existing tests cover these changes.
  • The documentation is up to date with these changes.

Signed-off-by: Praateek <[email protected]>
Signed-off-by: Praateek <[email protected]>
@praateekmahajan praateekmahajan added bug Something isn't working gpuci Run GPU CI/CD on PR labels Dec 11, 2024
@sarahyurick sarahyurick added gpuci Run GPU CI/CD on PR and removed gpuci Run GPU CI/CD on PR labels Dec 11, 2024
@@ -29,14 +29,15 @@ LABEL "nemo.library"=${IMAGE_LABEL}
WORKDIR /opt

# Install the minimal libcu* libraries needed by NeMo Curator
-RUN conda create -y --name curator -c conda-forge -c nvidia \
+RUN conda create -y --name curator -c nvidia/label/cuda-${CUDA_VER} -c conda-forge \
python=3.10 \
Collaborator:

Looks like gpuCI is currently throwing an error here.

Collaborator Author:

Apparently build args can't be used in RUN, so I had to create an env var, which is ugly.
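
For context, a hedged sketch of the ARG-to-ENV pattern described here. The thread does not show the final Dockerfile, so the base image, the default version, and the placement of the `ARG` lines are all assumptions. One common reason a build arg appears empty in `RUN` is that it was declared before `FROM`, which puts it out of scope inside the build stage until it is redeclared.

```dockerfile
# Sketch only: base image and default version are assumptions.
ARG CUDA_VER=12.5.1
FROM condaforge/miniforge3

# An ARG declared before FROM is not visible inside the stage;
# redeclaring it without a value brings the earlier default back into scope.
ARG CUDA_VER

# Copy the build arg into an environment variable so that later RUN
# steps (and the running container) can read it.
ENV CUDA_VER=${CUDA_VER}

RUN conda create -y --name curator \
      -c nvidia/label/cuda-${CUDA_VER} \
      -c conda-forge \
      python=3.10
```

With the redeclared `ARG` in place, `RUN` can also expand `${CUDA_VER}` without the `ENV` line; the `ENV` is only needed when the value should persist in the final image.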

Collaborator:

Oh lol, interesting. LGTM!

Signed-off-by: Praateek <[email protected]>
@praateekmahajan praateekmahajan added gpuci Run GPU CI/CD on PR and removed gpuci Run GPU CI/CD on PR labels Dec 11, 2024
Signed-off-by: Praateek <[email protected]>
@praateekmahajan praateekmahajan added gpuci Run GPU CI/CD on PR and removed gpuci Run GPU CI/CD on PR labels Dec 11, 2024

@praateekmahajan praateekmahajan merged commit f56e924 into NVIDIA:main Dec 12, 2024
5 checks passed
Labels
bug Something isn't working gpuci Run GPU CI/CD on PR

2 participants