Rapids on snowflake deployment #493

Open
wants to merge 6 commits into
base: main

Contributor

@ncclementi ncclementi commented Jan 16, 2025

Closes the deployment part of #419

For the "Show reading data from Snowflake into a cudf dataframe" part, I created #494 for tracking purposes.

This is WIP but almost ready for review; it needs a few more sections with screenshots.

TODO:

  • Figure out the CI failures. I'm not sure why they happen: I can build locally, although I get a warning, and I assume CI fails on warnings. It seems it doesn't like the $. Any suggestions, @jacobtomlinson?

@ncclementi
Contributor Author

ncclementi commented Jan 16, 2025

Can't seem to find a fix for the failure:

This looks like a known issue, sphinx-doc/sphinx#3175, but the suggested fix I found relies on the Sphinx syntax below, and when I try it, it doesn't render as expected. Any ideas?

.. code-block:: sql
   :force:

   USE ROLE CONTAINER_USER_ROLE;
   CALL SYSTEM$REGISTRY_LIST_IMAGES('/CONTAINER_HOL_DB/PUBLIC/IMAGE_REPO');

@ncclementi
Contributor Author

ncclementi commented Jan 16, 2025

@jacobtomlinson I think this is ready for a first review. That said, while checking whether the volume mount instructions were working correctly, I realized that the mount won't work unless a workspace directory already exists.

We have two options here.

  1. Just mount /home/rapids so all the changes made to the notebooks in there would remain, as well as things added.
  2. Modify the Dockerfile so that we create this workspace. I think it would be something like the following.
    Any strong preferences here?
FROM rapidsai/notebooks:25.02a-cuda11.8-py3.11-amd64

RUN pip install "snowflake-snowpark-python[pandas]" snowflake-connector-python

USER rapids
WORKDIR /home/rapids/workspace

TODO

  • Add a note on the volume mount and how to use it after making a decision.
  • Update the Dockerfile image? It currently uses a nightly build.

@ncclementi ncclementi marked this pull request as ready for review January 16, 2025 21:14
@ncclementi ncclementi requested a review from a team as a code owner January 16, 2025 21:14
Member

@jacobtomlinson jacobtomlinson left a comment

I struggled with a permissions error when following these instructions. Could you take a look and see what I'm doing wrong?

Also, to answer your question: I wouldn't set the working directory, I would just create an empty directory.

We don't need to create the directory ahead of time. Setting mountPath: /home/rapids/notebooks/workspace works for me.

@@ -0,0 +1,343 @@
# Snowflake

You can install RAPIDS on Snowflake via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).
Member

First mention of Snowflake should be a link.

Suggested change
You can install RAPIDS on Snowflake via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).
You can install RAPIDS on [Snowflake](https://www.snowflake.com) via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).

Create a Dockerfile as follows:

```Dockerfile
FROM rapidsai/notebooks:25.02a-cuda11.8-py3.11-amd64
```
Member

This will go stale. We usually use templating to ensure it always shows the correct RAPIDS image. The requirement to use Python 3.11 may add a little complexity here.

Contributor Author

Yes, the requirement of Python 3.11 and an explicit amd64 platform will make the templating a bit more complicated. I'll look into it.

Comment on lines 116 to 127
Create a conda/mamba environment with SnowCLI.

```yaml
name: snow-cli
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - snowflake-cli
```
Member

Creating a whole conda environment to install the snowflake CLI seems specific to a subset of users.

I would suggest just directing folks to the snowflake CLI install docs and letting them install it in whatever way makes sense to them.

For example my personal preference would be to use homebrew on my mac, which their docs include. Or maybe use uv tool to install the Python package.

Comment on lines 155 to 163
```{note}
If you don't recall `<ORG>-<ACCOUNT-NAME>` you can obtain them
by running the following in the Snowflake SQL worksheet.
```

```sql
SELECT CURRENT_ORGANIZATION_NAME(); --org
SELECT CURRENT_ACCOUNT_NAME(); --account name
```
Member

@jacobtomlinson jacobtomlinson Jan 17, 2025

I didn't see this note until after I had spent a while trying to figure out where to find this info. Maybe move this up, or mention it in the comment next to `account : <ORG>-<ACCOUNT-NAME> # e.g. MYORGANIZATION-MYACCOUNT`.
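
As a rough illustration of how the two values returned by that SQL fit together, here is a minimal Python sketch. The helper names `account_identifier` and `image_reference` are hypothetical, and lowercasing the registry host is an assumption based on the image line in the spec quoted in this review, not something confirmed against Snowflake's docs.

```python
def account_identifier(org: str, account: str) -> str:
    """Join the organization and account names as <ORG>-<ACCOUNT-NAME>."""
    return f"{org}-{account}"


def image_reference(
    org: str,
    account: str,
    repo_path: str = "container_hol_db/public/image_repo",
    image: str = "rapids-nb-snowflake",
    tag: str = "dev",
) -> str:
    """Build an image reference shaped like the one in the service spec."""
    # Assumption: the registry host is the lowercased account identifier,
    # mirroring the "<org-account>.registry.snowflakecomputing.com" line in the spec.
    host = f"{account_identifier(org, account).lower()}.registry.snowflakecomputing.com"
    return f"{host}/{repo_path}/{image}:{tag}"


# Example values taken from the "e.g." comment in the config file.
print(account_identifier("MYORGANIZATION", "MYACCOUNT"))
print(image_reference("MYORGANIZATION", "MYACCOUNT"))
```

The org and account values would come from `CURRENT_ORGANIZATION_NAME()` and `CURRENT_ACCOUNT_NAME()` in a SQL worksheet.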

source/platforms/snowflake.md
image: <org-account>.registry.snowflakecomputing.com/container_hol_db/public/image_repo/rapids-nb-snowflake:dev
volumeMounts:
  - name: rapids-notebooks
    mountPath: home/rapids
Member

This should be

Suggested change
mountPath: home/rapids
mountPath: /home/rapids/notebooks/workspace
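
To illustrate the difference between the two paths in this suggestion, here is a minimal Python sketch. It assumes the container runtime expects an absolute mountPath, which is an inference from the suggestion rather than a documented rule, and `check_mount_path` is a hypothetical helper.

```python
import posixpath


def check_mount_path(mount_path: str) -> str:
    """Return the path unchanged if it is absolute, otherwise raise ValueError."""
    # Container paths are POSIX paths, so use posixpath regardless of the host OS.
    if not posixpath.isabs(mount_path):
        raise ValueError(f"mountPath {mount_path!r} is relative; expected a leading '/'")
    return mount_path


for candidate in ("home/rapids", "/home/rapids/notebooks/workspace"):
    try:
        print("ok:", check_mount_path(candidate))
    except ValueError as err:
        print("rejected:", err)
```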

Member

@jacobtomlinson jacobtomlinson Jan 17, 2025

I set the workspace to this and it worked.

image

However, if I try to create a new notebook in that directory, it gives me a permissions error.

image

I think this is because the UID needs changing. The rapids user's UID is 1001.

$ id
uid=1001(rapids) gid=1000(conda) groups=1000(conda)
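
A quick way to check the UID theory from inside the container might look like the sketch below. `can_create_file` and `describe_access` are hypothetical helpers, and `WORKSPACE_DIR` is an assumed environment variable used only to pick the directory to inspect.

```python
import os
import tempfile


def can_create_file(directory: str) -> bool:
    """Try to create (and immediately remove) a temporary file in `directory`."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers permission errors as well as a missing directory.
        return False


def describe_access(directory: str) -> str:
    """Summarize the process UID/GID, the directory owner, and write access."""
    st = os.stat(directory)
    return (
        f"process uid={os.getuid()} gid={os.getgid()}; "
        f"{directory} owned by uid={st.st_uid} gid={st.st_gid}; "
        f"writable={can_create_file(directory)}"
    )


if __name__ == "__main__":
    # Inside the container, set WORKSPACE_DIR to the mounted directory,
    # e.g. /home/rapids/notebooks/workspace. Falls back to the temp dir here.
    path = os.environ.get("WORKSPACE_DIR", tempfile.gettempdir())
    print(describe_access(path))
```

If the owner UID of the mounted directory differs from the process UID and `writable` is `False`, that would be consistent with the permissions error above.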

Member

@jacobtomlinson jacobtomlinson left a comment

I managed to test the instructions end-to-end, got a working notebook instance, and ran some cudf code 🎉.

I found a few more things that need tweaking, though, and left a few more comments.

First, we log in to the Snowflake repository with Docker via the terminal:

```bash
docker login <snowflake_registry_hostname> -u <snowflake_user_name>
```
Member

@jacobtomlinson jacobtomlinson Jan 17, 2025

Both the docker login and docker push commands don't work if you have 2FA enabled. Every API call results in a push notification to approve, but if you don't approve within a few seconds it sends another push and the original API call times out. Then your device just gets spammed with retry notifications, but the docker login command has already exited.

I managed to log in by hitting approve on the very first push notification, but then for the docker push you get a notification for every single layer push and it's impossible to approve the first notification for each one.

Turning on token caching may help here, as it should only prompt until the first approval, but I'm not sure it will help with the race condition and retries.

We should probably add a note suggesting enabling token caching or disabling 2FA altogether (which is what I ended up doing). We should also try to report this to Snowflake.

ALTER COMPUTE POOL CONTAINER_HOL_POOL STOP ALL;
ALTER COMPUTE POOL CONTAINER_HOL_POOL SUSPEND;

DROP SERVICE CONTAINER_HOL_DB.PUBLIC.RAPIDS_SNOWPARK_SERVICE;
Member

This line didn't work for me; it just said the service didn't exist. The rest of the commands worked, though, so I think I cleaned up successfully.

Contributor Author

Interesting, this is as suggested by Snowflake. Did your service have a different name or something? I'll give this another try soon and check.

@ncclementi
Contributor Author

I had a good amount of trouble launching things again. I think some of the cleanup instructions, together with a role mix-up, didn't remove everything. But hopefully that's sorted.

Still TODO:

  • Resolve templating of the RAPIDS image in the doc so it points to the latest notebook image, but on Python 3.11 and the amd64 platform.
  • Try to add a file to the workspace; I modified the spec to set uid: 1001.
  • Add a paragraph about the volume mount.
