Rapids on snowflake deployment #493
I can't seem to find a fix for the failure. It looks like a known issue, sphinx-doc/sphinx#3175, but the suggested fix I found relies on this Sphinx syntax, and when I try it, it doesn't render as expected. Any ideas?
@jacobtomlinson I think this is ready for a first review. That being said, while checking whether the volume mount instructions were working correctly, I realized that the mount won't work unless we have We have two options here.
FROM rapidsai/notebooks:25.02a-cuda11.8-py3.11-amd64
RUN pip install "snowflake-snowpark-python[pandas]" snowflake-connector-python
USER rapids
WORKDIR /home/rapids/workspace TODO
I struggled with some permissions errors when following these instructions. Could you take a look and see what I'm doing wrong?
Also, to answer your question: I wouldn't set the working directory, I would just create an empty directory.
We don't need to create the directory ahead of time. Setting `mountPath: /home/rapids/notebooks/workspace` works for me.
source/platforms/snowflake.md
Outdated
@@ -0,0 +1,343 @@
# Snowflake

You can install RAPIDS on Snowflake via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).
First mention of Snowflake should be a link.
You can install RAPIDS on Snowflake via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).
You can install RAPIDS on [Snowflake](https://www.snowflake.com) via [Snowpark Container Services](https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview).
Create a Dockerfile as follow:

```Dockerfile
FROM rapidsai/notebooks:25.02a-cuda11.8-py3.11-amd64
This will go stale. We usually use templating to ensure it always shows the correct RAPIDS image. The requirements to use Python 3.11 may add a little complexity here.
Yes, the Python 3.11 requirement and the explicit amd64 platform will make the templating a bit more complicated. I'll look into it.
source/platforms/snowflake.md
Outdated
Create a conda/mamba environment with SnowCLI.

```yaml
name: snow-cli
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - snowflake-cli
```
Creating a whole conda environment to install the Snowflake CLI seems specific to a subset of users.
I would suggest just directing folks to the Snowflake CLI install docs and letting them install it in whatever way makes sense to them.
For example, my personal preference would be to use Homebrew on my Mac, which their docs include. Or maybe use `uv tool` to install the Python package.
source/platforms/snowflake.md
Outdated
```{note}
If you don't recall `<ORG>-<ACCOUNT-NAME>` you can obtain them
by running the following in the Snowflake SQL worksheet.
```

```sql
SELECT CURRENT_ORGANIZATION_NAME(); --org
SELECT CURRENT_ACCOUNT_NAME(); --account name
```
I didn't see this note until after I had spent a while trying to figure out where to find this info. Maybe move it up, or mention it in the comment next to `account : <ORG>-<ACCOUNT-NAME> # e.g. MYORGANIZATION-MYACCOUNT`.
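To illustrate how the two SQL results fit together, here is a minimal Python sketch. The helper names `account_identifier` and `registry_hostname` are hypothetical and not part of any Snowflake tooling. Lowercasing the registry host is an assumption based on the lowercase `<org-account>` placeholder in the service spec under review, and account names containing underscores may need extra handling that is not shown.

```python
def account_identifier(org: str, account_name: str) -> str:
    """Join the two SQL results into the <ORG>-<ACCOUNT-NAME> form."""
    return f"{org}-{account_name}"


def registry_hostname(org: str, account_name: str) -> str:
    """Build the image registry host for the account.

    Assumption: the host is the lowercased identifier followed by the
    registry domain that appears in the service spec in this PR.
    """
    identifier = account_identifier(org, account_name).lower()
    return f"{identifier}.registry.snowflakecomputing.com"


# Example values taken from the comment in the connection config.
identifier = account_identifier("MYORGANIZATION", "MYACCOUNT")
host = registry_hostname("MYORGANIZATION", "MYACCOUNT")
print(identifier)
print(host)
```

The first value is what goes in the `account` field, and the second is the host that the `image:` entry and `docker login` would use under the assumptions above.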
source/platforms/snowflake.md
Outdated
image: <org-account>.registry.snowflakecomputing.com/container_hol_db/public/image_repo/rapids-nb-snowflake:dev
volumeMounts:
- name: rapids-notebooks
mountPath: home/rapids
This should be
mountPath: home/rapids
mountPath: /home/rapids/notebooks/workspace
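A quick way to catch this kind of slip before deploying is to check that every `mountPath` is an absolute container path. The sketch below is only illustrative: `is_absolute_mount_path` is a hypothetical helper, and it assumes the missing leading slash is at least part of why the original value failed.

```python
import posixpath


def is_absolute_mount_path(mount_path: str) -> bool:
    """Return True when the container path starts at the filesystem root."""
    return posixpath.isabs(mount_path)


# The value from the diff and the suggested replacement.
original = "home/rapids"
suggested = "/home/rapids/notebooks/workspace"

for path in (original, suggested):
    status = "ok" if is_absolute_mount_path(path) else "relative path, likely a mistake"
    print(f"{path}: {status}")
```

`posixpath` is used rather than `os.path` so the check behaves the same on Windows, since the path refers to a Linux container.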
I managed to test the instructions end-to-end and I managed to get a working notebook instance and ran some cudf code 🎉.
I found a few more things that need tweaking though and left a few more comments.
First we login into the snowflake repository with docker, via terminal:

```bash
docker login <snowflake_registry_hostname> -u <snowflake_user_name>
Both the `docker login` and `docker push` commands don't work if you have 2FA enabled. Every API call results in a push notification to approve, but if you don't approve within a few seconds it sends another push and the original API call times out. Then your device just gets spammed with retry notifications, but the `docker login` command has already exited.
I managed to log in by hitting approve on the very first push notification, but then for the `docker push` you get a notification for every single layer push, and it's impossible to approve the first notification for each one.
Turning on token caching may help here, as it should only prompt until the first approval, but I'm not sure it will help with the race condition and retries.
We should probably add a note suggesting enabling token caching or disabling 2FA altogether (which is what I ended up doing). We should also try to report this to Snowflake.
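For reference, the sequence of registry calls being discussed can be sketched as below. This only builds the command lines and does not run them. `docker_commands` is a hypothetical helper, the `docker tag` step is an assumption about how the local image gets its registry name, and the local image name is made up for illustration.

```python
def docker_commands(registry_host: str, user: str, local_image: str, remote_path: str) -> list[list[str]]:
    """Return the docker invocations, in order, as argument lists."""
    remote_image = f"{registry_host}/{remote_path}"
    return [
        # Login uses the form shown in the docs under review.
        ["docker", "login", registry_host, "-u", user],
        # Assumption: the locally built image is tagged with the registry path.
        ["docker", "tag", local_image, remote_image],
        # The push uploads each layer separately, which is where the
        # repeated 2FA prompts described above come from.
        ["docker", "push", remote_image],
    ]


commands = docker_commands(
    registry_host="<snowflake_registry_hostname>",
    user="<snowflake_user_name>",
    local_image="rapids-nb-snowflake:dev",  # hypothetical local tag
    remote_path="container_hol_db/public/image_repo/rapids-nb-snowflake:dev",
)
for argv in commands:
    print(" ".join(argv))
```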
ALTER COMPUTE POOL CONTAINER_HOL_POOL STOP ALL;
ALTER COMPUTE POOL CONTAINER_HOL_POOL SUSPEND;

DROP SERVICE CONTAINER_HOL_DB.PUBLIC.RAPIDS_SNOWPARK_SERVICE;
This line didn't work for me, it just said the service didn't exist, but the rest of the commands did so I think I cleaned up successfully.
Interesting, this is as suggested by Snowflake. Did your service have a different name or something? I'll give this another try soon and check.
I had a good amount of trouble launching things again. I think some of the cleanup and role instructions got mixed up and didn't remove everything, but hopefully that's sorted. Still TODO:
Closes the deployment part of #419
For the "Show reading data from Snowflake into a cudf dataframe" part, I created issue #494 for tracking purposes.
This is WIP but it's almost ready for review; it needs a few more sections with screenshots. TODO:
$
. Any suggestions @jacobtomlinson ?