Skip to content

Commit

Permalink
add more info about Spider and JupyterLab
Browse files Browse the repository at this point in the history
  • Loading branch information
annefou committed Jun 27, 2024
1 parent 25026cb commit 4f21d6f
Show file tree
Hide file tree
Showing 3 changed files with 150 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/_config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ only_build_toc_files: true
execute:
exclude_patterns:
- '*minio*'
- '*dask_spider*'
execute_notebooks: auto
timeout: 1000

Expand Down
3 changes: 3 additions & 0 deletions docs/pangeo/dask_spider.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@
"### Authors\n",
"- Tina Odaka, Ifremer (France), [@tinaok](https://github.com/tinaok)\n",
"- Anne Fouilloux, Simula Research Laboratory (Norway), [@annefou](https://github.com/annefou)\n",
"- Francesco Nattino, Netherlands eScience Center (Netherlands), [@fnattino](https://github.com/fnattino)\n",
"- Meiert W. Grootes, Netherlands eScience Center (Netherlands), [@meiertgrootes](https://github.com/meiertgrootes)\n",
"- Ou Ku, Netherlands eScience Center (Netherlands), [@rogerkuou](https://github.com/rogerkuou)\n",
"\n",
"### Contributors\n",
"- Pier Lorenzo Marasco, Ispra (Italy), [@pl-marasco](https://github.com/pl-marasco)\n",
Expand Down
147 changes: 146 additions & 1 deletion docs/pangeo/start_dask_spider.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,33 @@
# Scaling on HPC/HTC

- Anne Fouilloux, Simula Research Laboratory (Norway), [@annefou](https://github.com/annefou)
- Francesco Nattino, Netherlands eScience Center (Netherlands), [@fnattino](https://github.com/fnattino)
- Meiert W. Grootes, Netherlands eScience Center (Netherlands), [@meiertgrootes](https://github.com/meiertgrootes)
- Ou Ku, Netherlands eScience Center (Netherlands), [@rogerkuou](https://github.com/rogerkuou)
- Tina Odaka, Ifremer (France), [@tinaok](https://github.com/tinaok)

<div class="alert alert-info">
<i class="fa-check-circle fa" style="font-size: 22px;color:#666;"></i> <b>Note</b>
<br>
<ul>
<li>This episode has been developed in collaboration with SURF and the eScience Center (Nederlands) and is based on the work done by the eScience Center. Their documentation can be found at <a href="https://github.com/RS-DAT/JupyterDaskOnSLURM/blob/main/user-guide.md#container-wrapper-for-spider-system">https://github.com/RS-DAT/JupyterDaskOnSLURM/blob/main/user-guide.md#container-wrapper-for-spider-system</a>. The scripts we used in this tutorial are also developed by the eScience Center (Francesco Nattino, Meiert W. Grootes, and Ou Ku), and copied on the shared project folder during the Geo Open Hack Pangeo tutorial. If you are interested in using this approach at SURF for your own work, we suggest you follow the <a href="https://github.com/RS-DAT/JupyterDaskOnSLURM/blob/main/user-guide.md#container-wrapper-for-spider-system">documentation</a>. </li>
</ul>
</div>
In this section, we will show how to move from the cloud to HPC/HTC to scale.

For this section, you need to have an account on [Spider](https://doc.spider.surfsara.nl/en/latest/Pages/about.html) from [SURF](https://www.surf.nl/).
We are using [Apptainer](https://apptainer.org) and for the training, we will be using an image with less Python packages (sufficient for executing all the notebooks from the tutorial).

## Container wrapper for Spider system

On Spider, using conda environments will lead to performance issues, due to
conda's nature of many small files. In such cases, one can containerize the
conda environment. One way to do this is to use the
[hpc-container-wrapper](https://github.com/CSCfi/hpc-container-wrapper) tool.
This is a container wrapper tool developed by Finnish IT center for science
(CSC).


## Launch JupyterLab and use dask on Spider

### Set up
Expand Down Expand Up @@ -54,9 +77,131 @@ ssh -i /path/to/private/ssh/key -N -L 8889:wn-ca-03:9300 geocourse-teacher09@spi
Copy/paste the command given in your slurm output but update the path to the ssh key you are using to login to spider (e.g. `/home/annef/.ssh/id_rsa`).

- If you copy the command above, make sure to change the username `geocourse-teacher09` to your username on spider.
- Open your browser and paste `http://localhost:8889/` tor get your JupyterLab session.

### Shutting down

From the Dask tab in the Jupyter interface, click "shutdown" on a running cluster instance to kill all workers and the scheduler (a new cluster based on the default configurations can be re-created by pressing the "+" button).

From the Jupyter interface, select "File > Shutdown" to stop the Jupyter server and release resources.

If the job running the Jupyter server and the Dask scheduler is killed, the Dask workers will also be killed shortly after (configure this using the `death-timeout` key in the config file).

However, you can use `squeue - u $USER` command to check all your jobs (including all the jobs related to dask) are stopped. Cancel any remaining jobs!

```
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
13091627 normal dask-wor geocours R 1:37 1 wn-ca-09
13091628 normal dask-wor geocours R 1:37 1 wn-dc-12
13091629 normal dask-wor geocours R 1:37 1 wn-ca-07
13091630 normal dask-wor geocours R 1:37 1 wn-ca-07
```



## Create your own image

TODO
Follow these steps if you want to know more and create your own image.

To set up the container wrapper, first log in to Spider. Then, clone the `JupyterDaskOnSLURM` repository:

```shell
git clone http://github.com/RS-DAT/JupyterDaskOnSLURM.git
```

Then, clone both the `hpc-container-wrapper` repository:

```shell
git clone https://github.com/CSCfi/hpc-container-wrapper.git
```

Then, copy the container config file `spider.yaml` file from the
`JupyterDaskOnSLURM` to the `.config` file in `hpc-container-wrapper`:

```shell
cp ./JupyterDaskOnSLURM/config/container/spider.yaml ./hpc-container-wrapper/configs/
```

Change to the `hpc-container-wrapper` directory and run the
`install.sh` script to install the container wrapper:

```shell
cd hpc-container-wrapper
bash install.sh spider
```

Next, copy the `environment.yaml` file from the `JupyterDaskOnSLURM`
to the current directory and create a container. In the following example, we
create a container under `jupyter_dask` directory:

```shell
mkdir -p ./jupyter_dask
cp ../JupyterDaskOnSLURM/environment.yaml .
bin/conda-containerize new --prefix ./jupyter_dask ./environment.yaml
```

At the end of the installation, the tool will print the path to the executable
directory (`bin` directory) of the container. For example:

```output
export PATH="/absolute/path/to/the/container/bin:$PATH"
```
```shell
cd ..
mkdir -p ~/.config/dask
cp JupyterDaskOnSLURM/config/dask/config_spider.yml ~/.config/dask/config.yml
```

Then add the following lines to the `~/.config/dask/config.yml` file, under the
`slurm` section of `jobqueue` section, note that you need to replace the `export
PATH` part with the output from the container creation step:

```yaml
job_script_prologue:
- 'export PATH="/absolute/path/to/the/container/bin:$PATH"' # Export environment path to
python: python
```
After adding the lines, the `~/.config/dask/config.yml` file should look like this:

```yaml
distributed:
... Some other configurations ...
labextension:
... Some other configurations ...
jobqueue:
slurm:
... Some other configurations ...
job_script_prologue:
- 'export PATH="/home/caroline-oku/caroline/Public/demo_mobyle/container_wrapper/hpc-container-wrapper/tmp/bin:$PATH"'
python: python
```

Then also configure the SLURM job file
`JupyterDaskOnSLURM/scripts/jupyter_dask_spider_container.bsh`. Then replace the
following part with the PATH exportaion from the container creation step:

```shell
# CHANGE THIS TO THE ABSOLUTE PATH TO THE CONTAINER BIN
export PATH="/absolute/path/to/the/container/bin:$PATH"
```

Now you have reached the exit point of the deployment script! The Jupyter Server
with Dask plugin can now be started using the
`jupyter_dask_spider_container.bsh` script.

```shell
sbatch JupyterDaskOnSLURM/scripts/jupyter_dask_spider_container.bsh
```

After the job starts, there will be an example `ssh` command printed in the job stdout (file `slurm-<JOB_ID>.out`). It should look like:

```shell
ssh -i /path/to/private/ssh/key -N -L 8889:NODE:8888 [email protected]
```

You can execute this command in a new terminal window **on your local machine**
(modify the path to the private key). You can now access the Jupyter session
from your browser at `localhost:8889`.


0 comments on commit 4f21d6f

Please sign in to comment.