Example of using dice updated
Demirrr committed Oct 16, 2023
1 parent f81a27d commit f7baf63
Showing 2 changed files with 12 additions and 6 deletions.
README.md (6 additions, 0 deletions)
@@ -100,6 +100,12 @@ Models can be easily trained in a single node multi-gpu setting
```bash
dice --accelerator "gpu" --strategy "ddp" --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
```
Similarly, models can be easily trained in a multi-node multi-gpu setting
```bash
torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 0 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.run --trainer torchDDP --dataset_dir KGs/UMLS
torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 1 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.run --trainer torchDDP --dataset_dir KGs/UMLS
```

Train a KGE model by providing the path to a single file and store all parameters under a newly created directory called `KeciFamilyRun`.
...
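For context, the `dice` / `dicee.run` training commands above can also be driven from Python rather than the command line. The following is only a rough sketch, assuming the `Execute`/`Namespace`-style Python interface (`dicee.executer.Execute`, `dicee.config.Namespace`) used in the project's Python examples; attribute names mirror the CLI flags and may differ between dicee versions.
```python
# Hedged sketch: train Keci on KGs/UMLS from Python instead of the dice CLI.
# Assumes dicee's Execute/Namespace interface; attribute names mirror the CLI
# flags shown above and may differ between versions.
from dicee.executer import Execute
from dicee.config import Namespace

args = Namespace()
args.model = "Keci"                 # --model Keci
args.dataset_dir = "KGs/UMLS"       # --dataset_dir "KGs/UMLS"
args.accelerator = "gpu"            # --accelerator "gpu"
args.strategy = "ddp"               # --strategy "ddp"
args.eval_model = "train_val_test"  # --eval_model "train_val_test"

reports = Execute(args).start()     # runs training and evaluation
print(reports)                      # evaluation results
```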
docs/index.rst (6 additions, 6 deletions)
@@ -19,16 +19,16 @@ Welcome to DICE Embeddings!
.. code-block:: bash
// 1 CPU
-(dicee) $ python -m dicee.run --path_dataset_folder KGs/UMLS
+(dicee) $ dice --dataset_dir KGs/UMLS
// 10 CPU
-(dicee) $ python -m dicee.run --path_dataset_folder KGs/UMLS --num_core 10
+(dicee) $ dice --dataset_dir KGs/UMLS --num_core 10
// Distributed Data Parallel (DDP) with all GPUs
-(dicee) $ python -m dicee.run --trainer PL --accelerator gpu --strategy ddp --path_dataset_folder KGs/UMLS
+(dicee) $ dice --trainer PL --accelerator gpu --strategy ddp --dataset_dir KGs/UMLS
// Model Parallel with all GPUs and low precision
-(dicee) $ python -m dicee.run --trainer PL --accelerator gpu --strategy deepspeed_stage_3 --path_dataset_folder KGs/UMLS --precision 16
+(dicee) $ dice --trainer PL --accelerator gpu --strategy deepspeed_stage_3 --dataset_dir KGs/UMLS --precision 16
// DDP with all GPUs on two nodes (felis and nebula):
-(dicee) cdemir@felis $ torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 0 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.main --trainer torchDDP --path_dataset_folder KGs/UMLS
-(dicee) cdemir@nebula $ torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 1 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.main --trainer torchDDP --path_dataset_folder KGs/UMLS
+(dicee) cdemir@felis $ torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 0 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.run --trainer torchDDP --dataset_dir KGs/UMLS
+(dicee) cdemir@nebula $ torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 1 --rdzv_id 455 --rdzv_backend c10d --rdzv_endpoint=nebula -m dicee.run --trainer torchDDP --dataset_dir KGs/UMLS
.. toctree::
:maxdepth: 2
...
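Once any of the runs above finishes, the learned embeddings can be reused from Python. The snippet below is a hedged sketch assuming the `KGE` wrapper that `dicee` exposes for pre-trained models; the run directory path and the entity/relation labels are placeholders, and method names may differ between versions.
```python
# Hedged sketch: load a finished run and score triples with it.
# Assumes dicee exposes a KGE wrapper with triple_score/predict_topk;
# the path and labels below are placeholders for your own run and KG.
from dicee import KGE

model = KGE(path="Experiments/<your_run_folder>")  # directory created by a training run

# Plausibility score of a single (head, relation, tail) triple.
print(model.triple_score(h=["<head_entity>"], r=["<relation>"], t=["<tail_entity>"]))

# Top-3 tail entities predicted for a (head, relation) query.
print(model.predict_topk(h=["<head_entity>"], r=["<relation>"], topk=3))
```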
