add Ex2
Konthee committed Feb 15, 2024
1 parent 25f09e1 commit 27a3df3
Showing 1 changed file with 19 additions and 1 deletion.
20 changes: 19 additions & 1 deletion experiment_effect_of_dpo/README.md
@@ -1 +1,19 @@
## Experiment 1
## Train DPO

```bash
bash 002-001-dpo-temp-0_3-v-all-ref.sh
```

### Configuration

- BASE_MODEL: Model name used when saving.
- DATA_PATH: Path to the dataset.
- EPOCH: Number of training epochs.
- LR: Learning rate; 2e-5 for full fine-tuning and 2e-4 for LoRA.
- GRADIENT_ACCUMULATION_STEPS: Number of gradient accumulation steps.
- MAX_LEN: Maximum training sequence length.
- MAX_PROMPT_LEN: Maximum training prompt length.
- MICRO_BSZ: Batch size per step.
- VAL_SIZE: Size of the validation split.
- WANDB_NAME: Weights & Biases (wandb) project name.
- WARMUP_STEPS: Number of warmup steps for the learning rate scheduler.
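
As a rough sketch of how some of these settings interact, the snippet below picks `LR` according to the fine-tuning mode and derives the number of samples per optimizer update. The values are placeholders, `USE_LORA` and `EFFECTIVE_BSZ` are hypothetical names that do not come from the training script, and it assumes `MICRO_BSZ` is a per-device batch size.

```bash
#!/usr/bin/env bash
# Illustrative values only; the training script's real defaults are not shown here.
MICRO_BSZ=4                      # batch size per step (assumed value)
GRADIENT_ACCUMULATION_STEPS=8    # accumulation steps (assumed value)
USE_LORA=1                       # hypothetical switch, not a variable from the script

# Learning rate per the list above: 2e-4 for LoRA, 2e-5 for full fine-tuning.
if [ "$USE_LORA" -eq 1 ]; then
  LR=2e-4
else
  LR=2e-5
fi

# Samples that contribute to one optimizer update on a single device.
EFFECTIVE_BSZ=$((MICRO_BSZ * GRADIENT_ACCUMULATION_STEPS))
echo "LR=$LR effective batch size per device=$EFFECTIVE_BSZ"
```

If GPU memory is tight, lowering `MICRO_BSZ` while raising `GRADIENT_ACCUMULATION_STEPS` by the same factor keeps this product unchanged.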
