We observed noticeable variability when re-running the FSDP model training script for a small 1.xB llama2 model with fixed seed(s) and the same tokens. Below is a snapshot of the evaluation results for three models trained with identical inputs (tokens, training script, seed(s)). Could you please help us investigate the root cause of this variability (data loader, hardware non-determinism, or other variables)? Thanks in advance!
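For reference, below is a minimal sketch of the determinism knobs we typically audit when chasing run-to-run variability in a PyTorch training run. This is not the repo's own setup, just a generic checklist: the seed value and the commented-out DataLoader arguments are placeholders, since we don't know the actual data pipeline; the torch calls and the `CUBLAS_WORKSPACE_CONFIG` env var are standard PyTorch.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy, and torch (CPU + all CUDA devices)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


def enable_determinism() -> None:
    """Force deterministic kernels where PyTorch supports them."""
    # Required by cuBLAS for deterministic GEMMs on CUDA >= 10.2.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # warn_only=True logs (rather than errors) when an op has no
    # deterministic implementation, so training still proceeds.
    torch.use_deterministic_algorithms(True, warn_only=True)


def seed_worker(worker_id: int) -> None:
    """Make DataLoader worker RNG state reproducible across runs."""
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


if __name__ == "__main__":
    seed_everything(1234)  # placeholder seed
    enable_determinism()
    # The generator pins the shuffle order; worker_init_fn pins per-worker RNG.
    g = torch.Generator()
    g.manual_seed(1234)
    # loader = torch.utils.data.DataLoader(
    #     dataset, batch_size=8, shuffle=True,
    #     worker_init_fn=seed_worker, generator=g,
    # )
```

Even with all of the above, some variability can remain from GPU kernels without deterministic implementations or from floating-point reduction order in distributed collectives, so this is a starting checklist rather than a guarantee.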
Yes, the results above were from 3 runs of the same YAML file (i.e., the same model config, dataset, training params, random seed, etc.), with only the experiment_id changed. The general setting is: