Comparing results with DeepMind implementation? #17
-
Hello, thanks so much for this repository. It is very valuable to have a PyTorch implementation of "Learning to Simulate", which can really accelerate research progress in this field. I really appreciate this work. I wanted to ask whether you have compared results with the DeepMind implementation you've reproduced. In particular, it seems feasible to take a dataset (e.g. WaterRamps), train the model, and then evaluate the MSE on the test set, either 1-step or over a whole rollout. Have you done this, and if so, have you seen performance comparable to the DeepMind implementation? It would be a useful point of comparison. Thanks a lot.
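For concreteness, here is a rough sketch of the two evaluations I mean. The `model` callable, tensor shapes, and `context_length` argument are made up for illustration and are not this repository's actual API:

```python
import torch

@torch.no_grad()
def one_step_mse(model, positions, context_length):
    """positions: (T, N, D) ground-truth trajectory.
    model: callable mapping a (C, N, D) window of past positions to the (N, D) next frame."""
    errors = []
    for t in range(context_length, positions.shape[0]):
        pred = model(positions[t - context_length:t])        # predict frame t from its true history
        errors.append(((pred - positions[t]) ** 2).mean())
    return torch.stack(errors).mean()

@torch.no_grad()
def rollout_mse(model, positions, context_length):
    """Same model, but after the initial window it is fed its own predictions."""
    window = positions[:context_length].clone()              # seed the rollout with ground truth
    preds = []
    for t in range(context_length, positions.shape[0]):
        pred = model(window)                                  # predict from (partly) predicted history
        preds.append(pred)
        window = torch.cat([window[1:], pred.unsqueeze(0)])   # slide the input window forward
    return ((torch.stack(preds) - positions[context_length:]) ** 2).mean()
```

Averaging these over all test trajectories would give the kind of per-dataset numbers I'd like to compare.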
Replies: 4 comments
-
Hi @arjun-mani, thanks for your interest in our PyTorch implementation. We just finished training for 20 million steps, as done in the DeepMind paper. The MSE during training was in the same range, and the visualizations look reasonable. We will post the comparison of rollout and 1-step MSE soon, and will add some performance comparisons as well. Thanks!
-
Thanks very much! Glad to hear it; I look forward to seeing the comparisons.
-
@arjun-mani After 1 million training steps, the MSE is roughly similar. The training steps are randomized, so we don't expect exactly the same MSE.
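To give a sense of where that randomness comes from, here is an illustrative sketch of random example sampling and random-walk noise on the inputs. The names, shapes, and default noise scale are made up for illustration, not our actual code:

```python
import torch

def sample_training_example(trajectories, context_length, noise_std=1e-3):
    """Pick a random trajectory and time step, then corrupt the input window
    with accumulated (random-walk) noise before computing the training target."""
    traj = trajectories[torch.randint(len(trajectories), (1,)).item()]   # random trajectory
    t = torch.randint(context_length, traj.shape[0], (1,)).item()        # random time step
    window = traj[t - context_length:t].clone()                          # (C, N, D) input positions
    window += torch.randn_like(window).cumsum(dim=0) * noise_std         # random-walk noise
    return window, traj[t]                                               # noisy inputs, true next frame
```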
-
Hi @kks32, thanks so much for sending these results. Is it possible to see the results on the test set after training for 20M steps? I was thinking it might be useful to compare the PyTorch version with Table C.4 in the appendix of the paper (https://arxiv.org/pdf/2002.09405.pdf), just to see how the test MSEs compare, and perhaps to see some qualitative results after 20M steps as well. I'm sure you're busy, and I'm extremely grateful for this work and don't want to impose; I'm just suggesting that these may be useful benchmarks so that people have a point of comparison. Thanks very much!