Comparing results with DeepMind implementation? #17
-
Hello, thanks so much for this repository. It is very valuable to have a PyTorch implementation of "Learning to Simulate", which can really accelerate research progress in this field. I really appreciate this work. I wanted to ask whether you have compared results with the DeepMind implementation you've reproduced. In particular, it seems feasible to take a dataset (e.g. WaterRamps), train the model, and then evaluate the MSE on the test set, either 1-step or over a whole rollout. Have you done this, and if so, have you seen performance comparable to the DeepMind implementation? It would be a useful point of comparison. Thanks a lot.
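For concreteness, here is a rough sketch of the two evaluations I mean. The `model` callable, tensor shapes, and `context_length` argument are made up for illustration and are not this repository's actual API:

```python
import torch

@torch.no_grad()
def one_step_mse(model, positions, context_length):
    """positions: (T, N, D) ground-truth trajectory.
    model: callable mapping a (C, N, D) window of past positions to the (N, D) next frame."""
    errors = []
    for t in range(context_length, positions.shape[0]):
        pred = model(positions[t - context_length:t])        # predict frame t from its true history
        errors.append(((pred - positions[t]) ** 2).mean())
    return torch.stack(errors).mean()

@torch.no_grad()
def rollout_mse(model, positions, context_length):
    """Same model, but after the initial window it is fed its own predictions."""
    window = positions[:context_length].clone()              # seed the rollout with ground truth
    preds = []
    for t in range(context_length, positions.shape[0]):
        pred = model(window)                                  # predict from (partly) predicted history
        preds.append(pred)
        window = torch.cat([window[1:], pred.unsqueeze(0)])   # slide the input window forward
    return ((torch.stack(preds) - positions[context_length:]) ** 2).mean()
```

Averaging these over all test trajectories would give the kind of per-dataset numbers I'd like to compare.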
Replies: 4 comments
-
Hi @arjun-mani, thanks for your interest in our PyTorch implementation. We just finished training for 20 million steps, as done in the DeepMind paper. The MSE during training was in the same range, and the visualizations look reasonable. We will post the comparison of rollout and 1-step MSE soon, and will add some performance comparisons as well. Thanks!
-
Thanks very much! Glad to hear it; I look forward to seeing the comparisons.
-
@arjun-mani After 1 million training steps, the MSE is roughly similar. The training steps are randomized, so we don't expect exactly the same MSE.
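To give a sense of where that randomness comes from, here is an illustrative sketch of random example sampling and random-walk noise on the inputs. The names, shapes, and default noise scale are made up for illustration, not our actual code:

```python
import torch

def sample_training_example(trajectories, context_length, noise_std=1e-3):
    """Pick a random trajectory and time step, then corrupt the input window
    with accumulated (random-walk) noise before computing the training target."""
    traj = trajectories[torch.randint(len(trajectories), (1,)).item()]   # random trajectory
    t = torch.randint(context_length, traj.shape[0], (1,)).item()        # random time step
    window = traj[t - context_length:t].clone()                          # (C, N, D) input positions
    window += torch.randn_like(window).cumsum(dim=0) * noise_std         # random-walk noise
    return window, traj[t]                                               # noisy inputs, true next frame
```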
-
Hi @kks32, thanks so much for sending these results. Is it possible to see the results on the test set after training for 20M steps? I was thinking it might be useful to compare the PyTorch version with Table C.4 in the appendix of the paper (https://arxiv.org/pdf/2002.09405.pdf), just to see how the test MSEs compare, and perhaps to see some qualitative results after 20M steps as well. I'm sure you're busy, and I'm extremely grateful for this work and don't want to impose; I'm just suggesting that these may be useful benchmarks so that people have a point of comparison. Thanks very much!