Multiple GPUs #726

Open
gitkol opened this issue Mar 19, 2024 · 10 comments
Comments

@gitkol

gitkol commented Mar 19, 2024

Hi,

Can rest.py run on multiple GPUs?

Thanks,

Istvan

@ijpulidos
Contributor

Hello! What do you mean by rest.py? Can you be more specific as to what your issue is?

@gitkol
Author

gitkol commented Mar 20, 2024 via email

@gitkol
Author

gitkol commented Mar 20, 2024 via email

@xiaowei-xie2

I have the same question. Following issue #648, I am able to run multiple solute tempering REMD simulations in parallel with mpirun, but I don't know how to distribute the replicas of one simulation across multiple GPUs so that they all contribute to the same REMD run.

@gitkol
Author

gitkol commented May 20, 2024 via email

@xiaowei-xie2

Hi @gitkol, I think I figured it out. Do you have mpi4py installed correctly? What I found is that without mpi4py, mpirun runs multiple copies of the same REMD simulation (each GPU runs a whole REMD, with several GPUs busy at the same time), whereas with mpi4py installed, multiple GPUs contribute to a single REMD simulation. Here is an example of job files that worked for me, in case it's helpful. On my system, using 4 GPUs gave a 2x speedup over 1 GPU (not 4x).
test_rest_14.tar.gz
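
One way to check the mpi4py point above is to have every launched process report what it can see. The following is a minimal sketch, not part of the project or of the attached job files; `describe_mpi_environment` is a hypothetical helper name. It only uses `MPI.COMM_WORLD`, `Get_rank()` and `Get_size()` from mpi4py, and falls back to a single-process answer when mpi4py cannot be imported.

```python
def describe_mpi_environment():
    """Report whether mpi4py imports and, if so, this process's rank and world size."""
    try:
        # Importing mpi4py.MPI initializes MPI by default
        from mpi4py import MPI
    except ImportError:
        # No usable mpi4py: this process cannot see any peers
        return {"mpi4py_available": False, "rank": 0, "size": 1}
    comm = MPI.COMM_WORLD
    return {
        "mpi4py_available": True,
        "rank": comm.Get_rank(),
        "size": comm.Get_size(),
    }


if __name__ == "__main__":
    print(describe_mpi_environment())
```

Saved as, say, `check_mpi.py` (a hypothetical file name) and launched with `mpirun -np 4 python check_mpi.py`, four lines that all report `size` 1 suggest the processes are not communicating, which would be consistent with each one running its own full REMD.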

@ijpulidos
Contributor

@xiaowei-xie2 is correct, having mpi4py is important in this case. Thank you for providing a test script that we can use to reproduce your results.

There's always some part of the code that cannot be fully parallelized, for example the communication between the different GPUs. It would be interesting to do some profiling to check where the overhead is. Thanks!
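
As a rough illustration of the point about non-parallelizable code, Amdahl's law can be inverted to estimate what fraction of the run behaves as parallel, given the roughly 2x speedup on 4 GPUs reported above. This is a back-of-the-envelope sketch under the assumption that the serial fraction is fixed; it is not how the maintainers analyzed the code, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Ideal speedup under Amdahl's law for a given parallelizable fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)


def parallel_fraction_from_speedup(speedup, n_workers):
    """Invert Amdahl's law: the parallel fraction implied by an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_workers)


# Observation reported in this thread: about 2x on 4 GPUs
p = parallel_fraction_from_speedup(2.0, 4)
print(f"implied parallel fraction: {p:.2f}")
print(f"projected speedup on 8 GPUs: {amdahl_speedup(p, 8):.2f}")
```

Under this simplified model the implied parallel fraction is about two thirds, so adding more GPUs would give diminishing returns. Communication costs that grow with the number of GPUs are not captured here, so this does not replace actual profiling.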

@xiaowei-xie2

Hi @ijpulidos, thank you for the insight. Yes, I totally understand that using n GPUs won't necessarily give an n-fold speedup (sometimes no speedup at all), so I am actually satisfied with the current performance. But yes, it would be nice to see where the overhead is!

I am also curious: does the current repo support parallelizing across multiple GPUs on multiple nodes?

@ijpulidos
Contributor

@xiaowei-xie2 It does support that, since everything is handled by the MPI environment. That also means it's highly dependent on the MPI setup of the system. Depending on the connectivity of your HPC system and the system being simulated it might make sense, or not, to do this.

We should try to come up with an example of how to do this that people can use, and add it to the docs.

@jakublala

Thanks for the comments here, @ijpulidos - following up on the above, is it possible to assign each replica to its own GPU somehow?

It seems to me that this setup should provide the best performance. Or is that what already happens by default under the hood?
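
Whether the library already does this by default is not settled in this thread. Independently of that, a common generic pattern for MPI jobs is to pin each process to one GPU by setting `CUDA_VISIBLE_DEVICES` from the launcher's local-rank variable before any CUDA context is created. The sketch below assumes GPUs are numbered from 0 on every node and that the scheduler has not already set its own device mask; `local_rank` and `pin_process_to_gpu` are hypothetical helper names, not part of the project.

```python
import os

# Local-rank variables set by common launchers (Open MPI, MVAPICH2, Slurm)
LOCAL_RANK_VARS = (
    "OMPI_COMM_WORLD_LOCAL_RANK",
    "MV2_COMM_WORLD_LOCAL_RANK",
    "SLURM_LOCALID",
)


def local_rank(env=None):
    """Return this process's rank within its node, or 0 if no launcher variable is set."""
    env = os.environ if env is None else env
    for name in LOCAL_RANK_VARS:
        if name in env:
            return int(env[name])
    return 0


def pin_process_to_gpu(gpus_per_node, env=None):
    """Hypothetical helper: expose a single GPU to this process, chosen round-robin."""
    env = os.environ if env is None else env
    device = local_rank(env) % gpus_per_node
    # Must be set before the process initializes CUDA to have any effect
    env["CUDA_VISIBLE_DEVICES"] = str(device)
    return device
```

Calling `pin_process_to_gpu(4)` at the very top of the job script, before the simulation libraries are imported, would give each of four processes per node a different device. With more processes than GPUs on a node, the modulo makes several processes share a device.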

4 participants