[Feature Request] add a memory logging manager #3300

Open
sayakpaul opened this issue Dec 17, 2024 · 5 comments
Labels
contributions-welcome, feature request (Request for a new feature to be added to Accelerate)

Comments

@sayakpaul
Member

Logging accelerator and CPU memory while training models is something nearly all practitioners run into. As discussed with @muellerzr over Slack, accelerate should have a utility for that (its location would be here).

peft has a good reference here:
https://github.com/huggingface/peft/blob/ae55fdcc5c4830e0f9fb6e56f16555bafca392de/examples/oft_dreambooth/train_dreambooth.py#L421

Opening it up as a feature request in case anyone's interested in contributing this. This would be a massive help, IMO.
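For anyone picking this up, here is a minimal sketch of what such a utility could look like as a context manager. It is not the peft implementation and not an existing accelerate API: the class name `MemoryTracker` and its attributes are hypothetical. To stay dependency-free it uses the stdlib `tracemalloc` for the CPU side, which only sees allocations made through Python's allocator (not process RSS), and it reads CUDA counters only when `torch` is installed and a GPU is available.

```python
import gc
import tracemalloc

try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:  # torch is optional for this sketch
    torch = None
    _HAS_CUDA = False


class MemoryTracker:
    """Hypothetical helper (not part of accelerate): memory delta and peak for a block."""

    def __enter__(self):
        gc.collect()
        # Only start/stop tracing if nobody else is already tracing.
        self._started_tracing = not tracemalloc.is_tracing()
        if self._started_tracing:
            tracemalloc.start()
        tracemalloc.reset_peak()  # requires Python 3.9+
        self.cpu_begin, _ = tracemalloc.get_traced_memory()
        if _HAS_CUDA:
            torch.cuda.empty_cache()
            torch.cuda.reset_peak_memory_stats()
            self.gpu_begin = torch.cuda.memory_allocated()
        return self

    def __exit__(self, *exc):
        cpu_end, cpu_peak = tracemalloc.get_traced_memory()
        if self._started_tracing:
            tracemalloc.stop()
        # All values are in bytes, relative to the start of the block.
        self.cpu_used = cpu_end - self.cpu_begin
        self.cpu_peaked = cpu_peak - self.cpu_begin
        if _HAS_CUDA:
            torch.cuda.synchronize()
            self.gpu_used = torch.cuda.memory_allocated() - self.gpu_begin
            self.gpu_peaked = torch.cuda.max_memory_allocated() - self.gpu_begin
        else:
            self.gpu_used = self.gpu_peaked = 0
        return False  # never swallow exceptions from the block


with MemoryTracker() as tracker:
    data = [bytes(1000) for _ in range(1000)]  # stand-in for a training step

print(f"CPU used: {tracker.cpu_used} B, CPU peak: {tracker.cpu_peaked} B")
print(f"GPU used: {tracker.gpu_used} B, GPU peak: {tracker.gpu_peaked} B")
```

A real utility would probably want process-level CPU numbers (for example via `psutil`) and support for non-CUDA accelerators; both are left out here to keep the sketch small.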

@sayakpaul added the feature request and contributions-welcome labels on Dec 17, 2024

@BenjaminBossan
Member

To clarify, you would like to see memory usage over time, is that right? I wonder how much could be already covered by the profiler integration.

@sayakpaul
Member Author

> To clarify, you would like to see memory usage over time, is that right?

Yes. In this case, users probably won't need profiler-level granularity (they can always use the PyTorch profiler if they do). What they'd want instead is aggregated statistics on a few of the important metrics, like the ones logged by https://github.com/huggingface/peft/blob/ae55fdcc5c4830e0f9fb6e56f16555bafca392de/examples/oft_dreambooth/train_dreambooth.py#L421.
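To make "aggregated statistics" concrete, here is a rough sketch that collapses per-step memory readings into a few headline numbers. The function name `summarize_memory` and the choice of peak, mean, and last value are assumptions for illustration, not something taken from peft or accelerate.

```python
from statistics import mean


def summarize_memory(samples):
    """Hypothetical helper: reduce per-step byte counts to a few summary values in MiB."""
    if not samples:
        raise ValueError("need at least one sample")
    mib = 2**20
    return {
        "peak_mib": max(samples) / mib,
        "mean_mib": mean(samples) / mib,
        "last_mib": samples[-1] / mib,
    }


# Made-up readings (bytes) for three training steps.
summary = summarize_memory([512 * 2**20, 768 * 2**20, 640 * 2**20])
print(summary)
```

The per-step readings could come from whatever counter is being tracked, for example the value of `torch.cuda.max_memory_allocated()` recorded after each step.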

@BenjaminBossan
Member

Not sure if the torch profiler can be configured to report information at that level of aggregation. But even if it's possible, it should at least be documented, as many users are probably interested in this higher-level view.

@sayakpaul
Member Author

sayakpaul commented Dec 17, 2024

Yeah, if it can be done, I think documenting it in accelerate with a minimal end-to-end example should be enough.

@BenjaminBossan
Member

Pinging @yhna940 who added the profiler, maybe they have an idea.

2 participants