[Feature Request] add a memory logging manager #3300

sayakpaul · 2024-12-17T02:29:52Z

Logging accelerator and CPU memory during model training is something all practitioners run into. As discussed with @muellerzr over Slack, accelerate should have a utility for that (its location would be here).

peft has a good reference here:
https://github.com/huggingface/peft/blob/ae55fdcc5c4830e0f9fb6e56f16555bafca392de/examples/oft_dreambooth/train_dreambooth.py#L421

Opening it up as a feature request in case anyone's interested in contributing this. This would be a massive help, IMO.

BenjaminBossan · 2024-12-17T13:39:04Z

To clarify, you would like to see memory usage over time, is that right? I wonder how much could be already covered by the profiler integration.

sayakpaul · 2024-12-17T14:29:07Z

> To clarify, you would like to see memory usage over time, is that right?

Yes. In this case, users probably won't care about this level of granularity (as they can always use the PT profiler if they do) but rather aggregated statistics on a few of the important metrics logged by https://github.com/huggingface/peft/blob/ae55fdcc5c4830e0f9fb6e56f16555bafca392de/examples/oft_dreambooth/train_dreambooth.py#L421.
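To make "aggregated statistics" concrete, here is a minimal sketch of what such a utility could look like. It is not the linked peft code and not an existing accelerate API: the class name `MemoryLogger` and the keys in `stats` are hypothetical. It assumes Python 3.9+ (for `tracemalloc.reset_peak`), treats torch as optional, and only counts Python-level allocations on the CPU side, so a real utility would more likely read process RSS.

```python
import tracemalloc

try:  # torch is optional in this sketch
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:
    torch = None
    _HAS_CUDA = False


class MemoryLogger:
    """Hypothetical context manager that records used and peak memory for a block."""

    def __enter__(self):
        # Only start (and later stop) tracing if nobody else already did.
        self._started_tracing = not tracemalloc.is_tracing()
        if self._started_tracing:
            tracemalloc.start()
        tracemalloc.reset_peak()
        self.cpu_begin, _ = tracemalloc.get_traced_memory()
        if _HAS_CUDA:
            torch.cuda.reset_peak_memory_stats()
            self.gpu_begin = torch.cuda.memory_allocated()
        return self

    def __exit__(self, *exc):
        cpu_end, cpu_peak = tracemalloc.get_traced_memory()
        if self._started_tracing:
            tracemalloc.stop()
        # Deltas relative to the state when the block was entered.
        self.stats = {
            "cpu_used_bytes": cpu_end - self.cpu_begin,
            "cpu_peak_bytes": cpu_peak - self.cpu_begin,
        }
        if _HAS_CUDA:
            self.stats["gpu_used_bytes"] = torch.cuda.memory_allocated() - self.gpu_begin
            self.stats["gpu_peak_bytes"] = torch.cuda.max_memory_allocated() - self.gpu_begin
        return False  # never swallow exceptions from the wrapped block


# Usage: wrap one epoch (or any block) and log the aggregated numbers.
with MemoryLogger() as mem:
    batch = [bytes(1000) for _ in range(1000)]  # stand-in for a training step
print(mem.stats)
```

The context-manager shape keeps the training loop untouched apart from one `with` line per region of interest, which fits the "few important metrics" level of detail rather than per-op profiler output.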

BenjaminBossan · 2024-12-17T14:36:55Z

Not sure if the torch profiler can be configured to give the information on that aggregation level. But even if it's possible, it should at least be documented, as many users are probably interested in this higher level view.

sayakpaul · 2024-12-17T14:41:37Z

Yeah if it can be done, I think having it documented even in accelerate should be enough with a minimal end-to-end example.

BenjaminBossan · 2024-12-18T12:09:26Z

Pinging @yhna940 who added the profiler, maybe they have an idea.

sayakpaul added the "feature request" (Request for a new feature to be added to Accelerate) and "contributions-welcome" labels · Dec 17, 2024
