
High memory allocation in prometheus.MakeLabelPairs #1702

Open
jkroepke opened this issue Dec 25, 2024 · 2 comments

Comments

@jkroepke (Contributor)

While investigating a memory leak in windows_exporter, I found that there is a considerable memory allocation rate in prometheus.MakeLabelPairs.

windows_exporter (like node_exporter) generates a large number of const metrics on each scrape. The process collector alone exposes around 6200 metrics per call, resulting in 6200 MustNewConstMetric calls on every scrape.
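To illustrate the pattern (this is a made-up sketch, not windows_exporter's actual code; the collector and metric names are invented): every MustNewConstMetric call in Collect goes through MakeLabelPairs and allocates fresh label pairs.

```go
package main

import "github.com/prometheus/client_golang/prometheus"

// illustrativeCollector mimics the const-metric pattern described above.
type illustrativeCollector struct {
	desc *prometheus.Desc
}

func (c *illustrativeCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

func (c *illustrativeCollector) Collect(ch chan<- prometheus.Metric) {
	// In windows_exporter a loop like this runs a few thousand times per scrape;
	// each MustNewConstMetric call builds a new []*dto.LabelPair internally.
	for i := 0; i < 6200; i++ {
		ch <- prometheus.MustNewConstMetric(
			c.desc, prometheus.GaugeValue, float64(i),
			"some_process.exe", "running", // label values turned into new label pairs every time
		)
	}
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(&illustrativeCollector{
		desc: prometheus.NewDesc(
			"demo_process_info",
			"Illustrative per-process metric.",
			[]string{"process", "state"}, nil,
		),
	})
	// Each Gather (i.e. each scrape) repeats all of the allocations above.
	_, _ = reg.Gather()
}
```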

I am asking myself whether this can be optimized somehow. The heap dump was generated after one day of runtime with a 5 second scrape interval (to simulate a long-running scenario).

The dto.LabelPair struct contains other structs related to protobuf. I am not sure whether they are needed; Prometheus 2.0 does not support protobuf metrics scraping, so it feels like an expensive leftover.
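For context, dto.LabelPair comes from the generated protobuf code in prometheus/client_model, so each instance carries the protobuf runtime's bookkeeping fields on top of the Name/Value pointers. A quick way to check the per-object footprint (the exact sizes depend on the client_model and protobuf-go versions in use):

```go
package main

import (
	"fmt"
	"unsafe"

	dto "github.com/prometheus/client_model/go"
)

func main() {
	// The generated structs carry protobuf runtime bookkeeping (message state,
	// size cache, unknown fields) in addition to the exported Name/Value fields.
	var lp dto.LabelPair
	var m dto.Metric
	fmt.Println("sizeof dto.LabelPair:", unsafe.Sizeof(lp), "bytes")
	fmt.Println("sizeof dto.Metric:   ", unsafe.Sizeof(m), "bytes")
}
```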


Flame Graph: (screenshot attached in the original issue)

Source View: (screenshot attached in the original issue)

Heap Dump via pprof: memory_profile_windows.pb.gz

dosubot bot commented Dec 25, 2024

Hey @jkroepke! I'm here to help you with bugs, questions, and becoming a contributor. Let's tackle this issue together!

The MakeLabelPairs function in the Prometheus client_golang library is indeed a point of interest when it comes to memory allocation, especially in scenarios like the one you described with the windows_exporter. This function constructs label pairs for metrics, which can lead to significant memory usage when a large number of metrics are generated, such as the 6200 MustNewConstMetric calls per scrape in your case.

The function preallocates a slice for the label pairs to minimize allocations during appends, but the creation of dto.LabelPair instances, which carry protobuf-related fields, can still contribute significantly to memory usage [1]. Since Prometheus 2.0 does not support protobuf metric scraping, these structures might seem like unnecessary overhead.
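To make the cost concrete, here is a simplified, self-contained sketch of the kind of work done per const metric (illustrative code only, not the library's actual MakeLabelPairs implementation): one fresh slice plus one dto.LabelPair and two *string values per variable label, repeated for every NewConstMetric call.

```go
package main

import (
	"sort"

	dto "github.com/prometheus/client_model/go"
	"google.golang.org/protobuf/proto"
)

// buildLabelPairs is an illustrative stand-in for the per-metric work: a fresh
// slice plus a fresh *dto.LabelPair (and two *string values) per label.
func buildLabelPairs(names, values []string) []*dto.LabelPair {
	pairs := make([]*dto.LabelPair, 0, len(names)) // one slice allocation per metric
	for i, n := range names {
		pairs = append(pairs, &dto.LabelPair{ // one heap object per label
			Name:  proto.String(n),
			Value: proto.String(values[i]),
		})
	}
	sort.Slice(pairs, func(i, j int) bool { return pairs[i].GetName() < pairs[j].GetName() })
	return pairs
}

func main() {
	// With ~6200 metrics per scrape, this pattern repeats thousands of times each scrape.
	_ = buildLabelPairs([]string{"process", "state"}, []string{"explorer.exe", "running"})
}
```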

To optimize memory usage, consider the following approaches:

  1. Reduce Metric Volume: If possible, reduce the number of metrics being generated or exposed. This might involve aggregating metrics or selectively disabling certain metrics that are not critical.

  2. Optimize Label Usage: Minimize the number of labels and their cardinality. High cardinality can lead to increased memory usage due to the large number of unique label combinations.

  3. Custom Implementation: If the protobuf structures are indeed unnecessary for your use case, consider implementing a custom version of MakeLabelPairs that avoids using these structures, though this would require maintaining a fork or custom build of the library.

  4. Review Scrape Interval: Adjust the scrape interval to balance between data granularity and resource usage.

These strategies can help mitigate the memory allocation issues you're experiencing.


@bwplotka (Member)

Yup, that's a known problem for Const* metric users. A great showcase is cadvisor, which is also impacted by this. kube-state-metrics got rid of client_golang for its scrape endpoints for this reason.

At some point I tried to optimize the cadvisor code to essentially cache metrics and update them in place, which is not trivial for a bigger codebase: google/cadvisor#2974

There's definitely room to either build a mutable caching layer for the efficiency of bigger exporters, or to make LabelPairs more efficient in the current NewConst* API. For the latter, we could also try a global (or local) set of sync.Pool pools for both LabelPair and LabelPair slices (which are typically small), or something longer term. It feels like collaborating on an efficient mutable dto.Metric, or something similar for exporter use, would have a bigger effect.
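As a rough sketch of the sync.Pool idea for LabelPair objects (everything below is hypothetical and only illustrates the shape of such a change; it is not an API client_golang exposes today, and pooling the slices would look similar):

```go
package main

import (
	"sync"

	dto "github.com/prometheus/client_model/go"
)

// labelPairPool is a hypothetical pool for reusing dto.LabelPair objects
// across scrapes instead of allocating them fresh for every const metric.
var labelPairPool = sync.Pool{
	New: func() any { return new(dto.LabelPair) },
}

func getLabelPair(name, value string) *dto.LabelPair {
	lp := labelPairPool.Get().(*dto.LabelPair)
	lp.Name, lp.Value = &name, &value
	return lp
}

func putLabelPair(lp *dto.LabelPair) {
	// Only safe once nothing (e.g. a cached dto.Metric) still references lp.
	lp.Reset() // Reset is generated for every protobuf message
	labelPairPool.Put(lp)
}

func main() {
	lp := getLabelPair("process", "explorer.exe")
	_ = lp.GetName()
	putLabelPair(lp)
}
```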

The dto.LabelPair struct contains other structs related to protobuf. I am not sure whether they are needed; Prometheus 2.0 does not support protobuf metrics scraping, so it feels like an expensive leftover.

That's true, the LabelPair struct itself could be smaller. However, we DO support and intend to keep some form of protobuf metric scraping, so that part has to keep working (marshalling/unmarshalling). Maybe we could get rid of unknown field support for this type.
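For illustration of why the protobuf machinery cannot simply be dropped: the dto types are real protobuf messages and still have to marshal for the protobuf exposition format. A minimal example (label values are made up):

```go
package main

import (
	"fmt"

	dto "github.com/prometheus/client_model/go"
	"google.golang.org/protobuf/proto"
)

func main() {
	name, value := "instance", "localhost:9182"
	lp := &dto.LabelPair{Name: &name, Value: &value}

	// dto.LabelPair is a protobuf message, so marshalling (and unmarshalling)
	// must keep working even if the struct is slimmed down.
	b, err := proto.Marshal(lp)
	if err != nil {
		panic(err)
	}
	fmt.Printf("label pair encodes to %d bytes on the wire\n", len(b))
}
```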
