Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an arena for allocating strings. #2532

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

sesse
Copy link

@sesse sesse commented Nov 25, 2024

This is the patch we took out during the review of PR #2519, now rebased and re-measured. It's much less memory-bad, but still wins ~12%.

Benchmark 1: ./ninja.old -C ~/chromium/src/out/Default content_shell
  Time (mean ± σ):      4.608 s ±  0.014 s    [User: 3.073 s, System: 1.531 s]
  Range (min … max):    4.590 s …  4.634 s    10 runs
 
Benchmark 2: ./ninja -C ~/chromium/src/out/Default content_shell
  Time (mean ± σ):      4.084 s ±  0.009 s    [User: 2.558 s, System: 1.523 s]
  Range (min … max):    4.068 s …  4.097 s    10 runs
 
Summary
  ./ninja -C ~/chromium/src/out/Default content_shell ran
    1.13 ± 0.00 times faster than ./ninja.old -C ~/chromium/src/out/Default content_shell

What do you think? Should we still pursue this?

This reduces the amount of malloc traffic significantly,
speeding up parsing. For a no-op build of Chromium (Linux, Zen 2),
this reduces time spent from 4.61 to 4.08 seconds. However, note
that it also increases RSS from 914 to 937 MB; I haven't looked
deeply into why, but it's reasonable to assume that this is related
to the fact that we no longer merge small strings together (since
they are now immutable).

We still use some time in actually copying the string into the arena,
but it seems this is cheaper than just persisting the file contents
wholesale and pointing into that.
@sesse
Copy link
Author

sesse commented Dec 17, 2024

I moved so that the EvalStrings used in Rule are now owned by a smaller arena on the BindingEnv instead of being allocated on the heap and leaked.

@elliotgoodrich
Copy link

I spent a bit of time removing allocations in trimja and the main changes related to EvalString were to:

  1. Change EvalString so that its puts all segments in a single std::string so its full capacity is retained when cleared (trimja/evalstring.h)
  2. Structure the ManifestParser to reuse the same EvalString throughout parsing (m_storage)

Then the only copies you need to make of EvalString is in Rule::AddBinding, which would be faster as the copy constructor of trimja's EvalString only needs 1 allocation and copy of contiguous data. Otherwise, during parsing the capacity is reused in the single EvalString and eventually we'll stop any allocations.

Maybe changing the entire ManifestParser is too much work, but it looks possible to add an EvalString member variable and reuse this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants