Using full command line as GC hints key causes key proliferation #19308

Closed
gjdeval opened this issue Apr 11, 2024 · 16 comments · Fixed by #20366

@gjdeval

gjdeval commented Apr 11, 2024

The Java command line may have unique elements which are irrelevant to GC hints, causing multiple keys to be created when the GC hints are going to be the same. Using the entire command line can also make the keys very long and complex, which increases the work to search for a matching key when a new JVM instance is launched. This extra unnecessary work slows down JVM startup.

@pshipton
Member

@hangshao0
Contributor

There is an ongoing PR that avoids finding/storing GC hints in the SCC in certain cases: #19305.

@amicic
Contributor

amicic commented Apr 11, 2024

There is a huge list of 'elements' that are irrelevant to GC hints. The complement list is probably much shorter, mostly GC command line options (for example, the tenure age threshold, which may affect heap expansion dynamics). Either way, it's not easy to identify it precisely...

As far as reducing find/store operations on a given JVM run goes, besides the fully expanded heap scenario that we are addressing, we could perhaps avoid storing new hints if they are close to the existing value.
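The "close to the existing value" idea could look roughly like the sketch below. The function name and the 10% tolerance are assumptions made for illustration; nothing here is taken from the OpenJ9 code base.

```python
def should_store_hint(existing, new, tolerance=0.10):
    """Decide whether a new heap-size hint is worth writing to the cache.

    `existing` is the hint already stored (or None), `new` is the value the
    current run would store. The 10% tolerance is an arbitrary assumption.
    """
    if existing is None:
        return True          # nothing stored yet, so store the first hint
    if existing == 0:
        return new != 0      # avoid dividing by zero
    relative_change = abs(new - existing) / existing
    return relative_change > tolerance
```

With a check like this, a run whose hint lands within the tolerance of the stored value would skip the store operation entirely.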

@hangshao0
Contributor

The original discussions about the key for the GC hints are in #3743. As there could be many options (and combinations of them), the decision at that time was to use the whole command line.

@dmitripivkine
Contributor

dmitripivkine commented Apr 11, 2024

This is an enhancement. The idea is to extract the significant parameters from the Java command line and ignore the insignificant ones when searching for hints. For example, a customer has a unique ID for each run, which prevents finding the hint by an exact command line match. It may also make sense to limit the number of stored hints to prevent long searches. I guess these improvements might help in general, not only for GC-related items.
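A minimal sketch of that filtering idea, assuming a hand-picked list of option prefixes counts as "significant". The prefix list and the function name are hypothetical, and as noted earlier in the thread, identifying the real list precisely is not easy.

```python
# Assumed, incomplete list of option prefixes that influence heap sizing.
GC_RELEVANT_PREFIXES = ("-Xmx", "-Xms", "-Xmn", "-Xgcpolicy:")


def hint_key(command_line_args):
    """Build a lookup key from only the options assumed to matter for GC hints."""
    significant = [arg for arg in command_line_args
                   if arg.startswith(GC_RELEVANT_PREFIXES)]
    return " ".join(significant)


# Two runs that differ only in a per-run ID passed to the application.
run1 = ["-Xmx2g", "-Xgcpolicy:gencon", "-jar", "app.jar", "--run-id=1001"]
run2 = ["-Xmx2g", "-Xgcpolicy:gencon", "-jar", "app.jar", "--run-id=1002"]
```

Under these assumptions both runs map to the same key, so the second run would find the hint stored by the first.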

@hangshao0
Contributor

It is also possible to store the GC hints in the JVM exit phase rather than at JVM startup. That way, users won't experience the possible performance impact during startup.

We have existing options like -Xscminaot<size> and -Xscmaxjitdata<size> to limit data stored by AOT and JIT to the shared cache. We could have a similar option for GC hints if it is not easy to identify the set of relevant options in the command line. In this case, we need to determine the default max number of GC hints allowed.

@hangshao0
Contributor

Talked to @amicic and @dmitripivkine: we still want to store the GC hints in the startup phase so that other JVMs starting up together can benefit from them. We could add a limit on the number of GC hints that can be added to the SCC, so that the SCC will not gradually fill up with hints.

@hangshao0
Contributor

It is worth mentioning that, in this particular case, the unique ID of each run is in the java command line arguments, not the JVM command line options.

@tajila
Contributor

tajila commented Apr 17, 2024

Here is my understanding. There are two issues:

  1. GC may be querying the SCC for hints more than it needs to. This is being addressed in issues like "Don't store or find heap size hints when heap fully expanded" #19305, so I don't think we need to be concerned with that in this issue.

  2. The SCC uses the cmdline to track and associate GC hints with different applications. Using the entire cmdline is important because the behaviour of the application may change when a single parameter changes (JVM param or application param).

For 2), Peter suggested adding a new capability (-Xshareclass:appConfig=[configName] or something like that). This option would notify the SCC (for the purposes of GC hints) that all applications with the specified config can be treated as the same. So if a config is specified, one does not need to store the cmdline.
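As a rough illustration of the proposal, the key selection could be sketched as follows. The function, the parameter names and the `config:` / `cmdline:` prefixes are all hypothetical; the option itself is only a suggestion in this thread.

```python
def gc_hint_key(command_line, app_config=None):
    """Pick the key under which GC hints are stored.

    If a config name was supplied, every invocation using that config shares
    one key; otherwise fall back to the full command line.
    """
    if app_config:
        return f"config:{app_config}"
    return f"cmdline:{command_line}"
```

With a config name, runs whose command lines differ only by a per-run ID would all resolve to a single key.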

What do you think of this approach @gjdeval ?

@gjdeval
Author

gjdeval commented Apr 18, 2024

How would this new capability (-Xshareclass:appConfig=[configName] or something like that) be configured?

If this setting must be manually configured by the system operator, I wonder how useful it will be ... even with my long GC experience, I would not know how to decide which applications should be grouped together for GC hints identification.

@pshipton
Member

You don't need to group any applications together; you can use a separate config for each application. The VM can't separate applications from each other if the command line for an application is changing from run to run.

@tajila
Contributor

tajila commented Apr 23, 2024

Here is the latest on this issue.

  1. If a user specifies a non-default named SCC, then we will interpret that as the user specifying a config (described above) for an application. So we will not save cmdlines, and we will assume that every invocation is part of the same config. We will also provide an option to disable this behaviour.

  2. If a user uses the default SCC, then we will keep the current behaviour. However, we will only store N cmdlines (we will provide an option to set N; we can use 16 as a default). When the JVM starts up, if the cmdline is not found in the SCC, that invocation will not use any GC hints.
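The cap described in point 2 can be sketched as a small bounded store. The class and method names are made up for illustration, and refusing new keys once the cap is reached is an assumption, since the thread does not say what happens to further cmdlines.

```python
class BoundedHintStore:
    """Keeps GC hints for at most `max_entries` distinct command lines."""

    def __init__(self, max_entries=16):  # 16 is the default suggested above
        self.max_entries = max_entries
        self._hints = {}

    def find(self, command_line):
        # Returns None when the command line is unknown, in which case the
        # invocation simply runs without GC hints.
        return self._hints.get(command_line)

    def store(self, command_line, hint):
        # Updating an existing entry is always allowed; adding a new key is
        # refused once the cap is reached (an assumed policy).
        if command_line in self._hints or len(self._hints) < self.max_entries:
            self._hints[command_line] = hint
            return True
        return False
```

A store like this stops growing after N distinct command lines, so a workload with a unique argument per run could fill at most N slots.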

Any questions on the feasibility of this approach @hangshao0 ?

@hangshao0
Contributor

hangshao0 commented Apr 24, 2024

I agree that we should keep a maximum number of GC hints that can be stored into the SCC so that it will not gradually fill up the cache, like what is happening in this case.

One more thing I want to mention is that we are not doing a linear search to find the GC hints. It is a hashtable lookup where the command line string is the key. In this case, as the java command line argument changes in every run, no existing GC hints will be found and one more new hint will be stored on each run. This creates one more store contention if there are multiple JVMs starting up together. The perf impact can be measured by comparing against a run using -XX:-UseGCStartupHints.
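The effect described here can be modelled with a plain dictionary standing in for the hashtable. This is a toy simulation under that assumption, not the SCC implementation, and the stored value is a placeholder.

```python
def simulate_runs(command_lines):
    """Replay a sequence of JVM launches against a dict keyed by command line.

    Returns (lookup_hits, entries_stored).
    """
    cache = {}
    hits = 0
    for cmdline in command_lines:
        if cmdline in cache:
            hits += 1                       # hint found, nothing new stored
        else:
            cache[cmdline] = "placeholder"  # miss: one more entry is added
    return hits, len(cache)


# Five launches whose only difference is a per-run ID argument.
unique_runs = [f"java -jar app.jar --run-id={i}" for i in range(5)]
# Five launches with an identical command line.
same_runs = ["java -jar app.jar"] * 5
```

In this model the unique-ID launches never hit and add one entry per run, while the identical launches store a single entry and hit on every later run.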

@hangshao0
Contributor

For multiple JVMs starting up together sharing the same GC hint (under the same config), there could also be one more store contention if they go to update the hint.

@dmitripivkine
Contributor

Removing the comp:c tag, as the implementation is on the VM side.

Issue Number: 19308
Status: Closed
Actual Components: enhancement, comp:vm, userRaised
Actual Assignees: No one :(
PR Assignees: No one :(

6 participants