Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt benchmarks configuration to be in line with updated charts. #213

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

rodrigo-lourenco-lopes
Copy link
Contributor

@rodrigo-lourenco-lopes rodrigo-lourenco-lopes commented Nov 18, 2024

This is already a very messy PR which migrates these benchmarks to work with the latest changes in the helm charts.
I will create some follow-up issues to:

  • Test and change the core resources. Now that we have the entire application in one container these need to be tested and changed in order to keeep the same perfomance.
  • Add zeebe configuratitions to our helm values here, this can be done through core.configuration but this will overide the defaults in the config values in core.configmap in the camunda-platform-helm repo, see https://github.com/camunda/camunda-platform-helm/blob/75a1a164e9c7eb4cf021d1de4eee9b5ae39b6438/charts/camunda-platform-alpha/templates/core/configmap.yaml#L17-L21
  • (this was already an existing problem before the changes) The exporters seem to be laggin behind quite a lot and exporting at a slower pace than what we expecting them to do, for example capping at around 200 to 300 while processing is at 2000 in normal benchamrks. This has been observed in many weekly benchmarks, and needs to be investigated as it can possible be a regression.Due to some errors of benchmarks out of disk space in ES in some of the latest benchmarks we decided to double it for now.

Additionally this PR also addresses #214 fixing the tests related to assuring correct configurations with the golden files.

Closes: #214

Follow up issues:

@ChrisKujawa
Copy link
Member

@rodrigo-lourenco-lopes it is about disk space not memory :D

@rodrigo-lourenco-lopes rodrigo-lourenco-lopes changed the title Increase ES memory for the benchmarks. Increase ES disk size for normal benchmarks. Nov 18, 2024
@rodrigo-lourenco-lopes
Copy link
Contributor Author

@rodrigo-lourenco-lopes it is about disk space not memory :D

Ups, I misread the intention. Just corrected it.

Downloading camunda-platform from repo oci://ghcr.io/camunda/helm
Save error occurred: could not download oci://ghcr.io/camunda/helm/camunda-platform: ghcr.io/camunda/helm/camunda-platform:0.0.0-12.0.0-alpha1: not found
Error: could not download oci://ghcr.io/camunda/helm/camunda-platform: ghcr.io/camunda/helm/camunda-platform:0.0.0-12.0.0-alpha1: not found

@Zelldon also it seems that we are having issues pulling our chart dependency, have you seen this issue before?

@ChrisKujawa
Copy link
Member

@rodrigo-lourenco-lopes it is about disk space not memory :D

Ups, I misread the intention. Just corrected it.

Downloading camunda-platform from repo oci://ghcr.io/camunda/helm
Save error occurred: could not download oci://ghcr.io/camunda/helm/camunda-platform: ghcr.io/camunda/helm/camunda-platform:0.0.0-12.0.0-alpha1: not found
Error: could not download oci://ghcr.io/camunda/helm/camunda-platform: ghcr.io/camunda/helm/camunda-platform:0.0.0-12.0.0-alpha1: not found

@Zelldon also it seems that we are having issues pulling our chart dependency, have you seen this issue before?

I haven't seen this before 🤔

@rodrigo-lourenco-lopes
Copy link
Contributor Author

This commit also changed the templates and the configuration files, we will have to adapt the templates to this camunda/camunda-platform-helm@28d7927#diff-139e41c5d8eff029161d9db039ef8f0d46df8d6de4e91e42d7abb3c2103ba61a

@rodrigo-lourenco-lopes rodrigo-lourenco-lopes changed the title Increase ES disk size for normal benchmarks. Increase ES disk size, Fix tests and fetch correct Camunda version Nov 21, 2024
@ChrisKujawa
Copy link
Member

@rodrigo-lourenco-lopes can you try to deploy this into a namespace and check whether it comes up?

Copy link
Member

@ChrisKujawa ChrisKujawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rodrigo-lourenco-lopes I think two or three things we still need to validate

charts/zeebe-benchmark/Chart.yaml Show resolved Hide resolved
charts/zeebe-benchmark/values.yaml Outdated Show resolved Hide resolved
@rodrigo-lourenco-lopes
Copy link
Contributor Author

@ChrisKujawa It is starting core pods but these are failing to initialize, do we need to change the values.yaml for this?
The other pods seem to initialize correctly.

@ChrisKujawa
Copy link
Member

@ChrisKujawa It is starting core pods but these are failing to initialize, do we need to change the values.yaml for this? The other pods seem to initialize correctly.

Looks to me that a complete new Core section was introduced https://github.com/camunda/camunda-platform-helm/blob/main/charts/camunda-platform-alpha/values.yaml#L1968-L1980

@ChrisKujawa
Copy link
Member

@ChrisKujawa It is starting core pods but these are failing to initialize, do we need to change the values.yaml for this?
The other pods seem to initialize correctly.

I will validate with stakeholders

@@ -10,7 +10,7 @@ sources:
dependencies:
- name: camunda-platform
repository: "oci://ghcr.io/camunda/helm"
version: "0.0.0-8.7.0-alpha1"
version: "0.0.0-snapshot-alpha"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This is now a snapshot version. When alpha2 is released we can pin this to 0.0.0-8.7.0-alpha2. For now we need this to be able to use the fixes we made in the camunda platform helm charts.

@remcowesterhoud
Copy link

remcowesterhoud commented Dec 3, 2024

@rodrigo-lourenco-lopes sorry for hijacking your PR 😄 I can't assign you as a reviewer, but please have a look at the last 3 commits I made. These should fix benchmarks 🤞 Mine is currently running here

@ChrisKujawa If you have the time I'd appreciate it if you could also look at my last 3 commits 🙂

charts/zeebe-benchmark/values-realistic-benchmark.yaml Outdated Show resolved Hide resolved
charts/zeebe-benchmark/values.yaml Outdated Show resolved Hide resolved
Comment on lines 341 to 342
pvc:
accessModes: [ "ReadWriteOnce" ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤡 Uh real break of the objects names

charts/zeebe-benchmark/values.yaml Show resolved Hide resolved
charts/zeebe-benchmark/values.yaml Show resolved Hide resolved
Comment on lines -466 to -474
env:
- name: OPERATE_LOG_APPENDER
value: Stackdriver
- name: OPERATE_LOG_STACKDRIVER_SERVICENAME
value: operate
- name: OPERATE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh that will be painful. This means we will not filter anymore by service name in stackdriver.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying I should add these env variables to the core application?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general yes we need some to set the right service name. But this allows to set only one service name before we split between zeebe and operate and gateway

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand it. I've addded these env variables to the core app now, but the way it sounds that's not sufficient.


exec /usr/local/camunda/bin/camunda
application.yaml: |

spring:
profiles:
active: auth
active: "operate,tasklist,broker,auth"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️ Ah ok so it seems Operate is enabled.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be 🤔 I modified the SPRING_PROFILES_ACTIVE to disable it.

charts/zeebe-benchmark/templates/NOTES.txt Outdated Show resolved Hide resolved
@remcowesterhoud
Copy link

remcowesterhoud commented Dec 4, 2024

Okay I've processed some of the review comments. I can run the regular benchmarks fine manually. The realistic benchmarks don't really work. sa soon as I start Operate the readiness probe returns a 503.

I can't spend more time on this atm since I'm not medic anymore. @rodrigo-lourenco-lopes feel free to take over again 😄

FWIW, I believe we should merge this PR. It fixes the regular benchmarks so that's more than we have now and a fresh PR for any more changes would be a lot more comprehensible.

@rodrigo-lourenco-lopes
Copy link
Contributor Author

FWIW, I believe we should merge this PR. It fixes the regular benchmarks so that's more than we have now and a fresh PR for any more changes would be a lot more comprehensible.

@ChrisKujawa is it ok that we merge this now and start working on the next fixes?

Copy link
Member

@ChrisKujawa ChrisKujawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both 🚀🚀

@rodrigo-lourenco-lopes
Copy link
Contributor Author

Further context to the latest chages:

This is already a very messy PR which migrates these benchmarks to work with the latest changes in the helm charts.
I will create some follow up issues to:

  • Test and change the core resources. Now that we have the entire application in one container these need to be tested and changed in order to keeep the same perfomance.
  • Add or zeebe configuratitions to the values here, this can be done through core.configuration but will overide the defaults in the config values in core.configmap in the camunda-platform-helm repo, see https://github.com/camunda/camunda-platform-helm/blob/75a1a164e9c7eb4cf021d1de4eee9b5ae39b6438/charts/camunda-platform-alpha/templates/core/configmap.yaml#L17-L21
  • (this was already an existing problem before the changes) The exporters seem to be laggin behind quite a lot and exporting at a slower pace than what we expecting them to do, for example capping at around 200 to 300 while processing is at 2000 in normal benchamrks. This has been observed in many weekly benchmarks, and needs to be investigated as it can possible be a regression.

@rodrigo-lourenco-lopes rodrigo-lourenco-lopes changed the title Increase ES disk size, Fix tests and fetch correct Camunda version Adapt benchmarks configuration to be in line with updated charts. Dec 18, 2024
Copy link
Member

@ChrisKujawa ChrisKujawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you tested it because i feel there are some issues with it still? Not sure whether it makes sense to merge it like that tbh.

charts/zeebe-benchmark/templates/starter.yaml Outdated Show resolved Hide resolved
charts/zeebe-benchmark/templates/workers.yaml Outdated Show resolved Hide resolved
charts/zeebe-benchmark/test/golden/publisher.golden.yaml Outdated Show resolved Hide resolved
fieldPath: metadata.namespace
- name: OPERATE_LOG_APPENDER
value: Stackdriver
- name: OPERATE_LOG_STACKDRIVER_SERVICENAME
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean OPERATE_LOG_STACKDRIVER_SERVICENAME being "operate"? I just moved this var from "operate" to "core". Is the service name now different because we are in the core pod? and if so, do we also need to change the zeebe one?

- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
        value: zeebe

charts/zeebe-benchmark/values.yaml Outdated Show resolved Hide resolved
charts/zeebe-benchmark/values.yaml Outdated Show resolved Hide resolved
charts/zeebe-benchmark/values.yaml Show resolved Hide resolved
rodrigo-lourenco-lopes and others added 24 commits January 3, 2025 11:03
I have autogenerated these because there were so many changes
Add the resources previously used by the gateway to the core app.
We can't just override 1 env variable to activate operate. Instead all the env variables got overridden and thus we were missing most of them. By adding them to this values file we overcome that problem.
These images are not available in the default registry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix golden files tests after Camunda-platform-helm Changes
3 participants