Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dispatcher verticles are now worker verticles #2300

Merged
merged 9 commits into from
Jun 15, 2022

Conversation

pierDipi
Copy link
Member

@pierDipi pierDipi commented Jun 13, 2022

Using worker verticles instead of event loop verticle
provides better isolation between consumers.

I added a reproducer for #2263 (thanks Marek)
that is now running in CI during KafkaChannel testing.

I've also improved logging which helped me tracking
down the issue.

As Vert.x's docs says:

Worker verticle instances are never executed
concurrently by Vert.x by more than one thread,
but can executed by different threads at different times.

That's the reason why we're moving to synchronized
and atomic data structures for dispatcher components.

Fixes #2263

@knative-prow knative-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/control-plane area/data-plane area/test approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 13, 2022
@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main

2 similar comments
@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main
/test channel-integration-tests-ssl_eventing-kafka-broker_main
/test channel-integration-tests-sasl-ssl_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

Worked, retesting:

/test channel-integration-tests-sasl-ssl_eventing-kafka-broker_main
/test channel-integration-tests-ssl_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

unrelated EOF error knative/pkg#1509:

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "sinkbindings.webhook.sources.knative.dev": failed to call webhook: Post "[https://eventing-webhook.knative-eventing.svc:443/sinkbindings?timeout=10s](https://eventing-webhook.knative-eventing.svc/sinkbindings?timeout=10s)": EOF

and timeout on KafkaChannel readiness:

kafkachannel.messaging.knative.dev/channel created
subscription.messaging.knative.dev/event-display created
service/event-display created
deployment.apps/event-display created
sinkbinding.sources.knative.dev/bind-heartbeat created
deployment.apps/hearbeat created
error: timed out waiting for the condition on kafkachannels/channel

@pierDipi
Copy link
Member Author

Retesting the EOF error:
/test channel-integration-tests-sasl-ssl_eventing-kafka-broker_main

@pierDipi
Copy link
Member Author

There are no nodes that your pod can schedule to - check your requests, tolerations, and node selectors (skip schedule deleting pod: test-pods/e7440397-ebcc-11ec-a2fb-666d91f9e04d)

@pierDipi
Copy link
Member Author

/retest-required

@codecov
Copy link

codecov bot commented Jun 14, 2022

Codecov Report

Merging #2300 (f61381e) into main (ebb10be) will decrease coverage by 0.07%.
The diff coverage is 71.79%.

@@             Coverage Diff              @@
##               main    #2300      +/-   ##
============================================
- Coverage     65.51%   65.44%   -0.08%     
  Complexity      699      699              
============================================
  Files           140      140              
  Lines          9216     9240      +24     
  Branches        209      211       +2     
============================================
+ Hits           6038     6047       +9     
- Misses         2777     2789      +12     
- Partials        401      404       +3     
Flag Coverage Δ
java-unittests 81.64% <71.79%> (-0.57%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...e/eventing/kafka/broker/core/file/FileWatcher.java 74.68% <0.00%> (-0.96%) ⬇️
...ka/broker/core/reconciler/ResourcesReconciler.java 100.00% <ø> (ø)
...patcher/impl/consumer/OrderedConsumerVerticle.java 70.52% <55.17%> (-4.48%) ⬇️
...ispatcher/impl/http/WebClientCloudEventSender.java 60.43% <57.14%> (-0.03%) ⬇️
...ve/eventing/kafka/broker/core/metrics/Metrics.java 84.61% <66.66%> (-1.60%) ⬇️
...tcher/impl/consumer/UnorderedConsumerVerticle.java 77.55% <70.00%> (-2.05%) ⬇️
...a/broker/dispatcher/impl/RecordDispatcherImpl.java 87.91% <77.77%> (-0.49%) ⬇️
...dispatcher/impl/consumer/BaseConsumerVerticle.java 87.09% <83.33%> (-6.66%) ⬇️
...ive/eventing/kafka/broker/core/AsyncCloseable.java 76.92% <100.00%> (-14.75%) ⬇️
.../core/reconciler/impl/ResourcesReconcilerImpl.java 90.66% <100.00%> (+0.38%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebb10be...f61381e. Read the comment docs.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 14, 2022
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 14, 2022
@pierDipi
Copy link
Member Author

Job timed out after 2h but it was working

{"component":"entrypoint","file":"k8s.io/test-infra/prow/entrypoint/run.go:255","func":"k8s.io/test-infra/prow/entrypoint.gracefullyTerminate","level":"error","msg":"Process did not exit before 15s grace period","severity":"error","time":"2022-06-14T13:59:23Z"}

@pierDipi
Copy link
Member Author

/test channel-integration-tests-sasl-plain_eventing-kafka-broker_main

@pierDipi pierDipi changed the title [WIP] Run custom KafkaChannel test [WIP] Dispatcher verticles are now worker verticles Jun 14, 2022
@pierDipi
Copy link
Member Author

/test integration-tests_eventing-kafka-broker_main

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 14, 2022
pierDipi added 3 commits June 14, 2022 20:18
Signed-off-by: Pierangelo Di Pilato <[email protected]>
Signed-off-by: Pierangelo Di Pilato <[email protected]>
Signed-off-by: Pierangelo Di Pilato <[email protected]>
pierDipi added 5 commits June 14, 2022 20:19
Signed-off-by: Pierangelo Di Pilato <[email protected]>
Signed-off-by: Pierangelo Di Pilato <[email protected]>
Signed-off-by: Pierangelo Di Pilato <[email protected]>
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 14, 2022
@pierDipi
Copy link
Member Author

/retest-required

@pierDipi pierDipi changed the title [WIP] Dispatcher verticles are now worker verticles Dispatcher verticles are now worker verticles Jun 15, 2022
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 15, 2022
Copy link
Member

@aliok aliok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added a minor comment.

My knowledge on Vert.x is rather limited, so, LGTM since tests are green and the reproducer is fixed

@aliok
Copy link
Member

aliok commented Jun 15, 2022

/lgtm
/approve

Thanks!

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jun 15, 2022
@knative-prow
Copy link

knative-prow bot commented Jun 15, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aliok, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pierDipi
Copy link
Member Author

/test integration-tests_eventing-kafka-broker_main

@knative-prow knative-prow bot merged commit b3d50ad into knative-extensions:main Jun 15, 2022
@pierDipi pierDipi deleted the SRVKE-1237 branch June 15, 2022 10:39
@pierDipi
Copy link
Member Author

/cherry-pick release-1.4

@pierDipi
Copy link
Member Author

/cherry-pick release-1.5

@knative-prow-robot
Copy link
Contributor

@pierDipi: #2300 failed to apply on top of branch "release-1.4":

Applying: Add reproducer as test
Applying: WIP fix
Using index info to reconstruct a base tree...
M	data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/AsyncCloseable.java
M	data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/metrics/Metrics.java
M	data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/reconciler/impl/ResourcesReconcilerMessageHandler.java
M	data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/BaseConsumerVerticle.java
M	data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
Falling back to patching base and 3-way merge...
Auto-merging data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
CONFLICT (content): Merge conflict in data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
Auto-merging data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/BaseConsumerVerticle.java
Auto-merging data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/reconciler/impl/ResourcesReconcilerMessageHandler.java
CONFLICT (content): Merge conflict in data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/reconciler/impl/ResourcesReconcilerMessageHandler.java
Auto-merging data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/metrics/Metrics.java
CONFLICT (content): Merge conflict in data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/metrics/Metrics.java
Auto-merging data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/AsyncCloseable.java
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 WIP fix
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot
Copy link
Contributor

@pierDipi: #2300 failed to apply on top of branch "release-1.5":

Applying: Add reproducer as test
Applying: WIP fix
Using index info to reconstruct a base tree...
M	data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
Falling back to patching base and 3-way merge...
Auto-merging data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
CONFLICT (content): Merge conflict in data-plane/dispatcher/src/main/java/dev/knative/eventing/kafka/broker/dispatcher/impl/consumer/OrderedConsumerVerticle.java
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 WIP fix
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pierDipi added a commit to pierDipi/eventing-kafka-broker that referenced this pull request Jul 6, 2022
* Add reproducer as test

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* WIP fix

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Use for loop in reproducer

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix reproducer script path

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Handle closed sender

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix unit tests

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Describe KafkaChannel and Subscription for debugging

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Move reproducer to test location and use a smaller number

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Revert meterBinderExecutor thread pool changes

Signed-off-by: Pierangelo Di Pilato <[email protected]>
pierDipi added a commit to pierDipi/eventing-kafka-broker that referenced this pull request Jul 6, 2022
* Add reproducer as test

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* WIP fix

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Use for loop in reproducer

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix reproducer script path

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Handle closed sender

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix unit tests

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Describe KafkaChannel and Subscription for debugging

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Move reproducer to test location and use a smaller number

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Revert meterBinderExecutor thread pool changes

Signed-off-by: Pierangelo Di Pilato <[email protected]>
knative-prow bot pushed a commit that referenced this pull request Jul 6, 2022
* Add reproducer as test

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* WIP fix

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Use for loop in reproducer

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix reproducer script path

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Handle closed sender

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Fix unit tests

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Describe KafkaChannel and Subscription for debugging

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Move reproducer to test location and use a smaller number

Signed-off-by: Pierangelo Di Pilato <[email protected]>

* Revert meterBinderExecutor thread pool changes

Signed-off-by: Pierangelo Di Pilato <[email protected]>
@pierDipi pierDipi restored the SRVKE-1237 branch August 5, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane area/data-plane area/test lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
3 participants