[WIP] Test udn node scale + informer multiplexing #2378
base: master
/hold Just for testing
c9fe0f9 to 00d6f2f
/test
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test qe-perfscale-aws-ovn-small-udn-density-l3
d60dde2 to c3d3b62
c3d3b62 to 2ac90fa
Routes via mp0 were being deleted on every ovnkube-node restart:

[root@ovn-worker ~]# ip monitor route
Deleted 192.72.3.0/24 dev ovn-k8s-mp0 proto kernel scope link src 192.72.3.2
Deleted broadcast 192.72.3.255 dev ovn-k8s-mp0 table local proto kernel scope link src 192.72.3.2
Deleted local 192.72.3.2 dev ovn-k8s-mp0 table local proto kernel scope host src 192.72.3.2
local 192.72.3.2 dev ovn-k8s-mp0 table local proto kernel scope host src 192.72.3.2
broadcast 192.72.3.255 dev ovn-k8s-mp0 table local proto kernel scope link src 192.72.3.2

This causes traffic outage during upgrade, as well as other unwanted side effects when pod-destined traffic is routed via default gateway route in the host. This is especially disruptive in local gateway mode.

This patch removes the teardown, and then makes the synchronization of addresses and routes more robust, so that we can safely handle changes to MTU or mp0 addresses.

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Next need to:
- add mgmt port mac address
- update l2 secondary
- update node tracker

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
- Adds locking to protect node/UDNNode syncmap updates
- Adds UDNNode client support to node controller for updating mac
- make codegen

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
We still need workqueue and retry for nodetracker, as well as to abstract it to a singleton. Re-adding all nodes at startup is wasted work every time we create a new UDN.

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Make UDN Node informer have 15 threads
Make Network Manager and NAD controller have higher thread count

Signed-off-by: Tim Rozet <[email protected]>
From the documentation of the PollUntilContextTimeout (used by startupWaiter):

// ConditionWithContextFunc returns true if the condition is satisfied, or an error
// if the loop should be aborted.

In the readyFunc callback we were incorrectly returning an error if the flows couldn't be found in the current try, causing the waiter to bail early, essentially never retrying. Change this to log the error inside the callback instead of returning it.

Signed-off-by: Dumitru Ceara <[email protected]>
The netConfig map is accessed from multiple threads (e.g., for UDNs). Signed-off-by: Dumitru Ceara <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Increases async performance of the informer cache: events can always be queued, and the informer no longer blocks while an ADD/UPDATE/DELETE operation is being performed.

Signed-off-by: Tim Rozet <[email protected]>
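A minimal sketch of the general idea, assuming an unbounded in-memory queue between the event producer and a worker: enqueueing only appends under a short lock, so the producer never waits for a handler to finish. The `event` and `eventQueue` types are hypothetical and are not the project's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a hypothetical informer notification.
type event struct {
	op  string // "ADD", "UPDATE" or "DELETE"
	key string
}

// eventQueue is an unbounded FIFO: enqueue never waits for the consumer.
type eventQueue struct {
	mu     sync.Mutex
	cond   *sync.Cond
	items  []event
	closed bool
}

func newEventQueue() *eventQueue {
	q := &eventQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// enqueue appends the event and wakes one waiting consumer.
func (q *eventQueue) enqueue(e event) {
	q.mu.Lock()
	q.items = append(q.items, e)
	q.mu.Unlock()
	q.cond.Signal()
}

// shutdown lets consumers exit once the queue is empty.
func (q *eventQueue) shutdown() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
	q.cond.Broadcast()
}

// dequeue blocks until an event is available or the queue is shut down
// and empty, in which case it returns false.
func (q *eventQueue) dequeue() (event, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 && !q.closed {
		q.cond.Wait()
	}
	if len(q.items) == 0 {
		return event{}, false
	}
	e := q.items[0]
	q.items = q.items[1:]
	return e, true
}

// drain runs handle for every queued event, in order, until shutdown.
func drain(q *eventQueue, handle func(event)) {
	for {
		e, ok := q.dequeue()
		if !ok {
			return
		}
		handle(e)
	}
}

func main() {
	q := newEventQueue()
	done := make(chan struct{})
	var handled []string
	go func() {
		defer close(done)
		drain(q, func(e event) { handled = append(handled, e.op+" "+e.key) })
	}()
	for _, op := range []string{"ADD", "UPDATE", "DELETE"} {
		q.enqueue(event{op: op, key: "node-a"}) // returns immediately
	}
	q.shutdown()
	<-done
	fmt.Println(handled)
}
```

The trade-off of an unbounded queue is memory growth if handlers fall far behind; a real implementation would likely bound or monitor the queue depth.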
2ac90fa to 914c896
914c896 to 92ffec2
Add a pool of event handlers instead of a single (federated) event handler per informer. Ensure a controller always registers with the same event handler.

TODO: this is used by the secondary L3 network controller and by the cluster manager for now.

Signed-off-by: Dumitru Ceara <[email protected]>
Always use pool entry with index 0 for the default network controller. Bump the queue sizes to 1K (except for the initial sync queue, keep that small enough to avoid contention on handler initial processing). Signed-off-by: Dumitru Ceara <[email protected]>
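The two commits above describe a stable controller-to-handler mapping with the default network controller pinned to entry 0. A minimal sketch of one way to get that property, assuming the mapping is derived from a hash of the controller name: the hashing scheme, the `handlerPool` type and the `defaultController` name are all hypothetical, since the commit messages do not say how the pool index is chosen.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// defaultController is a hypothetical name for the default network controller.
const defaultController = "default-network-controller"

// handlerPool models a fixed-size pool of event handlers. size must be > 0.
type handlerPool struct {
	size int
}

// indexFor maps a controller name to a pool entry. The same name always
// yields the same index, and the default controller always gets index 0.
func (p handlerPool) indexFor(controller string) int {
	if controller == defaultController || p.size == 1 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(controller)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(p.size))
}

func main() {
	pool := handlerPool{size: 8}
	for _, name := range []string{defaultController, "udn-blue", "udn-green", "udn-blue"} {
		fmt.Printf("%s -> handler %d\n", name, pool.indexFor(name))
	}
}
```

Because the index depends only on the name and the pool size, a controller that re-registers lands on the same handler as before.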
Signed-off-by: Patryk Diak <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Compare annotations directly if possible. For network-specific map entries, compare only the raw JSON entries without parsing the map in full. Co-authored-by: Tim Rozet <[email protected]> Signed-off-by: Patryk Diak <[email protected]>
Instead of always parsing all node/join subnets, parse the raw JSON map and only compute the results for the affected network. Signed-off-by: Patryk Diak <[email protected]>
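The approach in the two commits above can be sketched with `encoding/json` and `json.RawMessage`: compare the annotation strings first, and only if they differ decode the top-level map while leaving each network's value as raw bytes. The function names and the annotation values below are made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// rawEntry decodes only the top level of the annotation and returns the
// undecoded JSON value for one network (nil if the network is absent).
func rawEntry(annotation, network string) (json.RawMessage, error) {
	if annotation == "" {
		return nil, nil
	}
	var entries map[string]json.RawMessage
	if err := json.Unmarshal([]byte(annotation), &entries); err != nil {
		return nil, err
	}
	return entries[network], nil
}

// networkEntryChanged reports whether the entry for one network differs
// between the old and new annotation values.
func networkEntryChanged(oldAnno, newAnno, network string) (bool, error) {
	if oldAnno == newAnno {
		return false, nil // fast path: identical strings, nothing to parse
	}
	oldRaw, err := rawEntry(oldAnno, network)
	if err != nil {
		return false, err
	}
	newRaw, err := rawEntry(newAnno, network)
	if err != nil {
		return false, err
	}
	return !bytes.Equal(oldRaw, newRaw), nil
}

func main() {
	oldAnno := `{"default":["10.244.0.0/24"],"udn1":["10.128.0.0/24"]}`
	newAnno := `{"default":["10.244.0.0/24"],"udn1":["10.128.0.0/24"],"udn2":["10.129.0.0/24"]}`
	for _, network := range []string{"default", "udn1", "udn2"} {
		changed, err := networkEntryChanged(oldAnno, newAnno, network)
		fmt.Println(network, changed, err)
	}
}
```

A byte-wise comparison treats formatting differences inside an entry as a change, so it can report a change that a full parse would not. It errs on the side of doing extra work and never misses a real change.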
Signed-off-by: Patryk Diak <[email protected]>
92ffec2 to f5e4576
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: dceara
The full list of commands accepted by this bot can be found here.
8266a3d to 4d764f9
Signed-off-by: Dumitru Ceara <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Today ovnkube-controller relies on ovnkube-node to annotate the MAC address so that it can properly program the OVN mgmt port. In practice with UDN, this means cluster manager creates the node and allocates some subnet info to the node, both the node and the ovnkube-controller side get updates, but ovnkube-controller fails to program the port the first time because it is waiting for the node side to configure the MAC address.

This patch changes the behavior of ovn-kubernetes to calculate the MAC address for the management port from the mgmt port IP address of the first subnet for the network. For backwards compatibility, it will first attempt to read the node annotation, and if no mgmt MAC exists, it will derive it from the mgmt IP of the subnet.

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
4d764f9 to ad1682f
@dceara: The following tests failed, say
PR needs rebase.
Description
Fixes #
Additional Information for reviewers
Checks
How to verify it