SDN-4168: Cleanup ipsec state only when ipsec is not full mode #2611

Open: wants to merge 3 commits into base: master

@pperiyasamy (Member) commented Jan 9, 2025

This PR makes the following fixes to prevent unnecessary ipsec service restarts and ip xfrm state and policy cleanups while bringing up the ipsec-host pod. This should avoid re-establishing IKE SAs during ipsec pod restarts and let OVN networking pod traffic continue without packet drops.

  1. An incorrect check in the ipsec pod cleanup logic removes the /etc/ipsec.d/openshift.conf file and the ip xfrm state and policy entries in all cases, but these must be removed only when the ipsec mode changes from full to external or disabled.
  2. The narrowing=yes option no longer needs to be set explicitly because the system default crypto policies are now commented out; otherwise a TS_UNACCEPTABLE error is seen temporarily when the ipsec service restarts.
  3. An ipsec service restart is needed only for specific IPsec config changes, so the service is now restarted only when the default crypto-policies conf file is commented out.
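
The gating described in items 1 and 3 can be sketched as two small predicates. This is only an illustrative sketch, not the PR's actual code (which is not shown in this conversation): the names `IPsecMode`, `should_cleanup_state`, and `should_restart_service` are hypothetical, and the mode names are taken from the description above.

```python
from enum import Enum


class IPsecMode(Enum):
    # Mode names follow the PR description: full, external, disabled.
    FULL = "Full"
    EXTERNAL = "External"
    DISABLED = "Disabled"


def should_cleanup_state(mode: IPsecMode) -> bool:
    """Hypothetical check for item 1.

    Remove /etc/ipsec.d/openshift.conf and the ip xfrm state and policy
    entries only when the cluster is no longer in full mode.
    """
    return mode is not IPsecMode.FULL


def should_restart_service(crypto_policies_conf_changed: bool) -> bool:
    """Hypothetical check for item 3.

    Restart the ipsec service only when this start actually commented out
    the default crypto-policies conf file, not on every pod start.
    """
    return crypto_policies_conf_changed


if __name__ == "__main__":
    for mode in IPsecMode:
        print(mode.value, "cleanup:", should_cleanup_state(mode))
```

With checks like these, an ipsec pod restart in full mode with an unchanged config would neither flush state nor restart the service, which is the behaviour the description aims for.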

There is an incorrect check while cleaning up ipsec state upon deleting ipsec pod
which removes states in all cases, so this fix removes state only when ipsec mode
is not full mode.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
@openshift-ci-robot
Contributor

openshift-ci-robot commented Jan 9, 2025

@pperiyasamy: This pull request references SDN-4168 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

There is an incorrect check while cleaning up ipsec state upon deleting ipsec pod which removes states in all cases, so this fix removes state only when ipsec mode is not full mode.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 9, 2025
@openshift-ci openshift-ci bot requested review from trozet and tssurya January 9, 2025 10:33
Contributor

openshift-ci bot commented Jan 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pperiyasamy
Once this PR has been reviewed and has the lgtm label, please assign kyrtapz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-ovn-ipsec-step-registry openshift/origin#29232

This reverts commit e0bfa7e.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
Signed-off-by: Periyasamy Palanisamy <[email protected]>
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-ovn-ipsec-step-registry openshift/origin#29232

@openshift-ci-robot
Contributor

openshift-ci-robot commented Jan 9, 2025

@pperiyasamy: This pull request references SDN-4168 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

There is an incorrect check while cleaning up ipsec state upon deleting ipsec pod which removes states in all cases, so this fix removes state only when ipsec mode is not full mode.

Disruptive events are being thrown for the ipsec pod restart test; this PR is going to fix them.


Contributor

openshift-ci bot commented Jan 9, 2025

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration | ea1d489 | link | false | /test e2e-aws-ovn-local-to-shared-gateway-mode-migration |
| ci/prow/e2e-network-mtu-migration-ovn-ipv6 | ea1d489 | link | false | /test e2e-network-mtu-migration-ovn-ipv6 |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | ea1d489 | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/security | ea1d489 | link | false | /test security |
| ci/prow/e2e-aws-ovn-serial | ea1d489 | link | false | /test e2e-aws-ovn-serial |
| ci/prow/4.18-upgrade-from-stable-4.17-e2e-azure-ovn-upgrade | ea1d489 | link | false | /test 4.18-upgrade-from-stable-4.17-e2e-azure-ovn-upgrade |
| ci/prow/e2e-aws-ovn-single-node | ea1d489 | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/e2e-aws-ovn-upgrade | ea1d489 | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade | ea1d489 | link | false | /test 4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade |
| ci/prow/e2e-aws-hypershift-ovn-kubevirt | ea1d489 | link | false | /test e2e-aws-hypershift-ovn-kubevirt |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | ea1d489 | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-metal-ipi-ovn-ipv6-ipsec | ea1d489 | link | true | /test e2e-metal-ipi-ovn-ipv6-ipsec |
| ci/prow/4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-upgrade | ea1d489 | link | false | /test 4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-upgrade |
| ci/prow/e2e-openstack-ovn | ea1d489 | link | false | /test e2e-openstack-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-ovn-ipsec-step-registry openshift/origin#29232

@pperiyasamy
Member Author

The [sig-arch][Late] operators should not create watch channels very often [apigroup:apiserver.openshift.io] [Suite:openshift/conformance/parallel] test is failing in the run https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-cluster-network-operator-2611-openshift-origin-29232-e2e-ovn-ipsec-step-registry/1877412979897536512 upon ipsec pod reboots.
It happens even when there is no ipsec state/policy cleanup and no pluto restart. This needs investigation.

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-ovn-ipsec-step-registry openshift/origin#29232

@pperiyasamy
Member Author

> The [sig-arch][Late] operators should not create watch channels very often [apigroup:apiserver.openshift.io] [Suite:openshift/conformance/parallel] test is failing in the run https://prow.ci.openshift.org/view/gs/test-platform-results/logs/multi-pr-openshift-cluster-network-operator-2611-openshift-origin-29232-e2e-ovn-ipsec-step-registry/1877412979897536512 upon ipsec pod reboots. It happens even when there is no ipsec state/policy cleanup, no pluto restart. Needs investigation...

It may be a flaky test; tracking it via bug https://issues.redhat.com/browse/OCPBUGS-46414.

@pperiyasamy
Member Author

/assign @jcaamano @trozet

@pperiyasamy
Member Author

/retest
