Delete next cell irrespective of last deletion #901
Right now, when multiple cells are deleted and any single cell deletion fails, control exits with an error message. This change stores the errors in a list and lets the remaining cells be deleted. Closes #OSPRH-10550
The Jira report says we need to do this for cell creation as well, but I think that is already taken care of here: https://github.com/openstack-k8s-operators/nova-operator/blob/main/controllers/nova_controller.go#L328-L356
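As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch of a collect-and-continue loop. It is not the operator's code: `deleteCell` and `deleteAll` are hypothetical stand-ins, and the failure of `cell2` is simulated.

```go
package main

import "fmt"

// deleteCell is a hypothetical stand-in for the per-cell cleanup.
// It fails for "cell2" to simulate one broken cell.
func deleteCell(name string) error {
	if name == "cell2" {
		return fmt.Errorf("cell %s: database not reachable", name)
	}
	return nil
}

// deleteAll tries every cell, records each failure instead of returning
// early, and reports which cells were deleted plus the collected errors.
func deleteAll(cells []string) ([]string, []error) {
	var deleted []string
	var deleteErrs []error
	for _, c := range cells {
		if err := deleteCell(c); err != nil {
			deleteErrs = append(deleteErrs, err)
			continue // keep going with the next cell
		}
		deleted = append(deleted, c)
	}
	return deleted, deleteErrs
}

func main() {
	deleted, deleteErrs := deleteAll([]string{"cell2", "cell3"})
	fmt.Println("deleted:", deleted)
	fmt.Println("failed:", len(deleteErrs))
}
```

The point of the sketch is only that the failure for `cell2` is recorded while `cell3` is still processed.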
Please add some functional tests for cell deletion to assert that deletion works and returns errors for all cells. It would also be nice to assert the conditions of the nova instance.
00be5c8 to 832bf37
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: auniyal61
b8de3bd to 25cc5c6
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/55d93ec55a534c64a74ed0e2e7f7761a ✔️ openstack-meta-content-provider SUCCESS in 47m 42s
ea9aefb to fe73ce1
The deletion logic looks good, with a small request about ordering. I have a couple of comments on the testing side.
}

if len(deleteErrs) > 0 {
	return ctrl.Result{}, fmt.Errorf("errors: %v", deleteErrs)
You can use errors.Join() to wrap the list of errors into a single error without the string conversion. https://pkg.go.dev/errors#Join
cell2Account, cell2Secret := mariadb.CreateMariaDBAccountAndSecret(
	cell2.MariaDBAccountName, mariadbv1.MariaDBAccountSpec{})
DeferCleanup(k8sClient.Delete, ctx, cell2Account)
DeferCleanup(k8sClient.Delete, ctx, cell2Secret)
this seems like an unrelated change.
This When block is for 3 cells.
The newly added When block is for 4 cells.
With this change, the BeforeEach block under Describe has 2 cells: 0 and 1.
It("deletes cell3 and verifies error for cell2 because its DB deleted already", func() {

// manually delete DB for cell2, to reproduce the error in cell2 deletion
Yeah that is one way to trigger an error during cell deletion.
delete(nova.Spec.CellTemplates, "cell2")
delete(nova.Spec.CellTemplates, "cell3")
Do we know that cell2 always comes before cell3 in novaCellList.Items at L605 in the controller? If cell3 can come before cell2, then the test below does not prove that the error in cell2 was merely collected and did not stop the loop there.
Maybe the way to make it testable is to sort the cells by cell name when iterating in the controller. We do that for cell creation anyhow in #903, to keep the unstable ordering from causing unstable condition messages.
	g.Expect(nova.Status.RegisteredCells).NotTo(HaveKey(cell3.CellCRName.Name))
}, timeout, interval).Should(Succeed())

// NovaCellNotExists(cell3.CellCRName)
Yep, you can check whether k8s returns not found for cell3 to prove that it is deleted. You can also check that the RegisteredCells list no longer contains cell3 but still contains cell2.
// cell2 deletion should have failed
Eventually(func(g Gomega) {
	mappingJob := th.GetJob(cell2.CellDeleteJobName)
	newJobInputHash := GetEnvVarValue(
		mappingJob.Spec.Template.Spec.Containers[0].Env, "INPUT_HASH", "")
	g.Expect(newJobInputHash).NotTo(BeEmpty())
}, timeout, interval).Should(Succeed())
I'm not sure how this proves that the cell2 delete failed. It checks that the cell2 delete job exists.
@@ -600,18 +600,26 @@ func (r *NovaReconciler) Reconcile(ctx context.Context, req ctrl.Request) (resul
	return ctrl.Result{}, err
}

var deleteErrs []error

for _, cr := range novaCellList.Items {
I would sort this list by cell name for two reasons:
- Sort cell-names in nova_controller #903 already does this for cell creation, to keep the error messages stable even if the order of the cell list in the API is not.
- It makes testing possible, as we can rely on the ordering to prove that cell3 deletion is still attempted even if cell2 failed to delete.
Running the test locally shows that cell2 is also deleted successfully, so the current test does not recreate the scenario needed to prove that a failure in one cell does not prevent the deletion of the others.
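One possible way to get the deterministic ordering asked for here is sketched below with the standard library's `sort.Slice`. The `cell` type and `sortCellsByName` helper are hypothetical; in the controller the slice would be `novaCellList.Items` and the key would be whatever field holds the cell name.

```go
package main

import (
	"fmt"
	"sort"
)

// cell is a hypothetical stand-in for a NovaCell list item;
// only the name matters for ordering.
type cell struct {
	Name string
}

// sortCellsByName orders the slice in place by cell name, so the
// iteration order no longer depends on what the API returned.
func sortCellsByName(cells []cell) {
	sort.Slice(cells, func(i, j int) bool {
		return cells[i].Name < cells[j].Name
	})
}

func main() {
	// Simulate an unordered list coming back from the API.
	cells := []cell{{Name: "cell3"}, {Name: "cell1"}, {Name: "cell2"}}
	sortCellsByName(cells)
	for _, c := range cells {
		fmt.Println(c.Name)
	}
}
```

With a stable order, a test can rely on cell2 being processed before cell3 and then assert that cell3 is gone while the cell2 error is reported.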
fe73ce1 to 6d31959
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/5839fe00e7f1456ba553d6adfcd1e5ee ✔️ openstack-meta-content-provider SUCCESS in 2h 43m 17s