Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Barbican integration #378

Merged
merged 1 commit into from
Dec 8, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions templates/glance/config/00-config.conf
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,13 @@ use_keystone_limits = {{ .QuotaEnabled }}
[oslo_limit]
password = {{ .ServicePassword }}

[key_manager]
backend = barbican

[barbican]
auth_endpoint={{ .KeystoneInternalURL }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reasons I don't understand (but plan to investigate), it seems at some point glance deployments need to generate the service config files when most (all?) of the jinja2 variables have not been defined. Notice all of the "{{ if (index . "SomeVariable") }}" conditionals in this file, including the one at L60.

L40 is not conditional, which leads to this [1] failure.

[1] https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openstack-k8s-operators_glance-operator/378/pull-ci-openstack-k8s-operators-glance-operator-main-glance-operator-build-deploy-kuttl/1732439320112926720#1:build-log.txt%3A3592

The short term fix is to add the conditional. The long term question to answer is why all these conditionals are required (cinder doesn't have them).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found out why so many variables in this file are wrapped with a conditional: it's because the top glance controller defines only a small subset (the rest of the variables are defined by the glanceapi controller). Unraveling things to adopt the pattern implemented by the cinder-operator would require non-trivial changes, but I submitted an interim PR #385 that should help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense and thank you for the quick patch (PR#385).
We really need a minimal 00-config for dbsync and cronjobs, so it makes sense to me driving the creation via a single variable instead of wrapping up every single line of that file.
I'm going to rebase this patch soon and test locally.

barbican_endpoint_type=internal
Copy link
Contributor Author

@fmount fmount Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

endpoint should be internal, I think we all agree with this (and it was the TripleO default).
While we confirmed that upstream (Caracal) with devstack glance/barbican interaction works as expected, in this context I'm still seeing issues because of the following:

2023-12-04 17:10:46.934 705 ERROR castellan.key_manager.barbican_key_manager [None req-3b3fcfa5-7eaf-47d1-ae19-d5ded8a20324 01097071334a44a3a8109673cacafeef c03f3ec812ae4dafa89dbfedae96f304 - - default default] Error creating Barbican client: Unable to establish connection to http://barbican-internal.openstack.svc:9311/: HTTPConnectionPool(host='barbican-internal.openstack.svc', port=9311): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7e1f2596d0>: Failed to establish a new connection: [Errno 111] ECONNREFUSED')): keystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to http://barbican-internal.openstack.svc:9311/: HTTPConnectionPool(host='barbican-internal.openstack.svc', port=9311): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7e1f2596d0>: Failed to establish a new connection: [Errno 111] ECONNREFUSED'))
2023-12-04 17:10:46.938 705 ERROR cursive.signature_utils [None req-3b3fcfa5-7eaf-47d1-ae19-d5ded8a20324 01097071334a44a3a8109673cacafeef c03f3ec812ae4dafa89dbfedae96f304 - - default default] Unable to retrieve certificate with ID 38a45ab6-cf20-4ea1-9b18-309257c393bb: Key manager error: Unable to establish connection to http://barbican-internal.openstack.svc:9311/: HTTPConnectionPool(host='barbican-internal.openstack.svc', port=9311): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7e1f2596d0>: Failed to establish a new connection: [Errno 111] ECONNREFUSED')): castellan.common.exception.KeyManagerError: Key manager error: Unable to establish connection to http://barbican-internal.openstack.svc:9311/: HTTPConnectionPool(host='barbican-internal.openstack.svc', port=9311): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7e1f2596d0>: Failed to establish a new connection: [Errno 111] ECONNREFUSED'))
2023-12-04 17:10:46.940 705 ERROR glance.api.v2.image_data [None req-3b3fcfa5-7eaf-47d1-ae19-d5ded8a20324 01097071334a44a3a8109673cacafeef c03f3ec812ae4dafa89dbfedae96f304 - - default default] Signature verification failed for image 9b883aa6-268c-4c6b-9538-01350b4ea2a6: Signature verification for the image failed: Unable to retrieve certificate with ID: 38a45ab6-cf20-4ea1-9b18-309257c393bb.: cursive.exception.SignatureVerificationError: Signature verification for the image failed: Unable to retrieve certificate with ID: 38a45ab6-cf20-4ea1-9b18-309257c393bb.

I have a deployment without network isolation and I assume there won't be any issue on the cluster network because barbican should be resolved by the svc.
I see that curl fails almost half of the time:

sh-5.1# curl barbican-internal.openstack.svc:9311
curl: (7) Failed to connect to barbican-internal.openstack.svc port 9311: Connection refused
sh-5.1# curl barbican-internal.openstack.svc:9311
{"versions": {"values": [{"id": "v1", "status": "stable", "links": [{"rel": "self", "href": "https://barbican.openstack.svc:9311/v1/"}, {"rel": "describedby", "type": "text/html", "href": "https://docs.openstack.org/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.key-manager-v1+json"}]}]}}sh-5.1#

and I think this is the root cause of the issue. @vakwetu any thoughts on this? Did you see something similar in the interaction w/ any other component?
/cc @ASBishop @konan-abhi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this is just environmental, I'll try a fresh deployment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this is just environmental, I'll try a fresh deployment.

It's a known issue, see comments in openstack-k8s-operators/nova-operator#609. We need an update in the barbican-operator to expose the internal endpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the fix is openstack-k8s-operators/openstack-operator#588 (mentioned in nova PR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to enable it by default for all deployments or this should be optional default disabled?

Copy link
Contributor Author

@fmount fmount Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to enable it by default because barbican has been set has enabled: true by default in the OpenstackControlPlane template section. We can discuss more if we want a diff behavior, but in my understanding (and to keep the operators aligned) it should be enabled by default (see the cinder PR for example).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, it will not be have any impact if enabled by default.


[database]
connection = {{ .DatabaseConnection }}
max_retries = -1
Expand Down