The documented installation process no longer works #2965

Open
0x6675636b796f75676974687562 opened this issue Oct 28, 2024 · 6 comments
Labels: DevOps, documentation

Comments

@0x6675636b796f75676974687562
Member

0x6675636b796f75676974687562 commented Oct 28, 2024

0.2.1+1056, Docker tag 0.4.0-alpha.0.379-70423bd, commit 70423bd

$ helm --kube-context external install save-cloud oci://ghcr.io/saveourtool/save-cloud --version=$CHART_VERSION   --namespace save-cloud --create-namespace -f values-cce.yaml -f values-images.yaml
Pulled: ghcr.io/saveourtool/save-cloud:0.2.1_1217
Digest: sha256:34a2730379069f24818e40fe9891c8de2a88256d261c6f6209163788fc8c94a7
ghcr.io/saveourtool/save-cloud:0.2.1_1217 contains an underscore.

OCI artifact references (e.g. tags) do not support the plus sign (+). To support
storing semantic versions, Helm adopts the convention of changing plus (+) to
an underscore (_) in chart version tags when pushing to a registry and back to
a plus (+) when pulling from a registry.
Error: INSTALLATION FAILED: cannot re-use a name that is still in use
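The tag rewriting described in the Helm notice above can be reproduced by hand, which helps when matching a chart version against the tags a registry lists. This is a minimal sketch, assuming the substitution is the plain character swap the notice describes; the function names are hypothetical helpers, not part of Helm.

```shell
# Hypothetical helpers (not part of Helm) mirroring the convention quoted above.

# Chart version -> OCI tag: every "+" becomes "_".
chart_version_to_oci_tag() {
  printf '%s\n' "$1" | tr '+' '_'
}

# OCI tag -> chart version: every "_" becomes "+".
oci_tag_to_chart_version() {
  printf '%s\n' "$1" | tr '_' '+'
}

chart_version_to_oci_tag '0.2.1+1217'   # prints 0.2.1_1217
oci_tag_to_chart_version '0.2.1_1056'   # prints 0.2.1+1056
```

The notice appears to be informational only: the same text is printed during the successful install of 0.2.1+1056 later in this thread.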
@0x6675636b796f75676974687562
Member Author

0x6675636b796f75676974687562 commented Oct 29, 2024

This seems like a known issue:

Yet, when running

helm --kube-context external upgrade --install save-cloud oci://ghcr.io/saveourtool/save-cloud --version=$CHART_VERSION --namespace save-cloud --create-namespace -f values-cce.yaml -f values-images.yaml

the following errors are reported (#2964):

Pulled: ghcr.io/saveourtool/save-cloud:0.2.1_1217
Digest: sha256:34a2730379069f24818e40fe9891c8de2a88256d261c6f6209163788fc8c94a7
ghcr.io/saveourtool/save-cloud:0.2.1_1217 contains an underscore.

OCI artifact references (e.g. tags) do not support the plus sign (+). To support
storing semantic versions, Helm adopts the convention of changing plus (+) to
an underscore (_) in chart version tags when pushing to a registry and back to
a plus (+) when pulling from a registry.
W1029 15:40:04.269014   26872 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1029 15:40:04.570140   26872 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1029 15:40:04.867864   26872 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
Error: UPGRADE FAILED: cannot patch "backend-cosv" with kind Service: Service "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update

@0x6675636b796f75676974687562
Member Author

Running helm history shows that the problem has existed since February 2024, and that the last chart version successfully deployed was 0.2.1+1056:

$ helm history save-cloud
REVISION        UPDATED                         STATUS          CHART                   APP VERSION     DESCRIPTION

247             Wed Feb 21 17:27:31 2024        deployed        save-cloud-0.2.1+1056   0.3.0-SNAPSHOT  Upgrade complete

254             Wed Feb 28 14:54:53 2024        failed          save-cloud-0.2.1+1200   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
255             Thu Mar  7 13:52:16 2024        failed          save-cloud-0.2.1+1200   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
256             Mon Oct 28 20:12:02 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
257             Tue Oct 29 15:39:49 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
258             Tue Oct 29 17:24:19 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
259             Tue Oct 29 17:26:52 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
260             Tue Oct 29 17:39:42 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
261             Tue Oct 29 19:01:28 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
262             Tue Oct 29 19:16:04 2024        failed          save-cloud-0.2.1+1217   0.3.0-SNAPSHOT  Upgrade "save-cloud" failed: cannot patch "backend-cosv" with kind Service: Service
 "backend-cosv" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update
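The last good revision can also be picked out of this table mechanically. The sketch below assumes the plain-text layout shown above, where the UPDATED column always splits into five whitespace-separated tokens so that STATUS is the seventh field; `last_deployed_revision` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the plain-text table printed by `helm history`
# on stdin and prints the highest-numbered revision whose STATUS is "deployed".
# Assumes UPDATED looks like "Wed Feb 21 17:27:31 2024" (five tokens), so that
# STATUS is field 7. Wrapped continuation lines and the header are skipped
# because their first field is not a number.
last_deployed_revision() {
  awk '$1 ~ /^[0-9]+$/ && $7 == "deployed" { rev = $1 }
       END { if (rev != "") print rev }'
}

# Example (not run here):
#   helm history save-cloud | last_deployed_revision
```

For the history above this would select revision 247, the only row with status `deployed`.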

@0x6675636b796f75676974687562
Member Author

Running helm upgrade after helm uninstall produces the following log:

Release "save-cloud" does not exist. Installing it now.
Pulled: ghcr.io/saveourtool/save-cloud:0.2.1_1217
Digest: sha256:34a2730379069f24818e40fe9891c8de2a88256d261c6f6209163788fc8c94a7
ghcr.io/saveourtool/save-cloud:0.2.1_1217 contains an underscore.

OCI artifact references (e.g. tags) do not support the plus sign (+). To support
storing semantic versions, Helm adopts the convention of changing plus (+) to
an underscore (_) in chart version tags when pushing to a registry and back to
a plus (+) when pulling from a registry.
W1029 19:40:30.870912   43716 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1029 19:40:57.828905   43716 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
Error: services "backend-cosv" already exists
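An "already exists" error on a fresh install could mean either that a `backend-cosv` Service was left behind in the namespace, or that the chart renders the same Service more than once. Neither cause is confirmed in this thread. The second can be checked offline with a rough sketch like the one below, which assumes conventionally indented manifests (top-level `kind:`, and `name:` indented by two spaces under `metadata:`) and ignores namespaces; `duplicate_resources` is a hypothetical helper name.

```shell
# Hypothetical helper: reads rendered multi-document YAML on stdin and prints
# every "Kind/name" pair that occurs more than once.
# This is a line-based heuristic, not a YAML parser.
duplicate_resources() {
  awk '
    function emit() {
      if (kind != "" && name != "") print kind "/" name
      kind = ""; name = ""; in_meta = 0
    }
    /^---/      { emit(); next }          # document separator
    /^kind:/    { kind = $2; next }       # top-level kind
    /^metadata:/ { in_meta = 1; next }    # entering the metadata block
    /^[^ #]/    { in_meta = 0 }           # any other top-level key ends it
    in_meta && name == "" && /^  name:/ { name = $2 }
    END { emit() }
  ' | sort | uniq -d
}

# Example (not run here), reusing the flags from the commands in this thread:
#   helm template save-cloud oci://ghcr.io/saveourtool/save-cloud \
#     --version="$CHART_VERSION" -f values-cce.yaml -f values-images.yaml \
#     | duplicate_resources
```

An empty result would rule out duplicate definitions and point toward a leftover resource in the cluster instead.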

@0x6675636b796f75676974687562
Member Author

Rolling back to chart version 0.2.1+1056 and Docker tag 0.4.0-alpha.0.379-70423bd (the last successful combination) succeeds:

Release "save-cloud" does not exist. Installing it now.
Pulled: ghcr.io/saveourtool/save-cloud:0.2.1_1056
Digest: sha256:421bdca9aa24d71da5ba252dbb1461636937b7f2c7e633ebccae259d4ab4cad4
ghcr.io/saveourtool/save-cloud:0.2.1_1056 contains an underscore.

OCI artifact references (e.g. tags) do not support the plus sign (+). To support
storing semantic versions, Helm adopts the convention of changing plus (+) to
an underscore (_) in chart version tags when pushing to a registry and back to
a plus (+) when pulling from a registry.
W1029 20:06:23.032374   45696 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1029 20:06:52.836412   45696 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1029 20:06:53.143400   45696 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
NAME: save-cloud
LAST DEPLOYED: Tue Oct 29 20:06:17 2024
NAMESPACE: save-cloud
STATUS: deployed
REVISION: 1

This corresponds to commit 70423bd, made on Nov 8th 2023 (~70 commits ago, see #2851), when the following components didn't even exist:

@0x6675636b796f75676974687562
Member Author

0x6675636b796f75676974687562 commented Oct 29, 2024

After the rollback to 0.2.1+1056, the deployment looks more or less healthy, except for sandbox and save-cloud-loki:

$ kubectl get po
NAME                                                 READY   STATUS                  RESTARTS   AGE
backend-d44fb4fbf-bphz4                              1/1     Running                 0          11m
demo-5bff7d8458-f7qqs                                1/1     Running                 0          11m
demo-cpg-7ff487c96c-cccl8                            1/1     Running                 0          11m
frontend-795d957dfc-ft2fl                            1/1     Running                 0          11m
gateway-b56c9f7ff-qqzql                              1/1     Running                 0          11m
loki-canary-62cm9                                    1/1     Running                 0          11m
loki-canary-hmzgw                                    1/1     Running                 0          11m
loki-canary-sn9th                                    1/1     Running                 0          11m
orchestrator-6946c56889-s6r96                        1/1     Running                 0          11m
preprocessor-86d86487b4-k42jr                        1/1     Running                 0          11m
sandbox-5d94b5db7d-zgk95                             0/1     Init:CrashLoopBackOff   7          11m
save-cloud-0                                         1/1     Running                 0          11m
save-cloud-grafana-agent-operator-6b9d4f9d8d-9dtlx   1/1     Running                 0          11m
save-cloud-grafana-c8c645d67-nw6zp                   1/1     Running                 0          11m
save-cloud-loki-0                                    0/1     ContainerCreating       0          11m
save-cloud-prometheus-server-6f8577bbf6-gcnrj        1/1     Running                 0          11m
save-cloud-promtail-6hfrq                            1/1     Running                 0          11m
save-cloud-promtail-gmr7t                            1/1     Running                 0          11m
save-cloud-promtail-tmxq5                            0/1     Running                 0          11m
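In a long listing, pods that are not fully ready are easy to miss. A small filter over the `kubectl get po` table can pull them out; this sketch assumes the default column layout shown above (name first, READY as `ready/total` second), and `not_ready_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the default `kubectl get po` table on stdin and
# prints the name of every pod whose READY column is not N/N.
# The header row is skipped because its second field is not "digits/digits".
not_ready_pods() {
  awk '$2 ~ /^[0-9]+\/[0-9]+$/ {
         split($2, r, "/")
         if (r[1] != r[2]) print $1
       }'
}

# Example (not run here):
#   kubectl get po | not_ready_pods
```

Applied to the listing above, the filter would report three pods: `sandbox-5d94b5db7d-zgk95`, `save-cloud-loki-0`, and `save-cloud-promtail-tmxq5`, the last of which is `Running` but shows `0/1` in the READY column.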

For save-cloud-loki-0, the following events are logged:

  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               13m                   default-scheduler        Successfully assigned save-cloud/save-cloud-loki-0 to 172.16.0.52
  Normal   SuccessfulAttachVolume  13m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-46f9922d-f4a8-4615-8a85-7871a133cf49"
  Warning  FailedMount             9m41s                 kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[kube-api-access-v79jx tmp config runtime-config storage]: timed out waiting for the condition
  Warning  FailedMount             5m7s (x2 over 7m23s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[runtime-config storage kube-api-access-v79jx tmp config]: timed out waiting for the condition
  Warning  FailedMount             2m51s (x2 over 11m)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[config runtime-config storage kube-api-access-v79jx tmp]: timed out waiting for the condition
  Warning  FailedMount             100s (x13 over 11m)   kubelet                  MountVolume.MountDevice failed for volume "pvc-46f9922d-f4a8-4615-8a85-7871a133cf49" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name disk.csi.everest.io not found in the list of registered CSI drivers
  Warning  FailedMount             37s                   kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[tmp config runtime-config storage kube-api-access-v79jx]: timed out waiting for the condition
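The `MountVolume.MountDevice` event says that `disk.csi.everest.io` is not in the kubelet's list of registered CSI drivers on that node. Kubernetes exposes each node's registered drivers in its CSINode object, so the claim can be checked directly. This is a sketch assuming `kubectl` access to the cluster; `csi_driver_registered` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit status 0) if DRIVER is listed in the
# CSINode object for NODE, and fails otherwise.
# Usage: csi_driver_registered NODE DRIVER
csi_driver_registered() {
  kubectl get csinode "$1" -o jsonpath='{.spec.drivers[*].name}' \
    | tr ' ' '\n' \
    | grep -Fqx -- "$2"
}

# Example (not run here; node and driver names are taken from the events above):
#   csi_driver_registered 172.16.0.52 disk.csi.everest.io \
#     || echo "driver not registered on this node"
```

A failing check would be consistent with the event above; it would not by itself explain why the driver is missing from the node.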

@0x6675636b796f75676974687562
Member Author

0x6675636b796f75676974687562 commented Oct 30, 2024

According to @icemachined, deleting and re-creating a persistent volume solved the problem.

Now we have (0.2.1+1056, Docker tag 0.4.0-alpha.0.379-70423bd):

$ kubectl get po
NAME                                                 READY   STATUS                  RESTARTS   AGE
backend-d44fb4fbf-bxdmv                              1/1     Running                 0          5m7s
demo-5bff7d8458-bscnv                                1/1     Running                 0          5m7s
demo-cpg-7ff487c96c-jf54n                            1/1     Running                 0          5m7s
frontend-795d957dfc-qb75s                            1/1     Running                 0          5m7s
gateway-b56c9f7ff-5st6d                              1/1     Running                 0          5m7s
loki-canary-9k2h9                                    1/1     Running                 0          5m7s
loki-canary-ls7zg                                    1/1     Running                 0          5m7s
loki-canary-nlxc9                                    1/1     Running                 0          5m7s
orchestrator-6946c56889-wqvvv                        1/1     Running                 0          5m7s
preprocessor-86d86487b4-4q84h                        1/1     Running                 0          5m7s
sandbox-5d94b5db7d-8jvxl                             0/1     Init:CrashLoopBackOff   5          5m7s
save-cloud-0                                         1/1     Running                 0          5m6s
save-cloud-grafana-agent-operator-6b9d4f9d8d-7vv29   1/1     Running                 0          5m7s
save-cloud-grafana-c8c645d67-562nr                   1/1     Running                 0          5m7s
save-cloud-loki-0                                    1/1     Running                 0          5m6s
save-cloud-prometheus-server-6f8577bbf6-6k95k        1/1     Running                 0          5m7s
save-cloud-promtail-5cckf                            1/1     Running                 0          5m7s
save-cloud-promtail-8v9tt                            1/1     Running                 0          5m7s
save-cloud-promtail-w2mgq                            1/1     Running                 0          5m7s

0x6675636b796f75676974687562 added the bug, documentation, and DevOps labels and removed the bug label on Oct 31, 2024