etcd context deadline exceeded - sensu backend not connecting to etcd #9
Comments
Hi, anyone taking a look at the above issue? Also having the same problem |
same issue |
Okay, here's the underlying issue as I see it in my minikube environment running on my Fedora Linux system. It looks like minikube is letting the sensu-backend bind its TCP API to IPv6 localhost port 8080 instead of IPv4 port 8080, and there doesn't seem to be an obvious way to prevent minikube from allowing this.
Those last 3 services are listening on IPv6, and that's definitely not good. The k8s configurations provided in this repo assume IPv4 will be used in the pods. The sensu-backend readinessProbe uses the busybox-provided wget in an Alpine container, which is not IPv6 compatible. We need to either figure out a way to configure minikube so it doesn't let that happen, or figure out a way to tell the sensu-backend to explicitly bind on IPv4 localhost. |
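A quick way to tell which loopback address family a port is actually answering on is to try a TCP connection to each. The following is only a minimal sketch, assuming Python is available wherever the check is run; the function name `loopback_reachability` is made up for illustration, and port 8080 is the API port mentioned above.

```python
import socket


def loopback_reachability(port, timeout=1.0):
    """Report whether a TCP port accepts connections on IPv4 and IPv6 loopback."""
    results = {}
    targets = (
        ("ipv4", socket.AF_INET, "127.0.0.1"),
        ("ipv6", socket.AF_INET6, "::1"),
    )
    for label, family, host in targets:
        try:
            with socket.socket(family, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                # connect_ex returns 0 on success and an errno value otherwise.
                results[label] = sock.connect_ex((host, port)) == 0
        except OSError:
            # For example, the address family is unavailable in this environment.
            results[label] = False
    return results


if __name__ == "__main__":
    # 8080 is the sensu-backend API port discussed in this thread.
    print(loopback_reachability(8080))
```

If only the `ipv6` entry comes back `True`, that would match the situation described in this comment, where an IPv4-only client cannot reach the API.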
Turns out this is a problem with the sensu-backend readinessProbe settings. The settings were too aggressive for default minikube resource provisioning: probes were being started faster than they were timing out, causing a problem. Please test PR #10 and comment there on the potential fix. |
Did anyone get this working? |
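The condition described here can be written down as a one-line check on two standard Kubernetes probe fields, `periodSeconds` and `timeoutSeconds`. This is a deliberately simplified model of the comment's description, not of how the kubelet schedules probes internally; the function name and the sample values are hypothetical and are not taken from this repo or from PR #10.

```python
def probes_pile_up(period_seconds, timeout_seconds):
    """Simplified model: True if a new probe is due before the previous one can time out."""
    if period_seconds <= 0 or timeout_seconds <= 0:
        raise ValueError("both settings must be positive")
    return timeout_seconds > period_seconds


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(probes_pile_up(period_seconds=5, timeout_seconds=10))   # True
    print(probes_pile_up(period_seconds=10, timeout_seconds=5))   # False
```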
@mvthul |
I applied your changes as far as I could see them. Everything is green and running, but the context deadline error is still appearing in the logs. When I log in to Sensu, a red bar pops up, and if I click details I see context deadline under ETCD. I've tried so many things to fix it, and tried so many other Helm charts and scripts. Nothing seems to work with version 6+. |
The specific changes needed to solve the problem may require system-specific changes to the configuration... let me explain. There are timeouts configured for the readiness probes, and if the system running minikube is resource poor, then those configurations will be too aggressive and the readiness probes will fall over because the underlying service didn't get enough CPU cycles to complete the startup process. The PR I put together changes these settings enough so that it works on my laptop running minikube. But the nature of the problem is such that even though it works for me, it might fail for someone else with tighter system resources. There might not be a one-size-fits-all solution here, because we definitely still want the readiness probes to give up at a reasonable point. For something like Google's or Amazon's service, that reasonable point of failure is much sooner than for any local minikube deployment, because of available resources. If, as a minikube user, you're still having this specific problem, you may need to further adjust the readinessProbe settings to give your minikube deployment more time to provision everything. |
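For anyone experimenting with looser settings, one way to think about it is scaling the probe's timing fields by a common factor. The sketch below assumes the probe is represented as a plain dict using the standard Kubernetes field names (`initialDelaySeconds`, `periodSeconds`, `timeoutSeconds`, `failureThreshold`); the helper `relax_probe` and the starting values are hypothetical and do not reflect the values in this repo or in PR #10.

```python
import math


def relax_probe(probe, factor):
    """Return a copy of a readinessProbe dict with its timing fields scaled up."""
    if factor < 1:
        raise ValueError("factor must be >= 1 to relax the probe")
    relaxed = dict(probe)
    for field in ("initialDelaySeconds", "periodSeconds", "timeoutSeconds"):
        if field in relaxed:
            # Probe timings are whole seconds, so round up.
            relaxed[field] = math.ceil(relaxed[field] * factor)
    return relaxed


if __name__ == "__main__":
    # Hypothetical starting values, for illustration only.
    example = {
        "initialDelaySeconds": 5,
        "periodSeconds": 5,
        "timeoutSeconds": 2,
        "failureThreshold": 3,
    }
    print(relax_probe(example, 2))
```

The resulting numbers would still have to be copied into the readinessProbe section of the manifest by hand. The right factor depends on how much CPU the local cluster has, which is the point made above.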
I tried in Azure AKS and locally with MicroK8s; both have the same issue 😭
________________________________
From: Jef Spaleta ***@***.***>
Sent: Thursday, August 11, 2022 6:40:31 PM
To: sensu/sensu-k8s-quick-start ***@***.***>
CC: mvthul ***@***.***>; Mention ***@***.***>
Subject: Re: [sensu/sensu-k8s-quick-start] etcd context deadline exceeded - sensu backend not connecting to etcd (#9)
|
Okay, well, this isn't confined to minikube, so this needs to be reinvestigated. Azure AKS isn't a service I've tested against yet, but I'll look into it. |
@mvthul For me the context timeout exceeded messages are intermittent and aren't causing a problem for the intended purpose of kicking the tires in minikube: everything spins up and I'm able to use the Sensu dashboard. For Azure AKS, you might need to change the storage class associated with the sensu-etcd persistent volume. I don't know what storageClass options AKS has out of the gate, but you'll want a dedicated SSD for the sensu-etcd volume. |
I was experiencing the same issue on Tanzu Kubernetes. The PR seems to work as expected; I think you should merge it. |
This issue has been mentioned on Sensu Community. There might be relevant details there: https://discourse.sensu.io/t/issues-installing-sensu-6-10-on-eks/3137/2 |
I'm following the README and using all default settings, running locally on minikube. The sensu-backend pod repeatedly fails because the readiness check for the backend's /health endpoint never passes. It returns:
etcd cluster health comes back as healthy from both the etcd and the sensu-backend containers:
The following errors appear in the sensu-backend logs:
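To see what the /health endpoint returns outside of the readiness probe, the request can be repeated by hand over IPv4. This is a minimal sketch, assuming the backend API is reachable at `http://127.0.0.1:8080` from wherever the script runs (that base URL is an assumption, as is the helper name `fetch_health`); it makes no claim about what the response body looks like.

```python
import urllib.error
import urllib.request


def fetch_health(base_url="http://127.0.0.1:8080", timeout=5.0):
    """Return (status_code, body) from GET <base_url>/health, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A non-2xx response is still informative, so return it as well.
        return err.code, err.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar failures.
        return None


if __name__ == "__main__":
    print(fetch_health())
```

A `None` result means the endpoint could not be reached at all, which is a different failure from a response that reports unhealthy etcd.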