Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Connection Healthcheck option #295

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sideangleside
Copy link
Member

Based upon a few requests I have gotten, I want to propose the following patch to check the health of a Foreman installation prior to registration. The general premise being:

If the services fail the health-check, exit, and remove the katello-ca-consumer-latest package.

Questions that I have:

  • is a response time of 1000 milliseconds an appropriate amount of time to consider a service too busy to service clients
  • should we finish up Validate user credendials before trying to bootstrap. #209 in this PR? (In that PR, we were attempting to validate creds before installing the katello-ca-consumer-latest RPM, which I don't think we need to do. (assuming we clean up if the creds fail).

print_generic("Ensuring services are online and accessible on within %s milliseconds" % options.service_timeout)
url = "https://" + options.foreman_fqdn + ":" + API_PORT + "/katello/api/ping"
features = get_json(url)['services']
svc_list = ["candlepin", "pulp", "pulp_auth", "candlepin_auth", "foreman_tasks"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this list will vary between Katello versions, why not iterating over features.keys() directly?

@@ -1339,6 +1361,7 @@ def exec_service(service, command, failonerror=True):
print("PUPPET CA PORT - %s" % options.puppet_ca_port)
print("IGNORE REGISTRATION FAILURES - %s" % options.ignore_registration_failures)
print("PRESERVE RHSM PROXY CONFIGURATION - %s" % options.preserve_rhsm_proxy)
print("SERVICE TIMEOUT - %s" % options.service_timeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KATELLO SERVICE TIMEOUT

url = "https://" + options.foreman_fqdn + ":" + API_PORT + "/katello/api/ping"
features = get_json(url)['services']
svc_list = ["candlepin", "pulp", "pulp_auth", "candlepin_auth", "foreman_tasks"]
for service in svc_list:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the next 10 lines are pretty repetitive, how about:

    for service in svc_list:
        service_error = None
        if features[service]['status'] != 'ok':
            service_error = "service %s reports a state of %s" % (service, features[service]['status'])
        elif int(features[service]['duration_ms']) >= options.service_timeout:
            service_error = "service %s reponds in %s ms which exceeds maximum of %s ms" % (service, features[service]['duration_ms'], str(options.service_timeout))
        if service_error:
            print_error("Error detected in the Service Health Check. Cleaning up install")
            clean_katello_agent()
            print_error(service_error)
            sys.exit(3)

@evgeni
Copy link
Member

evgeni commented Feb 19, 2019

I think 1000ms is enough, but I've long not seen heavy used katello installs 🤷‍♀️

One thing that bothers me with this setup: we do changes to the bootstraped system, before telling "oh, no, we won't bootstrap today". And while we do try to revert those changes (you clean the ca rpm), the system is not (might not) be bit-identical as before. So yes, I think piggy-backing this ontop of #209 and making it not even install the CA RPM would be awesome.

(My worry case is when migrating from one Katello to another, we already removed the old CA rpm, but we cannot reliably restore it and thus leave the machine without updates)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants