Unhandled wsgiref exception in log #397

Closed
Charles1000Chen opened this issue Oct 26, 2023 · 7 comments · Fixed by #413
Labels: area: code, resolution: mitigated (Temporary solution), type: bug (Something isn't working)

Comments

@Charles1000Chen
Contributor

Describe the bug
The following error from the wsgiref package in the ZHMC Prometheus exporter log is unhandled:

Traceback (most recent call last):
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 138, in run
    self.finish_response()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 184, in finish_response
    self.write(data)
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 288, in write
    self.send_headers()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 346, in send_headers
    self.send_preamble()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 268, in send_preamble
    self._write(
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 467, in _write
    result = self.stdout.write(data)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
  File "/usr/lib/python3.10/ssl.py", line 1237, in sendall
    v = self.send(byte_view[count:])
  File "/usr/lib/python3.10/ssl.py", line 1206, in send
    return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2426)

Expected behavior
The wsgiref exception should be caught by the zhmc prometheus exporter, which should then output an understandable error message.

To Reproduce
<-- Describe the steps to reproduce the behavior. -->

Environment information

  • Output of zhmc_prometheus_exporter --version:
    1.5.0.dev1
  • HMC version:
    N/A

Command output
<-- Relevant parts of the command output. If possible, with '-vv'. -->

Log file
<-- If possible, attach a log file generated with '--log-comp all=debug --log exporter.log'. -->

@Charles1000Chen
Contributor Author

The SSLEOFError exception needs to be handled when calling the "start_http_server" function.

@andy-maier andy-maier self-assigned this Nov 9, 2023
@andy-maier andy-maier added type: bug Something isn't working area: code labels Nov 9, 2023
@andy-maier andy-maier added this to the 1.5.0 milestone Nov 9, 2023
@andy-maier
Member

@Charles1000Chen The code already catches ssl.SSLError when raised by the "start_http_server" function, because it is a subclass of IOError: https://github.com/zhmcclient/zhmc-prometheus-exporter/blob/master/zhmc_prometheus_exporter/zhmc_prometheus_exporter.py#L1871

I think what probably happens is that the ssl exception is raised in the thread that is started.
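
The two points above can be illustrated with a minimal standard-library sketch. It is not the exporter's code: `serve`, `hook`, `caught_in_main` and `seen_in_thread` are hypothetical names, and the thread only stands in for whatever thread the HTTPS server runs its requests in. It shows that `ssl.SSLEOFError` is covered by `except IOError`, and that a `try`/`except` around starting a thread does not see exceptions raised inside that thread.

```python
import ssl
import threading

# ssl.SSLEOFError derives from ssl.SSLError, which derives from OSError.
# In Python 3, IOError is just an alias of OSError.
print(issubclass(ssl.SSLEOFError, IOError))


def serve():
    # Hypothetical stand-in for per-request work done in a server thread.
    raise ssl.SSLEOFError("EOF occurred in violation of protocol")


caught_in_main = False
seen_in_thread = []


def hook(args):
    # threading.excepthook (Python 3.8+) receives exceptions that escape a thread.
    seen_in_thread.append(args.exc_type)


previous_hook = threading.excepthook
threading.excepthook = hook
try:
    worker = threading.Thread(target=serve)
    worker.start()
    worker.join()
except IOError:
    # Not reached: the exception stays inside the worker thread.
    caught_in_main = True
finally:
    threading.excepthook = previous_hook

print(caught_in_main, seen_in_thread)
```

If the SSLEOFError in the log comes from such a thread, the handler around start_http_server() would never get to see it, which is consistent with the traceback ending up in the log unhandled.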

@andy-maier
Member

andy-maier commented Nov 14, 2023

I put up PR #413. In the HTTPS case it simplifies the error message but keeps catching IOError around the call to start_http_server(), because that also catches any ssl.SSLError exceptions (I verified that). This resulted in the following (properly caught) error messages for a few selected error situations:

  • invalid format in client CA certificate file:
    Error: Cannot start HTTPS server: SSLError: ("Cannot load CA certificate chain from file 'myconfig/../../certs/ibmca_bundle.pem' or directory None: [X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4147)",)
  • truncated client CA certificate file:
    Error: Cannot start HTTPS server: SSLError: ("Cannot load CA certificate chain from file 'myconfig/../../certs/ibmca_bundle.pem' or directory None: [X509] PEM lib (_ssl.c:4147)",)
  • non-existing client CA certificate file:
    Error: Cannot start HTTPS server: FileNotFoundError: Cannot load CA certificate chain from file 'myconfig/../../certs/ibmca_bundlexx.pem' or directory None: [Errno 2] No such file or directory
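
The general shape of that handling can be sketched as follows. This is only an illustration under assumptions, not the code from PR #413: `start_or_report` and its `start_server` argument are hypothetical, with `start_server` standing in for the real call to start_http_server(), and the message format is modeled on the messages listed above.

```python
import ssl


def start_or_report(start_server):
    """Call start_server() and return an error message instead of raising.

    `start_server` is a hypothetical zero-argument callable that stands in
    for the actual call to start_http_server(...).
    """
    try:
        start_server()
    except IOError as exc:
        # IOError is an alias of OSError, so this also covers ssl.SSLError
        # and FileNotFoundError.
        return (f"Error: Cannot start HTTPS server: "
                f"{exc.__class__.__name__}: {exc}")
    return None


def missing_file():
    # Simulates a CA certificate file that does not exist.
    raise FileNotFoundError(2, "No such file or directory")


def broken_tls():
    # Simulates an SSL error raised during server startup.
    raise ssl.SSLEOFError("EOF occurred in violation of protocol")


print(start_or_report(missing_file))
print(start_or_report(broken_tls))
```

Such a wrapper only helps for errors raised while the server is being started, not for errors raised later while it serves requests.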

Note that the error you reported is not part of these tests. I suspect that reproducing it would require cutting off the network between Prometheus and the zhmc exporter during the TLS handshake, which is hard for me to set up.

I think the improvement in PR #413 is as much as we can do in the zhmc exporter, because the handling of exceptions raised by the HTTP/HTTPS server while it runs would need to be done by the Python Prometheus client code, and not by the zhmc exporter code.

If you have indications that the above is incorrect, please let me know.

@Charles1000Chen
Contributor Author

@andy-maier I basically agree with you. Currently, the error shows up in my tests every once in a while. Could we catch Exception as well and include the update in version 1.5.0b2, so that I can test it in my environment and see whether it makes any difference?

@andy-maier
Member

@Charles1000Chen Yes, I can widen the caught exception type from IOError to Exception to make sure we catch everything, and build a new beta version.

@andy-maier
Member

andy-maier commented Nov 15, 2023

Beta version 1.5.0b3 has been released, with the change to the exception handling.

The exception handling change is only there to make doubly sure that ssl.SSLEOFError gets handled if it is raised in the call to start_http_server().

I'll leave this issue open, since it is not solved yet. Once the exception reoccurs we will investigate what else can be done.

@andy-maier
Member

I am closing this issue and will release the official 1.5.0 version today. If the problem reoccurs, please reopen this issue or open a new one.
