Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove InMemoryCollector from liveness check on shutdown #1349

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

VinozzZ
Copy link
Contributor

@VinozzZ VinozzZ commented Sep 23, 2024

Which problem is this PR solving?

related to #1348

Currently, the collector's timeout for healthcheck is set to 3 seconds. During a shutdown, there could be a large number of traces or spans that need to be forwarded to peers. If this is taking longer than 3 seconds, it will definitely timeout the liveness check.

Since we are shutting down anyway, just let the ShutdownDelay to do the work and not worry about the liveness.

Short description of the changes

  • Unregister collector from Refinery's liveness check

@VinozzZ VinozzZ requested a review from a team as a code owner September 23, 2024 15:59
Copy link
Contributor

@kentquirk kentquirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea, but I will also note that I've seen the collector log message when we weren't shutting down -- so there's a problem somewhere that it can exit while holding the lock, without causing a stack trace.

@VinozzZ
Copy link
Contributor Author

VinozzZ commented Sep 23, 2024

I added my theory in the issue itself: #1348 (comment)

@VinozzZ VinozzZ merged commit 398f615 into main Sep 23, 2024
6 checks passed
@VinozzZ VinozzZ deleted the yingrong.collector_liveness branch September 23, 2024 16:53
TylerHelmuth pushed a commit that referenced this pull request Oct 16, 2024
## Which problem is this PR solving?

related to #1348 

Currently, the collector's timeout for healthcheck is set to 3 seconds.
During a shutdown, there could be a large number of traces or spans that
need to be forwarded to peers. If this is taking longer than 3 seconds,
it will definitely timeout the liveness check.

Since we are shutting down anyway, just let the ShutdownDelay to do the
work and not worry about the liveness.
## Short description of the changes

- Unregister `collector` from Refinery's liveness check
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants