Less than 1% of cirrus-run invocations hang indefinitely after Cirrus CI has long finished the corresponding build. CIRRUS_TIMEOUT is eventually reached and job failure is reported.
This issue needs further investigation. Is the API server sometimes reporting an incorrect build status? Is this some kind of cache/CDN issue?
Troubleshooting is difficult because of the rarity of this failure and because most invocations of cirrus-run happen non-interactively (via another CI service, e.g. GitLab).
An observer needs to act quickly upon encountering a cirrus-run timeout:
1. Confirm that the build has in fact finished on the Cirrus CI side. A link to the build is usually printed to stdout by cirrus-run.
2. Check the API response for that particular build's status:
   - Optional: ensure that the CIRRUS_API_TOKEN environment variable is set to a correct value. Without a token only public repos will be viewable, and API rate limits will probably be stricter.
   - Execute make debug/build_status DEBUG_BUILD_ID=5735044040884224 from the repo top-level directory (replace the number with your build ID).
3. Report the output here. If the script just keeps repeating the same "EXECUTING" status, you can interrupt it with Ctrl+C or keep it running to see when/if it fails.
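For anyone who cannot run the make target, the same check can be approximated by hand. The sketch below is not the project's debug script. It assumes the Cirrus CI GraphQL endpoint is https://api.cirrus-ci.com/graphql, that a `build(id)` query exposes a `status` field, and that COMPLETED, FAILED, ABORTED and ERRORED are the terminal statuses. The helper names (`build_request`, `fetch_status`, `watch`) are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: public GraphQL endpoint of Cirrus CI.
API_URL = "https://api.cirrus-ci.com/graphql"
# Assumption: the schema exposes build(id) with a status field.
QUERY = "query($id: ID!) { build(id: $id) { status } }"
# Assumption: statuses after which a build no longer changes.
TERMINAL = {"COMPLETED", "FAILED", "ABORTED", "ERRORED"}


def build_request(build_id, token=None):
    """Prepare a POST request asking for the status of one build."""
    payload = json.dumps({"query": QUERY, "variables": {"id": str(build_id)}})
    headers = {"Content-Type": "application/json"}
    if token:
        # Without a token only public repos are viewable.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(API_URL, data=payload.encode(), headers=headers)


def fetch_status(build_id, token=None):
    """Send the query and return the status string reported by the API."""
    with urllib.request.urlopen(build_request(build_id, token), timeout=30) as resp:
        body = json.load(resp)
    return body["data"]["build"]["status"]


def watch(build_id, fetch=fetch_status, interval=3.0, max_polls=None, sleep=time.sleep):
    """Poll until a terminal status (or max_polls); return every status seen."""
    seen = []
    while max_polls is None or len(seen) < max_polls:
        status = fetch(build_id)
        seen.append(status)
        print(time.strftime("%H:%M:%S"), status)
        if status in TERMINAL:
            break
        sleep(interval)
    return seen
```

Calling `watch("5735044040884224")` from an interactive session prints one timestamped status per poll. To authenticate, pass `fetch=lambda b: fetch_status(b, os.environ["CIRRUS_API_TOKEN"])` after importing `os`. If the web UI shows the build as finished while this keeps printing EXECUTING, that output is what is worth reporting here.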
Looks like some kind of caching issue. I was running a lot of concurrent Cirrus CI jobs (30+) and most of them were delayed by the community cluster scheduler, so a lot of cirrus-run instances were just sitting there, each querying the API every few seconds. It appears that after some number of repeated queries the reply got cached somewhere and was not updated when the job finished successfully. Running the same query from a different host (my workstation) produced the correct result immediately.
There is no caching built into cirrus-run, so the stale response must be coming from the API itself or from some middleware in between (CDN?). I'm not sure if/how this is fixable on our side.
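One way to gather evidence for or against the middleware theory is to record the cache-related response headers alongside the stale status. The sketch below only inspects a header mapping. Whether the Cirrus CI API or anything in front of it actually sends these headers is an assumption, `X-Cache` is a non-standard header used by some CDNs, and the helper names are hypothetical.

```python
# Headers that commonly reveal an intermediate cache. Age, Cache-Control,
# ETag, Expires and Via are standard HTTP; X-Cache is a CDN convention.
CACHE_HEADERS = ("Age", "Cache-Control", "ETag", "Expires", "Via", "X-Cache")


def cache_hints(headers):
    """Return the cache-related headers present, matched case-insensitively."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in CACHE_HEADERS
        if name.lower() in lowered
    }


def looks_cached(headers):
    """Heuristic: a positive Age or an X-Cache hit suggests a cached reply."""
    hints = cache_hints(headers)
    age = hints.get("Age", "0").strip()
    if age.isdigit() and int(age) > 0:
        return True
    return "hit" in hints.get("X-Cache", "").lower()
```

With `urllib.request`, `dict(response.headers.items())` gives a mapping that can be passed to `cache_hints`. Comparing its output between the hanging CI host and a workstation that gets the correct status would show whether the two are being served different copies.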
GitLab CI log: cirrus-run hangs indefinitely. Output verbosity is set to low, unfortunately.