You will need to email Lincoln Bryant ([email protected]) or Ilija Vukotic ([email protected]) for read-only credentials to access Elasticsearch at UChicago.
This software requires both the python36-elasticsearch and python36-tabulate packages, installed either through system packages or in a virtual environment.
yum install python36-elasticsearch6
yum install python36-tabulate
python3.6 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
You will need to export the ES_USER and ES_PASS variables into your environment, set to the credentials you requested above.
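As an illustration of how a script might pick up these variables, here is a minimal sketch. It is not taken from check2.py; the function name `load_es_credentials` is hypothetical, and only the variable names ES_USER and ES_PASS come from this README.

```python
import os


def load_es_credentials(environ=os.environ):
    """Return (user, password) read from ES_USER and ES_PASS.

    Hypothetical helper, not part of check2.py. Raises RuntimeError
    naming whichever variable is unset or empty.
    """
    missing = [name for name in ("ES_USER", "ES_PASS") if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variable(s): " + ", ".join(missing))
    return environ["ES_USER"], environ["ES_PASS"]
```

Failing early with the name of the missing variable tends to be easier to debug than an authentication error from the server.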
$ ./check2.py --help
usage: FailedJobs [-h] [-s SITE] [-i IDS [IDS ...]] [-l LAST] {site,node}

Query ATLAS Analytics Elasticsearch for failed jobs

positional arguments:
  {site,node}

optional arguments:
  -h, --help            show this help message and exit
  -s SITE, --site SITE  Site, e.g. 'MWT2', 'AGLT2', 'SWT2_CPB', etc
  -i IDS [IDS ...], --ids IDS [IDS ...]
                        space separated list of PanDA job IDs
  -l LAST, --last LAST  Maximum number of hours ago. Only relevant for 'site'
                        mode
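The help text above is consistent with an argparse parser along the following lines. This is a reconstruction from the help output only, not the script's source: the destination name `mode` and `type=int` for `--last` are assumptions.

```python
import argparse


def build_parser():
    """Rebuild a parser that matches the help output shown in this README."""
    parser = argparse.ArgumentParser(
        prog="FailedJobs",
        description="Query ATLAS Analytics Elasticsearch for failed jobs",
    )
    # "mode" is a hypothetical name; the help output only shows {site,node}.
    parser.add_argument("mode", choices=["site", "node"])
    parser.add_argument("-s", "--site",
                        help="Site, e.g. 'MWT2', 'AGLT2', 'SWT2_CPB', etc")
    # nargs="+" yields the "IDS [IDS ...]" form seen in the usage line.
    parser.add_argument("-i", "--ids", nargs="+",
                        help="space separated list of PanDA job IDs")
    # type=int is an assumption; the real script may parse this differently.
    parser.add_argument("-l", "--last", type=int,
                        help="Maximum number of hours ago. Only relevant for 'site' mode")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["site", "-s", "MWT2", "-l", "1"])
    print(args.mode, args.site, args.last)
```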
The site mode looks at failures across the entire site for a given number of hours before the current time.
$ ./check2.py site -s MWT2 -l 1
pandaid                                          batchid                                  modificationhost                                 piloterrorcode
-----------------------------------------------  ---------------------------------------  ---------------------------------------------  ----------------
https://bigpanda.cern.ch/job?pandaid=6069497346  iut2-gk02.mwt2.org#9398499.0#1704436223  [email protected]                                             0
https://bigpanda.cern.ch/job?pandaid=6069648316  uiuc-gk02.mwt2.org#1418461.0#1704446683  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069652861  uct2-gk02.mwt2.org#8772333.0#1704447294  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069287664  iut2-gk02.mwt2.org#9395973.0#1704414228  [email protected]                                             0
https://bigpanda.cern.ch/job?pandaid=6069640354  iut2-gk02.mwt2.org#9400250.0#1704445812  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069633343  uct2-gk02.mwt2.org#8771860.0#1704445805  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069625491  uiuc-gk02.mwt2.org#1418224.0#1704444378  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069608909  uct2-gk02.mwt2.org#8771359.0#1704441618  [email protected]                                          1305
https://bigpanda.cern.ch/job?pandaid=6069633964  uct2-gk.mwt2.org#8739758.0#1704434949    [email protected]                                          1098
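To make the "given number of hours before the current time" window concrete, the sketch below builds an Elasticsearch query body with a relative date range. It is only an illustration under stated assumptions: `jobstatus` appears as a column in this README's output, but the field names `computingsite` and `modificationtime` and the function `failed_jobs_query` are hypothetical, and the real script may structure its query differently.

```python
def failed_jobs_query(site, last_hours):
    """Build a query body for failed jobs at `site` within the last N hours.

    Field names other than "jobstatus" are assumptions, not taken from
    check2.py.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    # Assumed field holding the site name, e.g. "MWT2".
                    {"term": {"computingsite": site}},
                    {"term": {"jobstatus": "failed"}},
                    # Elasticsearch date math: "now-1h" means one hour ago.
                    {"range": {"modificationtime": {"gte": f"now-{last_hours}h"}}},
                ]
            }
        }
    }
```

For `-s MWT2 -l 1` this would produce a range filter of `now-1h`, so the server resolves the time window relative to its own clock.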
The node mode will attempt to locate the HTCondor EXECUTE_DIR directory and grep the job's stdout for the Pilot ID for each job running at the site.
Otherwise, PanDA job IDs can be supplied via the -i or --ids flags.
$ ./check2.py node -i 6069633343
pandaid                                          modificationhost                                 piloterrorcode  jobstatus
-----------------------------------------------  ---------------------------------------------  ----------------  -----------
https://bigpanda.cern.ch/job?pandaid=6069633343  [email protected]                                          1305  failed
NOTE: This mode requires sudo, as it needs to read files in the directories of batch system users.
(venv) [10:01] uct2-c578.mwt2.org:~/failed-jobs-py $ sudo ./check2.py node
pandaid                                          modificationhost               piloterrorcode  jobstatus
-----------------------------------------------  ---------------------------  ----------------  -----------
https://bigpanda.cern.ch/job?pandaid=6069660793  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069661283  [email protected]                        1305  failed
https://bigpanda.cern.ch/job?pandaid=6069687372  [email protected]                        1305  failed
https://bigpanda.cern.ch/job?pandaid=6069630217  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069115751  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069687373  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069700489  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069659220  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069112758  [email protected]                        1305  failed
https://bigpanda.cern.ch/job?pandaid=6069242358  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069661442  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069630218  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069630219  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069115749  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069661445  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069661490  [email protected]                           0  failed
https://bigpanda.cern.ch/job?pandaid=6069242357  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069242356  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069112757  [email protected]                        1305  failed
https://bigpanda.cern.ch/job?pandaid=6069660442  [email protected]                        1098  failed
https://bigpanda.cern.ch/job?pandaid=6069661446  [email protected]                           0  failed
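The discovery step of the node mode (walking the HTCondor execute directory and searching each job's stdout for an ID) could look roughly like the sketch below. This is not the code in check2.py: the function `find_job_ids`, the stdout file name `_condor_stdout`, and the `PandaID=<digits>` pattern are all assumptions chosen for illustration, and the actual log format may differ.

```python
import os
import re

# Hypothetical pattern: assumes the ID is logged as "PandaID=<digits>".
JOB_ID_RE = re.compile(r"PandaID=(\d+)")


def find_job_ids(execute_dir, stdout_name="_condor_stdout"):
    """Collect IDs found in the stdout file of each job directory.

    `stdout_name` is an assumed file name; adjust it to whatever the
    batch system actually writes on the worker node.
    """
    ids = set()
    for root, _dirs, files in os.walk(execute_dir):
        if stdout_name not in files:
            continue
        path = os.path.join(root, stdout_name)
        try:
            with open(path, errors="replace") as handle:
                for line in handle:
                    ids.update(JOB_ID_RE.findall(line))
        except OSError:
            # Typically a permission error when not run with sudo; skip.
            continue
    return sorted(ids)
```

Skipping unreadable files rather than aborting means an unprivileged run returns a partial list, which is why running under sudo matters for complete results.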