"Failure updating submission data" on both public and private queue #1471

Closed
johanneskruse opened this issue Jun 7, 2024 · 4 comments

Comments

@johanneskruse

Hi,

I'm running a competition.

I started encountering the following error on my remote workers:

/usr/local/lib/python3.8/site-packages/celery/platforms.py:800: RuntimeWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

  warnings.warn(RuntimeWarning(ROOT_DISCOURAGED.format(
 
 -------------- compute-worker@63e894b26d8f v4.4.0 (cliffs)
--- ***** ----- 
-- ******* ---- Linux-5.15.0-1062-aws-x86_64-with-glibc2.34 2024-06-07 05:39:21
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         __main__:0x7f2131447a30
- ** ---------- .> transport:   amqp://63a35e45-cb28-4eed-9c2c-af8072bf9d9c:**@www.codabench.org:5672/572d5689-cb2d-4da9-a09c-cb9a0b0284ef
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> compute-worker   exchange=compute-worker(direct) key=compute-worker
                

[tasks]
  . compute_worker_run

[2024-06-07 05:39:21,941: INFO/MainProcess] Connected to amqp://63a35e45-cb28-4eed-9c2c-af8072bf9d9c:**@www.codabench.org:5672/572d5689-cb2d-4da9-a09c-cb9a0b0284ef
[2024-06-07 05:39:22,114: INFO/MainProcess] mingle: searching for neighbors
[2024-06-07 05:39:23,500: INFO/MainProcess] mingle: all alone
[2024-06-07 05:39:23,859: INFO/MainProcess] compute-worker@63e894b26d8f ready.
[2024-06-07 05:39:23,860: INFO/MainProcess] Received task: compute_worker_run[1acefb0a-613c-407e-97f1-3cf154e246ae]  
[2024-06-07 05:39:23,983: INFO/ForkPoolWorker-1] Received run arguments: {'user_pk': 6655, 'submissions_api_url': 'https://www.codabench.org/api', 'secret': 'c5a0eb20-fff9-44e7-b564-249169646bb4', 'docker_image': 'codalab/codalab-legacy:py39', 'execution_time_limit': 172800, 'id': 68433, 'is_scoring': False, 'prediction_result': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/prediction_result/2024-06-07-1717738652/a3392ac9edd0/prediction_result.zip?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=vF9HvMIWWGTJm3pjWiJ9L7fZ210%3D&content-type=application%2Fzip&Expires=1717825053', 'input_data': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/dataset/2024-04-04-1712207092/b24e0df11261/input_data.zip?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=XKd7EXpMFP6peWuiQrqohb7b7YE%3D&Expires=1717825053', 'ingestion_only_during_scoring': False, 'program_data': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/dataset/2024-06-06-1717708585/bd4f130fdb04/random_ranking.zip?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=zjUjb3MaPYOtogB39Wbm34wLc3U%3D&Expires=1717825053', 'prediction_stdout': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/submission_details/2024-06-07-1717738653/b1943afa9ea7/prediction_stdout.txt?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=OcT60osyU%2FCvCm928PVqGvdO6%2B8%3D&content-type=application%2Fzip&Expires=1717825053', 'prediction_stderr': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/submission_details/2024-06-07-1717738653/8b8b1af78be3/prediction_stderr.txt?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=u82ROiOKQ4QMEn4pc7BE%2BGr9bro%3D&content-type=application%2Fzip&Expires=1717825053', 'prediction_ingestion_stdout': 
'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/submission_details/2024-06-07-1717738653/9624471121ec/prediction_ingestion_stdout.txt?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=V6E54npsbPNMXuI6Q46heMfWVv0%3D&content-type=application%2Fzip&Expires=1717825053', 'prediction_ingestion_stderr': 'https://miniodis-rproxy.lisn.upsaclay.fr/coda-v2-prod-private/submission_details/2024-06-07-1717738653/21a770e56477/prediction_ingestion_stderr.txt?AWSAccessKeyId=EASNOMJFX9QFW4QIY4SL&Signature=O6923QJ%2FqINgdeGyL503Fos3kUA%3D&content-type=application%2Fzip&Expires=1717825053'}
[2024-06-07 05:39:23,984: INFO/ForkPoolWorker-1] Updating submission @ https://www.codabench.org/api/submissions/68433/ with data = {'status': 'Preparing', 'status_details': None, 'secret': 'c5a0eb20-fff9-44e7-b564-249169646bb4'}
[2024-06-07 05:39:24,325: INFO/ForkPoolWorker-1] Submission patch failed with status = 500, and response = 
b'<h1>Server Error (500)</h1>'
[2024-06-07 05:39:24,326: INFO/ForkPoolWorker-1] Updating submission @ https://www.codabench.org/api/submissions/68433/ with data = {'status': 'Failed', 'status_details': 'Failure updating submission data.', 'secret': 'c5a0eb20-fff9-44e7-b564-249169646bb4'}
[2024-06-07 05:39:24,430: INFO/ForkPoolWorker-1] Submission patch failed with status = 500, and response = 
b'<h1>Server Error (500)</h1>'
[2024-06-07 05:39:24,431: INFO/ForkPoolWorker-1] Destroying submission temp dir: /codabench/tmprkx556ah
[2024-06-07 05:39:24,432: ERROR/ForkPoolWorker-1] Task compute_worker_run[1acefb0a-613c-407e-97f1-3cf154e246ae] raised unexpected: SubmissionException('Failure updating submission data.')
Traceback (most recent call last):
  File "/compute_worker.py", line 115, in run_wrapper
    run.prepare()
  File "/compute_worker.py", line 764, in prepare
    self._update_status(STATUS_PREPARING)
  File "/compute_worker.py", line 359, in _update_status
    self._update_submission(data)
  File "/compute_worker.py", line 342, in _update_submission
    raise SubmissionException("Failure updating submission data.")
compute_worker.SubmissionException: Failure updating submission data.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/compute_worker.py", line 123, in run_wrapper
    run._update_status(STATUS_FAILED, str(e))
  File "/compute_worker.py", line 359, in _update_status
    self._update_submission(data)
  File "/compute_worker.py", line 342, in _update_submission
    raise SubmissionException("Failure updating submission data.")
compute_worker.SubmissionException: Failure updating submission data.

The submission status goes from Submitting to Submitted.
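For context, the failure path visible in the log and traceback above can be sketched roughly as follows. This is not the actual `compute_worker.py` code: the `update_submission` helper, the injectable `patch` callable, the `retries` and `base_delay` parameters, and the "only HTTP 200 counts as success" rule are all assumptions made for illustration. The sketch also adds a retry with exponential backoff on 5xx responses, which the log does not show the worker doing, as one possible way to ride out a short server-side outage.

```python
import logging
import time
from typing import Callable, Dict, Tuple

logger = logging.getLogger(__name__)

# Hypothetical transport signature: (url, payload) -> (status_code, body).
Patch = Callable[[str, Dict], Tuple[int, bytes]]


class SubmissionException(Exception):
    """Raised when the submission record cannot be updated."""


def update_submission(
    patch: Patch,
    url: str,
    data: Dict,
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """PATCH `data` to `url`, retrying 5xx responses with exponential backoff."""
    for attempt in range(retries + 1):
        status, body = patch(url, data)
        if status == 200:  # assumption: only 200 is treated as success
            return
        logger.info(
            "Submission patch failed with status = %s, and response = %r",
            status,
            body,
        )
        # A 4xx response is a client-side problem; resending the same
        # payload is unlikely to help, so give up immediately.
        if status < 500 or attempt == retries:
            break
        sleep(base_delay * 2 ** attempt)
    raise SubmissionException("Failure updating submission data.")
```

In a real worker, `patch` might be a thin wrapper around `requests.patch` that returns the response's status code and content. Retrying would only hide a brief outage; a persistent 500 like the one in this report still has to be fixed on the server.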

I tried:

  • Restarting everything
  • Creating a new queue
  • Using the public queue

None of the above worked. Is there an explanation for or a solution to this problem?

Best,
Johannes

@johanneskruse
Author

This error was also seen in #1446

@ObadaS
Collaborator

ObadaS commented Jun 7, 2024

@johanneskruse The problem should be fixed for now.

@johanneskruse
Author

Thank you for the quick action. It seems to be working again!

@Didayolo
Member

Didayolo commented Jun 7, 2024

I'm closing this issue, then. We still need to find a long-term solution to this problem, but we'll keep track of it in #1446.
