Resource state mismatches between managers and agents can happen naturally. However, the resulting InsufficientResource errors lead to immediate session cancellation without any “sync and retry” attempt.
Objective
When InsufficientResource errors occur, managers should sync resource states with agents and retry session creation according to configured retry policies
Implement retry policies that specify intervals, maximum attempts, and whether to enqueue the session or retry creating the kernel(s) with the same agent
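The objective above could be sketched roughly as follows. This is only an illustration, not the manager's actual code: `RetryPolicy`, `RetryMode`, `create_with_sync_and_retry`, and the two callbacks are hypothetical names, and the `InsufficientResource` class here is a local stand-in for the real error. It also assumes that the "enqueue" mode hands the session back to the scheduler right after the first sync.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable


class InsufficientResource(Exception):
    """Hypothetical stand-in for the error raised during kernel creation."""


class RetryMode(Enum):
    SAME_AGENT = "same-agent"  # retry kernel creation on the same agent
    ENQUEUE = "enqueue"        # put the session back into the pending queue


@dataclass(frozen=True)
class RetryPolicy:
    interval: float = 1.0      # seconds to wait between attempts
    max_attempts: int = 3      # total creation attempts, including the first
    mode: RetryMode = RetryMode.SAME_AGENT


async def create_with_sync_and_retry(
    create_kernels: Callable[[], Awaitable[None]],
    sync_resources: Callable[[], Awaitable[None]],
    policy: RetryPolicy,
) -> str:
    """Return "created", "enqueued", or "cancelled"."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            await create_kernels()
            return "created"
        except InsufficientResource:
            # Refresh the manager's view before deciding what to do next.
            await sync_resources()
            if policy.mode is RetryMode.ENQUEUE:
                return "enqueued"
            if attempt < policy.max_attempts:
                await asyncio.sleep(policy.interval)
    # Every attempt failed even after syncing; only now give up.
    return "cancelled"
```

In this sketch cancellation becomes the last resort instead of the first reaction: a session is cancelled only after `max_attempts` creation attempts, each preceded by a resource sync.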
Expected Sub Issue
Refactor manager's error handler to detect InsufficientResource errors
Implement a sync API that resolves resource mismatches between managers and agents
Add configuration options for retry policies (intervals, max attempts, etc.)
Implement manager-side APIs for retry policy configuration
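As a rough illustration of what the sync step might compute, the sketch below compares the manager's recorded slot occupancy with what an agent reports. The function names, the slot names, and the choice to treat the agent's report as the source of truth are all assumptions made for this example, not part of the existing codebase.

```python
from decimal import Decimal
from typing import Dict, Mapping, Tuple


def diff_occupied_slots(
    manager_view: Mapping[str, Decimal],
    agent_view: Mapping[str, Decimal],
) -> Dict[str, Tuple[Decimal, Decimal]]:
    """Map each mismatched slot name to (manager value, agent value)."""
    mismatches: Dict[str, Tuple[Decimal, Decimal]] = {}
    for name in sorted(set(manager_view) | set(agent_view)):
        # A slot missing from one side is treated as zero occupancy there.
        recorded = manager_view.get(name, Decimal(0))
        reported = agent_view.get(name, Decimal(0))
        if recorded != reported:
            mismatches[name] = (recorded, reported)
    return mismatches


def resolve_occupied_slots(
    manager_view: Mapping[str, Decimal],
    agent_view: Mapping[str, Decimal],
) -> Dict[str, Decimal]:
    """Return a corrected manager view, assuming the agent is authoritative."""
    corrected = dict(manager_view)
    for name, (_, reported) in diff_occupied_slots(manager_view, agent_view).items():
        corrected[name] = reported
    return corrected
```

Returning the mismatches separately from the corrected view keeps the difference available for logging before the manager's state is overwritten.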
fregataa changed the title on Jan 22, 2025:
Old: Fix session creation failures due to resource handling issues
New: Implement a "sync and retry" mechanism to handle manager-agent resource mismatches
Motivation
InsufficientResource errors occur in the CREATING phase of session creation, while creating containers in agents. This error may result from resource state mismatches between managers and agents.