Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.5.0] Distinguish between input and output prefetching in request metadata #25057

Open
wants to merge 1 commit into
base: release-7.5.0
Choose a base branch
from

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Jan 24, 2025

ea4ad30 added an action as context for prefetches with BwoB, but assumed that all callers would use the prefetcher for inputs to locally executed actions. However, since then, it has also been used to download outputs of completed actions that have been explicitly requested (e.g. as outputs of top-level targets or those matching the regex path patterns). This resulted in downloads with a prefetcher action ID and action details for the action that produced the file, rather than consumed it, resulting in confusing situations for observability tools.

This is fixed by separately tracking the reason for the fetch. Using an action ID of output when the action has the requested file as an output, and "input" when the action has the requested file as an input.

Closes #25040.

PiperOrigin-RevId: 719246746
Change-Id: Ib95ff65ba68112b1a38ab3022e5b1a19ef74cc9f
(cherry picked from commit 998e762)

RELNOTES: CAS requests made when Bazel downloads a blob with Build without the Bytes enabled now provide metadata with an action ID of input if the blob is downloaded as the input to a local action and output if it is a requested action output.

ea4ad30 added an action as context for prefetches with BwoB, but assumed that all callers would use the prefetcher for inputs to locally executed actions. However, since then, it has also been used to download outputs of completed actions that have been explicitly requested (e.g. as outputs of top-level targets or those matching the regex path patterns). This resulted in downloads with a `prefetcher` action ID and action details for the action that produced the file, rather than consumed it, resulting in confusing situations for observability tools.

This is fixed by separately tracking the reason for the fetch. Using an action ID of `output` when the action has the requested file as an output, and "input" when the action has the requested file as an input.

Closes bazelbuild#25040.

PiperOrigin-RevId: 719246746
Change-Id: Ib95ff65ba68112b1a38ab3022e5b1a19ef74cc9f
(cherry picked from commit 998e762)
@fmeum fmeum requested a review from a team as a code owner January 24, 2025 18:48
@fmeum fmeum changed the title Distinguish between input and output prefetching in request metadata [7.5.0] Distinguish between input and output prefetching in request metadata Jan 24, 2025
@github-actions github-actions bot added awaiting-review PR is awaiting review from an assigned reviewer team-Performance Issues for Performance teams team-Remote-Exec Issues and PRs for the Execution (Remote) team labels Jan 24, 2025
@iancha1992 iancha1992 enabled auto-merge January 24, 2025 19:12
@Wyverald
Copy link
Member

Could you please add the relevant RELNOTES for this? Especially since I saw in the original PR that this is technically a breaking change (I assume we're aware of that and still think it's worth cherry-picking).

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 24, 2025

The change during import made this technically breaking, but as it was the metadata was actually misleading, so I think this is still an improvement. Added RELNOTES.

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 24, 2025

We could make this non-breaking by replacing input with prefetcher in the cherry-picks.

@coeuvre What do you think, do the backwards compatibility guarantees extend this far?

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 24, 2025

That said, since consumers would still need to support 7.4.0 and earlier, which use prefetcher in cases where they don't prefetch, it's probably still better to use entirely new and precise terms. Existing systems must already deal with arbitrary non-digest action IDs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Performance Issues for Performance teams team-Remote-Exec Issues and PRs for the Execution (Remote) team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants