-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1529990 - Fix Mobile bustage #313
Bug 1529990 - Fix Mobile bustage #313
Conversation
535df6e
to
4fdb421
Compare
|
||
""" | ||
if isinstance(url, str): | ||
return url.startswith('https://github.com/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we also need to support the [email protected]
ssh urls. This should work if we don't. I'm fine with addressing this in a followup if we need to support ssh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of not, I'm not aware of the usage of [email protected]
nor ssh://
within chain of trust. Let's add it whenever the need comes up, I believe
to keep this. | ||
|
||
This checks for the following things:: | ||
|
||
* ``task.extra.env.tasks_for`` == "github-pull-request" | ||
* ``task.payload.env.MOBILE_HEAD_REPOSITORY`` doesn't come from an official repo | ||
* ``task.metadata.source`` doesn't come from an official repo, either | ||
* The last 2 items are landed on the official repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good check. A thought: we may want to look into restricting by branch as well (must have landed on official release branches on the official repo). I'm aware that may be difficult -- we may have to restrict it to the last N commits on a branch. If this is doable and wanted, this is probably a followup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. This implementation paves the way to #273. We just need to parse the bits of HTML returned by the branch_commits
endpoint and we can know what branch the commit lives on.
I'm not sure how we can check if it's part of the last N commits of a branch, yet. The real github APIv3 may have something already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I thought it might be easier to detect the last N commits of a branch, rather than just knowing which branch(es) a commit is on. If the latter is easier, it's preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, I thought there was a way to get the last N commits of a branch, but this doesn't seem to be exposed after all: https://developer.github.com/v3/repos/branches/
branch_commits
is our best alternative.
""" | ||
return any(( | ||
is_try(task, context.config['source_env_prefix']), | ||
is_pull_request(context, task), | ||
# XXX Cron tasks don't usually define 'taskcluster-github' as their schedulerId as they |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we use standalone taskgraph in our github repos, we can look at standardizing how the taskgraph generates cron tasks so we can detect it here.
scriptworker/github.py
Outdated
return official_repo_owner == repo_owner | ||
|
||
|
||
def _is_git_full_hash(commit_hash): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since the input of this function isn't always a commit hash (it may be a branch or a tag), perhaps this should be called "revision" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 :)
""" | ||
# Revision may be a tag name. `branch_commits` doesn't work on tags | ||
if not _is_git_full_hash(commit_hash=revision): | ||
revision = self.get_tag_hash(tag_name=revision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that in this case, revision
isn't a tag_name
but a branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, that's not a supported case. get_tag_hash()
will raise an error because the tag doesn't exist. I'm not aware of a case where we want to support it. If that comes up, I'm fine adding this support.
|
||
repo = self._github_repository.html_url | ||
|
||
url = '/'.join([repo.rstrip('/'), 'branch_commits', revision]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I probably went on the same goose chase as you:
- "The Github API probably has a way to look up if a commit is in my repository (not including PRs)"
- "No, but maybe the Github API has a way to find out what branch(es) a commit is on?"
- "Ugh, no, but this guy pointed out the semi-API/semi-HTML-y "https://github.com/:user/:repo/branch_commits/:commit"
It's crazy that that's the nicest way of doing this 😠 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's exactly what happened! 😅
|
||
def _check_github_url_is_supported(url): | ||
if not is_github_url(url): | ||
raise ValueError('"{}" is not a supported GitHub URL!'.format(url)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit that isn't relevant for this PR: I wonder if it would make sense to have earlier validation for things like this?
Currently, we don't need if
checks to assert that our variables are strings/numbers/whatever deep within our functions because we perform earlier validation of the task, and that allows the functions that operate on the task to be so much easier to read (and write).
In the same way. it could be beneficial to have validation of github urls in advance? It could make logic cleaner, since we could leverage expectations with a data type (e.g.: "my function parameter is GithubUrl
, so it better be a Github URL") rather than a function call ("what does _check_github_url_is_supported(...)
do? Oh, it can throw an error, gotcha").
This is a potentially subjective note. Not that big of a concern :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If, as you suggest, we weren't using strings everywhere (particularly if we could lean on type-checking), we could get away without these checks.
However, given that this is security sensitive code, I'd rather err on the side of having more sanity checks, rather than less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm not recommending removing these checks, but rather moving them to the entry-point of untrusted data and doing validation there.
To represent that the string has been validated as the specialized type (github url string), we could use a type at that point.
But yeah, I agree, I don't think that we should lessen our sanity checks :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on keeping this check here, for now.
scriptworker/task.py
Outdated
|
||
Args: | ||
context (scriptworker.context.Context): the scriptworker context. | ||
task (dict): the task definition to check. | ||
|
||
Returns: | ||
bool: True if it's a pull-request | ||
bool: True if it's a pull-request. False if it either comes from the official repos or if | ||
the origin can't be determined. In fact, some valid scriptworker tasks don't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this comment is cut off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Fixed.
) | ||
for repo_url, revision, can_skip in urls_revisions_and_can_skip: | ||
# XXX In the case of scriptworker tasks, neither the repo nor the revision is defined | ||
if not repo_url and can_skip: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if not repo_url
and can_skip is False
? We expect extract_github_repo_and_name(...)
to throw an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. repo_url
won't be a valid github URL and then _check_github_url_is_supported()
raises a ValueError
|
||
github_repository = GitHubRepository(repo_owner, repo_name, context.config['github_oauth_token']) | ||
conditions.append(not await github_repository.has_commit_landed_on_repository(context, revision)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
c492c70
to
df97b06
Compare
Chatted with @tomprince, yesterday on Slack: he's fine with me landing the patch. |
And add more checks to detect PRs