Headless release #5554

Open: wants to merge 9 commits into base: master

Contributor

@tom-seqera tom-seqera commented Dec 2, 2024

General design

To try to keep things understandable and maintainable, this change takes the following approach:

  • Avoid putting everything in a single giant bash script
  • Try to avoid embedded scripting in the Github Action definitions, since these are harder to test/debug/reuse
  • Separate the process into more distinct "assemble" and "deploy" phases, and only build the release artifacts once

I've also tried to make some simplifications:

  • Reduce the surface area of the project's Makefile - removing make targets which were only used for manual releases, hopefully making it a bit less confusing.
  • Remove some layers of gradle indirection - in particular, some release-related gradle tasks which were just wrapping or emulating shell scripts have been replaced by a shell script.

Usage

  1. Update the version, launcher and changelog files (for Nextflow & plugins) locally, without committing
  2. Run the make release command locally
  3. Wait for the Release workflow to complete
  4. Update the Nextflow Github release page with the release notes

Github Action

The approach is to use Github Actions for orchestration, but keep the execution details in bash scripts. The Github Actions structure (Workflow > Job > Step) gives us some flexibility for how to approach the orchestration:

Add a job to the existing build.yml workflow, or create a separate release.yml workflow

The release process is quite complicated, with multiple artifacts being deployed to multiple different locations/systems. Given that the existing build workflow is already quite long and complex, a separate release.yml workflow file is used.

A single job, or multiple jobs

Github will automatically run jobs in parallel where possible, whereas running multiple steps in parallel is a bit more awkward. Jobs can also specify dependencies on each other, which Github will use to draw a nice diagram of the workflow:

[Screenshot 2024-11-28 at 17 21 22: Github Actions workflow diagram]

Jobs are more independent than steps, meaning splitting the workflow into jobs more cleanly separates the different release tasks and the credentials/artifacts required for each one, making the process easier to understand.

This approach does require a bit more yaml boilerplate: each job needs to checkout the code, initialise tooling, etc. Structuring the workflow as a single job would remove some of the yaml boilerplate, but also removes the nice diagram and makes it a bit less clear what dependencies and credentials are required for some steps.

Some alternative Github Actions structures are shown in this draft PR and this other draft PR

NOTE: the third option (a single Github Actions job within the existing build.yml workflow) was selected.

Workflow trigger

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

I couldn't find a way to create a workflow-level filter based on the commit message, only to filter the first job. This would mean the action would still "run" on every push, but then skip all the jobs, which would create a lot of empty skipped workflows in Github.

Instead, release.yml uses a workflow_dispatch trigger which is explicitly activated by a 'Release' job in the existing build.yml when a [release] commit is detected, similar to the Wave process.
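
The commit-message check performed before dispatching the release workflow could be sketched roughly like this. This is only an illustration: the function name is_release_commit is hypothetical and not taken from this PR, and the dispatch step itself is left as a comment.

```shell
# Hypothetical helper (not from this PR): succeed only when a commit
# message contains the literal marker "[release]".
is_release_commit() {
  local message="$1"
  case "$message" in
    # The quoted part of the pattern is matched literally, so the
    # brackets are not treated as a glob character class.
    *"[release]"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use inside a CI step, assuming the script runs in a git checkout:
#   if is_release_commit "$(git log -1 --pretty=%B)"; then
#     ... trigger the release workflow here ...
#   fi
```

Keeping the check in a small function like this means it can be tried locally against any message string without pushing a commit.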

Limitations

These changes don't represent full automation of nextflow releases. This is more like a headless version of the build and deploy steps from the current manual release process. It still requires a number of manual steps, notably:

Before running make release:

  • The VERSION file must be updated manually
  • The nextflow launch script must also be manually updated to match the VERSION file
  • The changelog.txt file must be manually updated with change notes
  • For each core plugin which needs to be released:
    • its MANIFEST.MF file must be manually updated with the new version
    • its changelog.txt file must be manually updated with change notes

After the release workflow:

  • Review and publish the draft Github release - the release workflow will try to copy relevant notes for the version from changelog.txt to the automatically created draft Github release
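
As an illustration of the "copy relevant notes for the version" step, here is a minimal sketch. It assumes (this is an assumption, not something stated in this PR) that each release section in changelog.txt starts with a line of the form "<version> - <date>" and ends at the next blank line. The function name changelog_notes is hypothetical.

```shell
# Print the change notes for one version from a changelog file.
# ASSUMPTION: each release section starts with "<version> - <date>"
# and ends at the next blank line.
changelog_notes() {
  local version="$1" file="$2"
  awk -v ver="$version" '
    # Header line of the requested version: start printing after it.
    index($0, ver " - ") == 1 { printing = 1; next }
    # A blank line ends the section that is currently being printed.
    printing && /^[[:space:]]*$/ { exit }
    printing { print }
  ' "$file"
}

# Example:
#   changelog_notes "$(cat VERSION)" changelog.txt > release-notes.txt
```

If the version is not found the function prints nothing, so a caller can detect that case and leave the draft release notes empty for manual editing.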

As part of this change, jars are no longer published to maven central - only to the seqera maven repository.

The build-info.properties file contains build metadata including timestamp and commit id and is currently committed to the repo - meaning that it always contains incorrect data (ie the commit id before [release]). To improve the accuracy of the metadata in the released Nextflow jars, this release workflow re-generates the build-info file when assembling them. However, it doesn't commit it back to the repo since this would create an awkward second release commit after the [release] one.
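
For illustration, re-generating the build metadata at assemble time could look something like the sketch below. The function name write_build_info and the property keys are assumptions made for the example, not the actual keys used in build-info.properties.

```shell
# Write build metadata to a properties file.
# ASSUMPTION: the keys (version, commitId, timestamp) are illustrative only.
write_build_info() {
  local out_file="$1" version="$2" commit_id="$3"
  local timestamp
  timestamp="$(date -u +%s)"   # seconds since the epoch
  {
    printf 'version=%s\n' "$version"
    printf 'commitId=%s\n' "$commit_id"
    printf 'timestamp=%s\n' "$timestamp"
  } > "$out_file"
}

# Example (in a git checkout):
#   write_build_info build-info.properties "$(cat VERSION)" "$(git rev-parse --short HEAD)"
```

Because the file is written as part of assembling the artifacts and never committed, the commit id it records is the [release] commit itself.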

The existing docker image build downloads the Nextflow runtime using the launcher rather than copying in the assembled artifacts. I'm not sure why this is, so have left it unchanged for now - but it does currently impose a strict ordering requirement between uploading the jars to S3 and building the docker image.

Failure recovery

Rather than implement complex retry/force logic in the workflow or try to anticipate all the possible failure scenarios, an additional strength of using small individual shell scripts for each task is that any specific failure can be debugged and re-run manually as needed. In order to avoid rebuilding the artifacts, they can be downloaded from the Github workflow summary page.

This should be especially useful during the initial transition from manual to automated release while we iron out kinks in the process.

Configuration

This release workflow requires a number of repository variables and secrets to be configured. Although some of them seem to duplicate existing secrets used by the CI build (eg AWS_ACCESS_KEY_ID vs AWS_DEPLOY_ACCESS_KEY_ID), my recommendation would be to use different variables/secrets to better control the different permissions required for a CI build vs a release. Different Github environments could also be used on the project to restrict the release workflow and credentials to specific branches.

| Type   | Name                         | Notes |
|--------|------------------------------|-------|
| secret | AWS_DEPLOY_ACCESS_KEY_ID     | |
| secret | AWS_DEPLOY_SECRET_ACCESS_KEY | |
| secret | DEPLOY_GITHUB_TOKEN          | Distinct from AUTOMATION_GITHUB_TOKEN to allow separate permissions (see below) |
| secret | SEQERA_PUBLIC_CR_PASSWORD    | |
| secret | SEQERA_PUBLIC_CR_USER        | |
| var    | DEPLOY_GITHUB_EMAIL          | Email address used for publishing plugins to github |
| var    | DEPLOY_GITHUB_USER           | Github username for publishing plugins to github |
| var    | MAVEN_PLUGINS_PUBLISH_URL    | S3 bucket for publishing plugins jars as maven artifacts |
| var    | MAVEN_PUBLISH_URL            | S3 bucket for publishing nextflow jars as maven artifacts |
| var    | PLUGINS_GITHUB_ORG           | Name of github organisation which hosts the plugin projects (ie nextflow-io) |
| var    | PLUGINS_INDEX_JSON           | Url location of the plugins.json metadata index file |
| var    | S3_RELEASE_BUCKET            | S3 bucket which hosts the nextflow launcher script and binaries |
| var    | SEQERA_PUBLIC_CR_URL         | Location of the seqera container registry |
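
Since each release script depends on several of these values, a small guard at the top of a script can fail fast with a clear message when one is missing. This is a sketch only; the helper name require_env is hypothetical and not part of this PR.

```shell
# Report every named variable that is unset or empty; fail if any is missing.
# The names passed in are expected to be a fixed, trusted list.
require_env() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup of the variable called $name (portable across shells).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, at the top of a script that uploads to S3:
#   require_env S3_RELEASE_BUCKET AWS_DEPLOY_ACCESS_KEY_ID AWS_DEPLOY_SECRET_ACCESS_KEY || exit 1
```

Reporting all missing names in one pass, rather than stopping at the first, saves a few round trips when configuring the repository for the first time.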

Github token

The token provided for DEPLOY_GITHUB_TOKEN should have the following permissions:

  • Contents: read & write
  • Metadata: read only

On the following projects:

  • nextflow-io/nextflow
  • nextflow-io/plugins
  • nextflow-io/nf-amazon
  • nextflow-io/nf-azure
  • nextflow-io/nf-cloudcache
  • nextflow-io/nf-codecommit
  • nextflow-io/nf-console
  • nextflow-io/nf-google
  • nextflow-io/nf-tower
  • nextflow-io/nf-wave

Future enhancements

  • Add some validation to the make release script to check that the versions match each other, etc
  • Replace the remaining manual steps (like updating version files and changelogs) with improved automation
  • Remove build-info.properties from the repo, instead only generating during release and not committing
  • Move the creation of the [release] commit inside the release workflow, fully automating the process
  • Create automated frequent unstable preview builds for testing (eg nightly or weekly)
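
As a possible starting point for the version validation mentioned in the first item, here is a minimal sketch. It assumes (and this is only an assumption about the launcher's layout) that the launcher script sets its version on a line starting with "NXF_VER=" and wraps the value in single quotes. The function name versions_match is hypothetical.

```shell
# Succeed when the launcher script pins the same version as the VERSION file.
# ASSUMPTION: the launcher sets its version on a line starting with "NXF_VER="
# and wraps the value in single quotes.
versions_match() {
  local version_file="$1" launcher_file="$2" version
  # Strip the trailing newline and any stray whitespace from the VERSION file.
  version="$(tr -d '[:space:]' < "$version_file")"
  [ -n "$version" ] || return 1
  # Matching the quotes too avoids "24.10.0" matching "24.10.0-edge".
  grep '^NXF_VER=' "$launcher_file" | grep -qF "'${version}'"
}

# Example:
#   versions_match VERSION nextflow || { echo "VERSION and launcher differ" >&2; exit 1; }
```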

The target repository is an S3 bucket, jars are no longer
published to maven central repo.

Signed-off-by: Tom Sellman <[email protected]>

netlify bot commented Dec 2, 2024

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 95593a9
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/677e5925822640000842a7e8
😎 Deploy Preview https://deploy-preview-5554--nextflow-docs-staging.netlify.app

@bentsherman bentsherman added the "github actions" label (Pull requests that update GitHub Actions code) Dec 9, 2024
@pditommaso
Member

I guess this should be marked "Ready for review", right?

@tom-seqera tom-seqera marked this pull request as ready for review December 10, 2024 13:48
@pditommaso
Member

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

The release script should make the commit, the other way around, it should be assumed the script is invoked by the release commit.

@tom-seqera
Contributor Author

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

The release script should make the commit, the other way around, it should be assumed the script is invoked by the release commit.

I think that is how it works. The release commit is still what triggers the Github Action to perform the release. I just created a little make release target which someone doing a release can run locally to guide them through creating and pushing the release commit:

  1. On local machine run make release
  2. make release runs a local helper script which shows you the version you're about to release, and tells you what files you should modify if it's not right
  3. You confirm to the local helper script that you want to do the release
  4. The local helper script creates and pushes a commit with message [release]
  5. The Github Action detects the release commit and executes the release workflow, building and deploying the artifacts

I think this local helper script is useful because it provides a nice "entrypoint" to the release process, and acts as documentation for both the required manual steps and how the workflow is triggered.

Over time we can gradually improve the local helper script to automate more of the remaining manual steps. For example, it could prompt for which plugins to release and what versions they should be, and then automatically update the relevant manifest files before creating the release commit.
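
To make the confirmation step concrete, a helper along these lines could sit at the centre of the local script. This is a sketch under assumptions: the function name confirm_release and the prompt wording are made up for the example, and the git commands are shown only as comments.

```shell
# Ask for confirmation before a release; succeed only on an explicit yes.
confirm_release() {
  local version="$1" answer
  # Prompt on stderr so stdout stays clean for callers.
  printf 'About to release version %s. Continue? [y/N] ' "$version" >&2
  # End of input without an answer counts as "no".
  read -r answer || return 1
  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Example flow for the helper script (hypothetical, commands as comments only):
#   if confirm_release "$(cat VERSION)"; then
#     git commit -am "[release]"
#     git push
#   fi
```

Defaulting to "no" on anything other than an explicit yes keeps an accidental Enter keypress from starting a release.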

@pditommaso
Member

pditommaso commented Dec 10, 2024

Fair enough, it may be useful. Then what's the main entry for the headless part? gradle makeDigest ?

@tom-seqera
Contributor Author

The headless part is in release.yml, which runs as a workflow in Github Actions. It's triggered by a job in build.yml when a release commit is detected.

Member

@pditommaso pditommaso left a comment

It looks like excellent work. Is this uploading the plugins to the respective GH projects?

Somewhat unrelated, it looks like the DCO bot is not happy with your commit signature.

@tom-seqera
Copy link
Contributor Author

Is this uploading the plugins to the respective GH projects?

Yes, to control the scope of the changes plugins are still uploaded to their own Github projects. I think we should treat alternative upload destinations as a separate piece of work.

Comment on lines 172 to 177
indexUrl = System.getenv('PLUGINS_INDEX_JSON') ?: 'https://github.com/nextflow-io/plugins/main/plugins.json'
repos = allPlugins()
owner = github_organization
githubUser = github_username
githubEmail = github_commit_email
githubToken = github_access_token
owner = System.getenv('GH_ORG') ?: 'nextflow-io'
githubUser = System.getenv('GH_USER') ?: project.findProperty('github_username')
githubEmail = System.getenv('GH_USER_EMAIL') ?: project.findProperty('github_commit_email')
githubToken = System.getenv('GH_TOKEN') ?: project.findProperty('github_access_token')
Member

I was thinking about how to avoid the proliferation of env variables in the Gradle build. What if main.sh writes a temporary gradle.properties file mapping the variables to the corresponding gradle properties?

Contributor Author

@tom-seqera tom-seqera Dec 19, 2024

I've had a go at using a temporary gradle.properties but I don't think it's a good idea. It seems like it would be complex and fragile for very minimal benefit.

Ideally:

  • running in CI, the release should use the values from github (ie the env vars)
  • running locally, any values from $GRADLE_USER_HOME/gradle.properties should take priority
  • we don't want to risk accidentally committing secrets to the repo
  • we should try to keep things relatively simple to understand

One option would be to write the env vars into a gradle.properties in the project root. This would make them available to gradle, but anything in $GRADLE_USER_HOME/gradle.properties would still take priority thanks to gradle's built-in precedence rules. To make this safe, we'd need to add gradle.properties to .gitignore. But there's already a committed gradle.properties in the project so we'd have to modify it and then somehow guarantee that it would also remove any secrets from it (including in failure scenarios) to avoid accidentally committing the changes - which is much too risky.

Another option would be to write to a properties file with a different name (eg .gradle.release.properties) which we could explicitly .gitignore and load into the gradle build if detected. The problem with this is I think those properties would then override anything from $GRADLE_USER_HOME so we'd have to add logic to only load properties which don't already exist, or only create the properties file if running in CI - at which point it seems like a lot of unnecessary complexity.

I think we should also take care to try to keep release/main.sh as just an orchestration script and ensure that the sub-scripts can be run independently rather than assuming they will always run as part of a main script. This will be important for testing/debugging, fixing release errors, and for future flexibility.

Contributor Author

I do think it's worth switching the priority order in the gradle scripts though, so that it looks for a gradle property first and then falls back to an env var:

githubUser = project.findProperty('github_username') ?: System.getenv('GH_USER')

This way any values in $GRADLE_USER_HOME/gradle.properties will take precedence.

Member

One option would be to write the env vars into a gradle.properties in the project root.

How? or do you mean using only gradle properties?

Contributor Author

Yeah, I was just thinking through options for ways to make main.sh write a temporary gradle properties file.

Member

Maybe we should reverse the logic and keep all params (except secrets) in the project gradle.properties.

Those could be "imported" in the shell script with a basic helper, e.g. (courtesy chatgpt)

# Read the property file line by line
while IFS="=" read -r key value; do
  # Skip empty lines and lines starting with # (comments)
  [[ -z "$key" || "$key" == \#* ]] && continue

  # Convert the key: replace '.' with '_' and convert to uppercase
  env_var_name=$(echo "$key" | tr '.' '_' | tr '[:lower:]' '[:upper:]')

  # Export the variable as an environment variable
  export "$env_var_name=$value"
  echo "Exported: $env_var_name=$value"
done < "$PROPERTY_FILE"

Secrets could be kept either in the user gradle properties or in env vars. GitHub actions secrets could be easily added to project gradle.properties.

A bit clunky, but it could do the trick.

Contributor Author

I was able to implement and test this approach using gradle properties. It does work, but it felt quite convoluted and confusing for a pretty small benefit. Especially when running in CI: the github action writes all the relevant env vars to a gradle properties file, then immediately the release scripts turn those properties back into the same env vars!

On balance, I don't think it's worth it given that the primary goal here is to make the release run in CI, and that running things locally should only be a fallback for debugging/resolving issues. My suggestion would be that when running locally we can just create a bash file containing the required env vars which can be sourced into a local shell.
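
Such a file might look roughly like the sketch below. The file name release-env.sh is hypothetical, the variable names are the ones read by the gradle snippet discussed earlier in this thread, and every value is a placeholder.

```shell
# release-env.sh (hypothetical file name): source this, do not commit it.
# Variable names match the env vars read by the gradle build in this thread;
# all values here are placeholders.
export GH_ORG="nextflow-io"
export GH_USER="your-github-username"
export GH_USER_EMAIL="you@example.com"
export PLUGINS_INDEX_JSON="https://example.com/plugins.json"
# Keep the token out of the file: reuse one already present in the shell.
export GH_TOKEN="${GH_TOKEN:-}"
```

It would be loaded with `. ./release-env.sh` before running an individual release script, and the file name should be added to .gitignore so it cannot be committed by accident.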

I did swap the priority of loading values in the gradle scripts though - now they will look for gradle properties first and env vars second.

(and do some testing of the github action/scripts)

Signed-off-by: Tom Sellman <[email protected]>