LFS Fixes #2954

Open
wants to merge 16 commits into
base: master
2 changes: 1 addition & 1 deletion .github/actions/setup_env/action.yml
@@ -18,7 +18,7 @@ runs:
- name: Generate Cache Key
run: |
file_hash=$(cat conda-${{ inputs.os-label }}.lock | shasum -a 256 | cut -d' ' -f1)
echo "file_hash=$file_hash" >> "${GITHUB_OUTPUT}"
echo "file_hash=tardis-conda-env-${{ inputs.os-label }}-${file_hash}-v1" >> "${GITHUB_OUTPUT}"
id: cache-environment-key
shell: bash

40 changes: 19 additions & 21 deletions .github/actions/setup_lfs/action.yml
@@ -1,12 +1,16 @@
name: "Setup LFS"
description: "Pull LFS repositories and caches them"
description: "Sets up Git LFS, retrieves LFS cache and fails if cache is not available"


inputs:
regression-data-repo:
description: "tardis regression data repository"
description: "Repository containing regression data (format: owner/repo)"
required: false
default: "tardis-sn/tardis-regression-data"
atom-data-sparse:
description: "If true, only downloads atom_data/kurucz_cd23_chianti_H_He.h5 instead of full regression data"
required: false
default: 'false'

runs:
using: "composite"
@@ -16,37 +20,31 @@ runs:
with:
repository: ${{ inputs.regression-data-repo }}
path: tardis-regression-data
sparse-checkout: ${{ inputs.atom-data-sparse == 'true' && 'atom_data/kurucz_cd23_chianti_H_He.h5' || '' }}
lfs: false

- name: Create LFS file list
run: git lfs ls-files -l | cut -d' ' -f1 | sort > .lfs-assets-id
run: |
if [ "${{ inputs.atom-data-sparse }}" == "true" ]; then
echo "Using atom data sparse checkout"
echo "atom_data/kurucz_cd23_chianti_H_He.h5" > .lfs-files-list
else
echo "Using full repository checkout"
git lfs ls-files -l | cut -d' ' -f1 | sort > .lfs-files-list
fi
working-directory: tardis-regression-data
shell: bash

- name: Restore LFS cache
uses: actions/cache/restore@v4
id: lfs-cache-regression-data
with:
path: tardis-regression-data/.git/lfs
key: ${{ runner.os }}-lfs-${{ hashFiles('tardis-regression-data/.lfs-assets-id') }}-v1

- name: Git LFS Pull
run: git lfs pull
working-directory: tardis-regression-data
if: steps.lfs-cache-regression-data.outputs.cache-hit != 'true'
shell: bash
key: tardis-regression-${{ inputs.atom-data-sparse == 'true' && 'atom-data-sparse' || 'full-data' }}-${{ hashFiles('tardis-regression-data/.lfs-files-list') }}-${{ inputs.regression-data-repo }}-v1
fail-on-cache-miss: true

- name: Git LFS Checkout
run: git lfs checkout
working-directory: tardis-regression-data
if: steps.lfs-cache-regression-data.outputs.cache-hit == 'true'
shell: bash

- name: Save LFS cache if not found
# uses fake ternary
# for reference: https://github.com/orgs/community/discussions/26738#discussioncomment-3253176
if: ${{ steps.lfs-cache-regression-data.outputs.cache-hit != 'true' && !contains(github.ref, 'merge') && always() || false }}
uses: actions/cache/save@v4
id: lfs-cache-regression-data-save
with:
path: tardis-regression-data/.git/lfs
key: ${{ runner.os }}-lfs-${{ hashFiles('tardis-regression-data/.lfs-assets-id') }}-v1
16 changes: 10 additions & 6 deletions .github/workflows/benchmarks.yml
@@ -29,6 +29,12 @@ defaults:
shell: bash -l {0}

jobs:
test-cache:
uses: ./.github/workflows/lfs-cache.yml
with:
atom-data-sparse: true
regression-data-repo: tardis-sn/tardis-regression-data

build:
if: github.repository_owner == 'tardis-sn' &&
(github.event_name == 'push' ||
@@ -37,6 +43,7 @@ jobs:
(github.event_name == 'pull_request_target' &&
contains(github.event.pull_request.labels.*.name, 'benchmarks')))
runs-on: ubuntu-latest
needs: [test-cache]
steps:
- uses: actions/checkout@v4
if: github.event_name != 'pull_request_target'
@@ -54,13 +61,10 @@ jobs:
run: git fetch origin master:master
if: github.event_name == 'pull_request_target'

- uses: actions/checkout@v4
- name: Setup LFS
uses: ./.github/actions/setup_lfs
with:
repository: tardis-sn/tardis-regression-data
path: tardis-regression-data
lfs: true
sparse-checkout: |
atom_data/kurucz_cd23_chianti_H_He.h5
atom-data-sparse: true

- name: Setup Mamba
uses: mamba-org/setup-micromamba@v1
17 changes: 10 additions & 7 deletions .github/workflows/build-docs.yml
@@ -36,6 +36,12 @@ defaults:
shell: bash -l {0}

jobs:
test-cache:
uses: ./.github/workflows/lfs-cache.yml
with:
atom-data-sparse: true
regression-data-repo: tardis-sn/tardis-regression-data

check-for-changes:
runs-on: ubuntu-latest
if: ${{ !github.event.pull_request.draft }}
@@ -77,7 +83,7 @@ jobs:

build-docs:
runs-on: ubuntu-latest
needs: check-for-changes
needs: [test-cache, check-for-changes]
if: needs.check-for-changes.outputs.trigger-check-outcome == 'success' || needs.check-for-changes.outputs.docs-check-outcome == 'success'
steps:
- uses: actions/checkout@v4
@@ -90,13 +96,10 @@ jobs:
ref: ${{ github.event.pull_request.head.sha }}
if: github.event_name == 'pull_request_target'

- uses: actions/checkout@v4
- name: Setup LFS
uses: ./.github/actions/setup_lfs
with:
repository: tardis-sn/tardis-regression-data
path: tardis-regression-data
lfs: true
sparse-checkout: |
atom_data/kurucz_cd23_chianti_H_He.h5
atom-data-sparse: true

- name: Setup environment
uses: ./.github/actions/setup_env
76 changes: 76 additions & 0 deletions .github/workflows/lfs-cache.yml
@@ -0,0 +1,76 @@
name: Save LFS Cache

on:
workflow_call:
inputs:
atom-data-sparse:
description: "If true, only downloads atom_data/kurucz_cd23_chianti_H_He.h5"
required: false
default: false
type: boolean
regression-data-repo:
description: "Repository containing regression data (format: owner/repo)"
required: false
default: "tardis-sn/tardis-regression-data"
type: string

defaults:
run:
shell: bash -l {0}

concurrency:
# Only one run per concurrency group is active at a time; a newer run cancels the one in progress.
# The group combines the workflow name, the pull request number (or ref), the data type, and the regression data repo.
# `atom-data-sparse` is a boolean input here, so it is compared with `true` rather than the string 'true'.
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}-${{ inputs.atom-data-sparse == true && 'atom-data-sparse' || 'full-data' }}-${{ inputs.regression-data-repo }}
cancel-in-progress: true


jobs:
lfs-cache:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
repository: ${{ inputs.regression-data-repo }}
path: tardis-regression-data
sparse-checkout: ${{ inputs.atom-data-sparse == true && 'atom_data/kurucz_cd23_chianti_H_He.h5' || '' }}

- name: Create LFS file list
run: |
if [ "${{ inputs.atom-data-sparse }}" == "true" ]; then
echo "Using atom data sparse checkout"
echo "atom_data/kurucz_cd23_chianti_H_He.h5" > .lfs-files-list
else
echo "Using full repository checkout"
git lfs ls-files -l | cut -d' ' -f1 | sort > .lfs-files-list
fi
working-directory: tardis-regression-data


- name: Test cache availability
uses: actions/cache/restore@v4
id: test-lfs-cache-regression-data
with:
path: tardis-regression-data/.git/lfs
key: tardis-regression-${{ inputs.atom-data-sparse == true && 'atom-data-sparse' || 'full-data' }}-${{ hashFiles('tardis-regression-data/.lfs-files-list') }}-${{ inputs.regression-data-repo }}-v1
lookup-only: true

- name: Git LFS Pull Atom Data
run: git lfs pull --include="atom_data/kurucz_cd23_chianti_H_He.h5"
if: ${{ inputs.atom-data-sparse == true && steps.test-lfs-cache-regression-data.outputs.cache-hit != 'true' }}
working-directory: tardis-regression-data

- name: Git LFS Pull Full Data
run: git lfs pull
if: ${{ inputs.atom-data-sparse == false && steps.test-lfs-cache-regression-data.outputs.cache-hit != 'true' }}
working-directory: tardis-regression-data

- name: Git LFS Checkout
run: git lfs checkout
working-directory: tardis-regression-data

- name: Save LFS cache if not found
if: steps.test-lfs-cache-regression-data.outputs.cache-hit != 'true'
uses: actions/cache/save@v4
with:
path: tardis-regression-data/.git/lfs
key: tardis-regression-${{ inputs.atom-data-sparse == true && 'atom-data-sparse' || 'full-data' }}-${{ hashFiles('tardis-regression-data/.lfs-files-list') }}-${{ inputs.regression-data-repo }}-v1
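
The key used by the lookup and save steps combines the data type, a hash of `.lfs-files-list`, and the repository name. The sketch below approximates that layout locally. It is only an illustration: `hashFiles` computes its digest differently from a plain SHA-256 of a single file, so the hash part will not match what GitHub produces, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the key layout; not used by the workflow.

# Print the SHA-256 of stdin, using whichever tool is available.
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# Reduce `git lfs ls-files -l` output (read from stdin) to a sorted list of
# object IDs, as the "Create LFS file list" step does for a full checkout.
lfs_oid_list() {
  cut -d' ' -f1 | sort
}

# Assemble a key with the same layout as the workflow's cache key.
# NOTE: the digest is a plain SHA-256 of the list file, which only
# approximates what `hashFiles` computes.
regression_cache_key() {
  local sparse="$1" list_file="$2" repo="$3"
  local data_type="full-data"
  if [ "$sparse" = "true" ]; then
    data_type="atom-data-sparse"
  fi
  local digest
  digest=$(sha256_stdin < "$list_file")
  echo "tardis-regression-${data_type}-${digest}-${repo}-v1"
}
```

For example, `git lfs ls-files -l | lfs_oid_list > .lfs-files-list` followed by `regression_cache_key false .lfs-files-list tardis-sn/tardis-regression-data` prints a key with the `full-data` layout.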
7 changes: 7 additions & 0 deletions .github/workflows/tests.yml
@@ -38,9 +38,16 @@ concurrency:
cancel-in-progress: true

jobs:
test-cache:
uses: ./.github/workflows/lfs-cache.yml
with:
atom-data-sparse: false
regression-data-repo: tardis-sn/tardis-regression-data

tests:
name: ${{ matrix.continuum }} continuum ${{ matrix.os }} ${{ inputs.pip_git && 'pip tests enabled' || '' }}
if: github.repository_owner == 'tardis-sn'
needs: [test-cache]
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
24 changes: 24 additions & 0 deletions docs/contributing/development/continuous_integration.rst
@@ -27,6 +27,30 @@ TARDIS Pipelines

Brief description of pipelines already implemented on TARDIS

Cache Keys in TARDIS CI
-----------------------

TARDIS uses specific cache key formats to efficiently store and retrieve data during CI runs:

1. **Regression Data Cache Keys**
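- Format: ``tardis-regression-<data-type>-<hash>-<regression-data-repo>-v1``
- Examples:
- ``tardis-regression-atom-data-sparse-<hash>-tardis-sn/tardis-regression-data-v1`` - For the sparse atomic data cache
- ``tardis-regression-full-data-<hash>-tardis-sn/tardis-regression-data-v1`` - For the full TARDIS regression data cache
- Used in: ``setup_lfs`` action and ``lfs-cache`` workflow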

2. **Environment Cache Keys**
- Format: ``tardis-conda-env-<os-label>-<hash>-v1``
- Examples:
- ``tardis-conda-env-linux-<hash>-v1`` - For Linux conda environment
- ``tardis-conda-env-macos-<hash>-v1`` - For macOS conda environment
- Used in: ``setup_env`` action

.. warning::
- The version suffix (``-v1``) can be bumped to invalidate existing caches.
- Parallel jobs can race when saving the cache, so the cache is occasionally not saved. When testing new regression data, check the workflow runs for cache misses to avoid consuming LFS quota.
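
When checking workflow runs for cache misses, it can help to tell at a glance which family a key belongs to. The following is a minimal sketch based only on the key prefixes and version suffix listed above; the function name `cache_key_family` and the labels it prints are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: name the cache family a key belongs to, based on the
# documented prefixes and the -v1 suffix. Prints "unknown" for anything else.
cache_key_family() {
  case "$1" in
    tardis-regression-atom-data-sparse-*-v1) echo "regression (atom data, sparse)" ;;
    tardis-regression-full-data-*-v1)        echo "regression (full data)" ;;
    tardis-conda-env-*-v1)                   echo "conda environment" ;;
    *)                                       echo "unknown" ;;
  esac
}
```

A key that prints `unknown` was most likely produced with a different key format or version suffix.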


Streamlined Steps for TARDIS Pipelines
========================================

2 changes: 1 addition & 1 deletion docs/contributing/development/running_tests.rst
@@ -62,7 +62,7 @@ Or, to run tests for a particular file or directory
To prevent leaking LFS quota, tests have been disabled on forks.
If, by any chance, you need to run tests on your fork, make sure to run the tests workflow on master branch first.
The LFS cache generated in the master branch should be available in all child branches.
You can check if cache was generated by looking in the ``Restore LFS Cache`` step of the workflow run.
You can check if cache was generated by looking in the ``Setup LFS`` step of the workflow run.
The cache can also be found in the "Management" section of the "Actions" tab.

Generating Plasma Reference