Commit

Merge branch 'david/githubWorkflowTestFix' into develop
DavidBrownlee committed Nov 19, 2024
2 parents e295780 + 4da6290 commit 975c974
Showing 5 changed files with 19 additions and 13 deletions.
9 changes: 5 additions & 4 deletions .github/workflows/test.yml
@@ -22,14 +22,15 @@ jobs:
 - name: Test with pytest
 run: |
 pytest
-- name: Compare moh_template.csv
+- name: Compare to moh_v3_template.csv
 shell: bash {0}
 run: |
-python generate_schema.py
-diff template.csv moh_template.csv > curr_diff.txt
+# Script based largely on update_moh_template.sh
+python src/clinical_etl/generate_schema.py --out moh_template
+diff moh_template.csv moh_v3_template.csv > curr_diff.txt
 bytes=$(head -5 curr_diff.txt | wc -c)
 dd if=curr_diff.txt bs="$bytes" skip=1 conv=notrunc of=new_diff.txt
-diff new_diff.txt test_data/moh_diffs.txt
+diff new_diff.txt tests/moh_diffs.txt
 if [[ $? == 1 ]]; then echo MoH template checking needs to be updated! See https://github.com/CanDIG/clinical_ETL_code#mapping-template for information.
 exit 1
 fi
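
Note on the `head -5 ... | wc -c` and `dd ... skip=1` pair in this step: it counts the bytes taken up by the first five lines of `curr_diff.txt`, then copies the file with that many leading bytes skipped. A minimal Python sketch of the same byte-skipping logic follows, for illustration only. The function name `skip_first_lines` is hypothetical and is not part of the repository.

```python
def skip_first_lines(data: bytes, n: int = 5) -> bytes:
    """Return data with the bytes of its first n newline-terminated lines removed.

    Hypothetical helper mirroring:
        bytes=$(head -n "$n" file | wc -c)
        dd if=file bs="$bytes" skip=1 of=out
    """
    offset = 0
    for _ in range(n):
        newline = data.find(b"\n", offset)
        if newline == -1:
            # Fewer than n complete lines: head would emit the whole input,
            # so skipping one block of that size leaves nothing.
            return b""
        offset = newline + 1
    return data[offset:]


if __name__ == "__main__":
    sample = b"1\n2\n3\n4\n5\n6\n7\n"
    print(skip_first_lines(sample))
```

One difference to keep in mind: on an empty input the shell version would pass `bs=0` to `dd`, which is an error, whereas this sketch simply returns an empty result.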
2 changes: 1 addition & 1 deletion README.md
@@ -90,7 +90,7 @@ You'll need to create a mapping template that defines the mapping between the fi

 Each line in the mapping template is composed of comma separated values with two components. The first value is an `element` or field from the target schema and the second value contains a suggested `mapping method` or function to map a field from an input sheet to a valid value for the identified `element`. Each `element`, shows the full object linking path to each field required by the model. These values should not be edited.

-If you are generating a mapping for the current CanDIG MoH model, you can use the pre-generated [`moh_template.csv`](moh_template.csv) file. This file is modified from the auto-generated template to update a few fields that require specific handling.
+If you are generating a mapping for the current CanDIG MoH model, you can use the pre-generated [`moh_v3_template.csv`](moh_v3_template.csv) file. This file is modified from the auto-generated template to update a few fields that require specific handling.

 You will need to edit the `mapping method` values in each line in the following ways:
 1. Replace the generic sheet names (e.g. `DONOR_SHEET`, `SAMPLE_REGISTRATIONS_SHEET`) with the sheet/csv names you are using as your input to `CSVConvert.py`
4 changes: 2 additions & 2 deletions src/clinical_etl/generate_schema.py
@@ -21,8 +21,8 @@ def parse_args():
 default="https://raw.githubusercontent.com/CanDIG/katsu/develop/chord_metadata_service/mohpackets/docs/schemas/schema.json")
 parser.add_argument('--schema', type=str, help="Name of schema class", default="MoHSchemaV3")
 parser.add_argument('--out', type=str,
-help="name of output file; csv extension will be added. Default is template",
-default="template")
+help="name of output file; csv extension will be added. Default is moh_template",
+default="moh_template")
 args = parser.parse_args()
 return args
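
Note on the changed default: with this hunk, running the script without `--out` should produce `moh_template.csv` rather than `template.csv`. The sketch below illustrates that behaviour in isolation. It reproduces only the `--out` argument from the diff, and the `argv` parameter and the `output_filename` helper are assumptions added for illustration, based on the help text's statement that the csv extension is added.

```python
import argparse


def parse_args(argv=None):
    # Only the --out argument from the diff is reproduced here; the argv
    # parameter is an illustrative addition so the parser can be exercised
    # without touching sys.argv.
    parser = argparse.ArgumentParser()
    parser.add_argument('--out', type=str,
                        help="name of output file; csv extension will be added. Default is moh_template",
                        default="moh_template")
    return parser.parse_args(argv)


def output_filename(args):
    # Assumption: the output name is --out plus ".csv", as the help text says.
    return f"{args.out}.csv"


if __name__ == "__main__":
    print(output_filename(parse_args([])))
    print(output_filename(parse_args(["--out", "tmp_template"])))
```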

4 changes: 4 additions & 0 deletions tests/moh_diffs.txt
@@ -37,3 +37,7 @@
 < DONOR.INDEX.biomarkers.INDEX.pr_percent_positive, {floating(BIOMARKERS_SHEET.pr_percent_positive)}
 ---
 > DONOR.INDEX.biomarkers.INDEX.pr_percent_positive, {set_neg_99_blank_float(BIOMARKERS_SHEET.pr_percent_positive)}
+161c161
+< DONOR.INDEX.followups.INDEX, {indexed_on(FOLLOWUPS_SHEET.submitter_donor_id)}
+---
+> DONOR.INDEX.followups.INDEX, {moh_indexed_on_donor_if_others_absent(FOLLOWUPS_SHEET.submitter_donor_id)}
13 changes: 7 additions & 6 deletions update_moh_template.sh
@@ -1,9 +1,10 @@
 #!/usr/bin/env bash
 # Updates the moh_template based on the schema.
 # Manual differences are recorded in tests/moh_diffs.txt
-
-
-python src/clinical_etl/generate_schema.py --out tmp_template
-diff tmp_template.csv moh_v3_template.csv > tests/moh_diffs.txt
-bytes=$(head -5 tests/moh_diffs.txt | wc -c)
-dd if=tests/moh_diffs.txt bs="$bytes" skip=1 conv=notrunc of=tests/moh_diffs1.txt
+python src/clinical_etl/generate_schema.py --out moh_template
+diff moh_template.csv moh_v3_template.csv > curr_diff.txt
+bytes=$(head -5 curr_diff.txt | wc -c)
+dd if=curr_diff.txt bs="$bytes" skip=1 conv=notrunc of=tests/moh_diffs1.txt
+mv tests/moh_diffs1.txt tests/moh_diffs.txt
+rm tmp_template.csv
+rm curr_diff.txt
