[Read_Screen] Comprehensively run QC checks regardless of failure status then output results #736

xonq · 2025-01-23T22:10:29Z

This PR closes #567

🗑️ This dev branch should be deleted after merging to main.

🧠 Summary

Updates task_screen (Read_Screen) to run all screening checks even when an earlier check fails. The task now outputs a TSV table of read screening results, which is propagated for both raw and clean read screening in Theia-pipelines.

⚡ Impacted Workflows/Tasks

tasks/quality_control/comparisons/task_screen.wdl and all downstream workflows

This PR may lead to different results in pre-existing outputs: No

This PR uses an element that could cause duplicate runs to have different results: No

🛠️ Changes

- Run all read screening checks, even if an earlier check misses its threshold
- Output each result to a single sample-level TSV
- Expose the output to downstream workflows
- Update documentation

⚙️ Algorithm

- Conditional expression for the failure check moved to the end of the task
- Create a TSV (read_screen.tsv) for read QC results
- Expose the TSV to downstream WDL tasks/workflows for both raw and clean read screening
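
The steps above can be sketched in shell roughly as follows. This is a minimal illustration and not the code in task_screen.wdl: the metric names, values, thresholds, TSV columns, and the `record_check` helper are all hypothetical. Only `read_screen.tsv` and `fail_log` come from this PR. The commits here use `echo -e` for formatting; the sketch swaps in `printf` so it also runs under shells where `echo -e` behaves differently.

```shell
#!/bin/sh
# Sketch only: metric names, values, and thresholds below are hypothetical.
tsv="read_screen.tsv"
fail_log=""

# Header row (hypothetical columns); printf gives portable tab handling.
printf 'metric\tvalue\tminimum\tstatus\n' > "${tsv}"

# Record one check: append a TSV row and, on failure, extend fail_log.
# The script continues after a failure instead of exiting early.
record_check() {
  name="$1"; value="$2"; minimum="$3"; status="PASS"
  if [ "${value}" -lt "${minimum}" ]; then
    status="FAIL"
    fail_log="${fail_log}${name} (${value}) is below the minimum (${minimum}); "
  fi
  printf '%s\t%s\t%s\t%s\n' "${name}" "${value}" "${minimum}" "${status}" >> "${tsv}"
}

record_check "number_of_reads" 500 1000
record_check "number_of_basepairs" 2000000 1000000
record_check "estimated_coverage" 5 10

# The pass/fail decision is made once, after every check has run.
if [ -z "${fail_log}" ]; then
  read_screen="PASS"
else
  read_screen="FAIL; ${fail_log}"
fi
echo "${read_screen}"
```

With these made-up values, two of the three checks fail, and both failures appear in `fail_log` and in the TSV rather than only the first one encountered.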

➡️ Inputs

n/a

⬅️ Outputs

Added an output for the TSV file path and exposed it to downstream workflows:

File read_screen_tsv = "read_screen.tsv"

🧪 Testing

Suggested Scenarios for Reviewer to Test

n/a

🔬 Final Developer Checklist

The workflow/task has been tested and results, including file contents, are as anticipated
The CI/CD has been adjusted and tests are passing (Theiagen developers)
Code changes follow the style guide
Documentation and/or workflow diagrams have been updated if applicable
- You have updated the "Last Known Changes" field for any affected workflows in the respective workflow documentation page and for every entry in the three workflows_overview tables to be the tag for the next upcoming release. If you do not know the tag, please put "vX.X.X"

🎯 Reviewer Checklist

All changed results have been confirmed
You have tested the PR appropriately (see the testing guide for more information)
All code adheres to the style guide
MD5 sums have been updated
The PR author has addressed all comments
The documentation has been updated

xonq added 3 commits January 23, 2025 22:09

complete all quality checks regardless of failure status; comprehensively populate a failure log "fail_log"

0a12031

output metrics logging to TSV

d28cd4a

fix: update metrics logging to use echo -e for proper formatting

b586821

xonq changed the title ~~[Read_Screen] Comprehensively run QC checks regardless of failure status & output results~~ [Read_Screen] Comprehensively run QC checks regardless of failure status then output results Jan 24, 2025

xonq added 7 commits January 24, 2025 00:31

add read_screen to output

d90688a

expose screen_reads stats tsv path to theia-level workflows

21c0a8b

update documentation tables with read screen stats TSV

54cfe38

failure can be plural now

7f8f7a7

stats output should be tsv to delineate it is a file

9a1c026

rename read_screen stats variables to use '_tsv' suffix for consistency

6a8ddcc

terminate if statement, fix eof error

8e4c520
