Add support for generating taxprofiler/funcscan input samplesheets for preprocessed FASTQs/FASTAs #688
base: dev
Great work, I think getting this blueprint in place will be very helpful in the future!
nextflow.config
Outdated
@@ -194,6 +194,9 @@ params {
    validationShowHiddenParams = false
    validate_params = true

    // Generate downstream samplesheets
    generate_downstream_samplesheets = false
    generate_pipeline_samplesheets = "funcscan,taxprofiler"
Should this default to null so that users have to opt-in to samplesheet generation?
Yes, that's a good idea! I will set that in createtaxdb too
What's the difference? With either null or false as the default, it would still be opt-in for the user, no?
Ah maybe I misunderstood Carson... not sure :D
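One possible reading of the question above, sketched in plain Groovy: with the boolean flag off, nothing is generated under either default, so both are opt-in. The defaults only differ once the flag is on. The helper name `requestedPipelines` is hypothetical and not part of the PR; the error branch mirrors the validation check quoted later in this review.

```groovy
// Hypothetical helper (not from the PR): resolves which pipelines get a samplesheet.
def requestedPipelines(Map params) {
    // Flag off: nothing is generated, whatever the pipeline list says.
    if (!params.generate_downstream_samplesheets) {
        return []
    }
    // Flag on but no pipelines named: mirror the validation error in the PR.
    // Groovy treats both null and the empty string as false here.
    if (!params.generate_pipeline_samplesheets) {
        throw new IllegalArgumentException('--generate_pipeline_samplesheets must be set')
    }
    return params.generate_pipeline_samplesheets.split(',').collect { it.trim() }
}

// String default: switching the flag on generates for both pipelines.
def stringDefault = [generate_downstream_samplesheets: true, generate_pipeline_samplesheets: 'funcscan,taxprofiler']
// Null default: switching the flag on alone is an error until pipelines are named.
def nullDefault = [generate_downstream_samplesheets: true, generate_pipeline_samplesheets: null]

assert requestedPipelines(stringDefault) == ['funcscan', 'taxprofiler']
```

Under this reading, a null default forces the user to name the target pipelines explicitly, while the string default quietly selects both.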
It looks like @jfy133 used only one workflow, which will selectively generate samplesheets based on params.generate_pipeline_samplesheets. Do you think it would be best to keep that consistent?
Also, since FastQ files are being pulled from the publishDir, it might be a good idea to include options that override user inputs for params.publish_dir_mode (so that it is always 'copy' if a samplesheet is generated) and for params.save_clipped_reads, params.save_phixremoved_reads, etc., so that the preprocessed FastQ files are published to params.outdir whenever a downstream samplesheet is generated.
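A minimal sketch of the override suggested above, written as plain Groovy over a map rather than against Nextflow's real params object. The helper name `withSamplesheetOverrides` is hypothetical, and only the parameters named in the comment are touched; whether to override silently or to warn the user instead is left open.

```groovy
// Hypothetical helper (not from the PR): forces the settings the samplesheet relies on.
def withSamplesheetOverrides(Map params) {
    // No samplesheet requested: leave the user's settings untouched.
    if (!params.generate_downstream_samplesheets) {
        return params
    }
    // Map + Map returns a new map; entries on the right-hand side win.
    return params + [
        publish_dir_mode      : 'copy',
        save_clipped_reads    : true,
        save_phixremoved_reads: true,
    ]
}

def userParams = [
    generate_downstream_samplesheets: true,
    publish_dir_mode                : 'symlink',
    save_clipped_reads              : false,
    save_phixremoved_reads          : false,
]
def effective = withSamplesheetOverrides(userParams)

assert effective.publish_dir_mode == 'copy'
assert effective.save_clipped_reads
```

The input map is not modified, so the user's original choices remain available for a log message if needed.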
ch_assemblies

main:
def downstreampipeline_names = params.generate_pipeline_samplesheets.split(",")
I've also implemented the same system in createtaxdb now, but with an additional input validation step that you should also adopt here (i.e., to check that someone doesn't add an unsupported pipeline or make a typo). Check the utils_nfcore_createtaxdb_pipeline file there.
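The kind of check described above could look roughly like the following plain Groovy sketch. It is not the createtaxdb implementation: the helper name `findUnsupportedPipelines` and the hard-coded supported list are assumptions made for illustration.

```groovy
// Hypothetical helper (not the createtaxdb code): lists requested names that are not supported.
def findUnsupportedPipelines(String requested, List<String> supported) {
    // Split on commas, trim whitespace and drop empty entries such as a trailing comma.
    def names = requested.split(',').collect { it.trim() }.findAll { it }
    return names.findAll { !(it in supported) }
}

// Assumed supported set, taken from the two pipelines this PR targets.
def supported = ['funcscan', 'taxprofiler']

def unsupported = findUnsupportedPipelines('funcscan,taxprofilr', supported)
assert unsupported == ['taxprofilr']
```

A pipeline would typically raise an error naming the offending entries when the returned list is not empty.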
Done
ch_assemblies

main:
ch_list_for_samplesheet = ch_assemblies
The next thing, which I don't think will be so complicated, is to add another input channel for bins, and to add an if/else statement here for whether they want to send just the raw assemblies (all contigs) or the binned contigs to the samplesheet. It will need another pipeline-level parameter too though, --generate_samplesheet_funcscan_seqtype or something.
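The if/else proposed above might be sketched as follows. This is plain Groovy with lists standing in for the Nextflow channels, and the helper name `selectFuncscanInput`, the accepted values 'assemblies' and 'bins', and the file names are all assumptions; the comment only proposes the parameter name tentatively.

```groovy
// Hypothetical helper: chooses which sequences feed the funcscan samplesheet.
// In the pipeline these would be channels; plain lists are used here for illustration.
def selectFuncscanInput(String seqtype, List assemblies, List bins) {
    if (seqtype == 'assemblies') {
        return assemblies   // raw assemblies (all contigs)
    } else if (seqtype == 'bins') {
        return bins         // binned contigs only
    }
    throw new IllegalArgumentException("Unknown sequence type: ${seqtype}")
}

// Made-up file names, for illustration only.
def assemblies = ['sample1.contigs.fa.gz']
def bins = ['sample1.bin1.fa.gz', 'sample1.bin2.fa.gz']

assert selectFuncscanInput('bins', assemblies, bins) == bins
```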
Co-authored-by: James A. Fellows Yates <[email protected]>

// Validate samplesheet generation parameters
if (params.generate_downstream_samplesheets && !params.generate_pipeline_samplesheets) {
    error('[nf-core/createtaxdb] If supplying `--generate_downstream_samplesheets`, you must also specify which pipeline to generate for with `--generate_pipeline_samplesheets! Check input.')
nf-core/mag ?
workflows/mag.nf
Outdated
@@ -25,6 +25,7 @@ include { ANCIENT_DNA_ASSEMBLY_VALIDATION } from '../subworkflows/local/ancient_
include { DOMAIN_CLASSIFICATION } from '../subworkflows/local/domain_classification'
include { DEPTHS } from '../subworkflows/local/depths'
include { LONGREAD_PREPROCESSING } from '../subworkflows/local/longread_preprocessing'
spacing :D
…correct assemblies files (Funcsan)
Something somewhere is not liking my new channel of gzipped assemblies...
Almost there, though includes a file
Closes #687 and #686
This adds the local subworkflow (and other relevant code and docs) for generating samplesheets for the downstream pipelines funcscan and taxprofiler.
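PR checklist
- Make sure your code lints (nf-core lint).
- Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
- Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
- Usage documentation in docs/usage.md is updated.
- Output documentation in docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).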