diff --git a/CHANGELOG.md b/CHANGELOG.md index 520d2732..a5857adc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,22 +2,21 @@ # Version 2.0.0 - Kagoshima -:warning: This version contains a number of breaking changes. Please read the changelog carefully before upgrading. :warning: - -To migrate your schemas please follow the [migration guide](https://nextflow-io.github.io/nf-validation/latest/migration_guide/) +To migrate from nf-validation please follow the [migration guide](https://nextflow-io.github.io/nf-schema/latest/migration_guide/) ## New features - Added the `uniqueEntries` keyword. This keyword takes a list of strings corresponding to names of fields that need to be a unique combination. e.g. `uniqueEntries: ['sample', 'replicate']` will make sure that the combination of the `sample` and `replicate` fields is unique. ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) +- Added `samplesheetToList`, the function equivalent of `.fromSamplesheet` ([#3](https://github.com/nextflow-io/nf-schema/pull/3)) ## Changes - Changed the used draft for the schema from `draft-07` to `draft-2020-12`. See the [2019-09](https://json-schema.org/draft/2019-09/release-notes) and [2020-12](https://json-schema.org/draft/2020-12/release-notes) release notes for all changes ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) -- Removed all validation code from the `.fromSamplesheet()` channel factory. The validation is now solely done in the `validateParameters()` function. A custom error message will now be displayed if any error has been encountered during the conversion ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) +- Removed the `fromSamplesheet` channel factory and added a `samplesheetToList` function instead. This function validates the samplesheet and returns its contents as a list. ([#3](https://github.com/nextflow-io/nf-schema/pull/3)) - Removed the `unique` keyword from the samplesheet schema. 
You should now use [`uniqueItems`](https://json-schema.org/understanding-json-schema/reference/array#uniqueItems) or `uniqueEntries` instead ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) -- Removed the `skip_duplicate_check` option from the `fromSamplesheet()` channel factory and the `--validationSkipDuplicateCheck` parameter. You should now use the `uniqueEntries` or [`uniqueItems`](https://json-schema.org/understanding-json-schema/reference/array#uniqueItems) keywords in the schema instead ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) -- `.fromSamplesheet()` now does dynamic typecasting instead of using the `type` fields in the JSON schema. This is done due to the complexity of `draft-2020-12` JSON schemas. This should not have that much impact but keep in mind that some types can be different between this and earlier versions because of this ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) -- `.fromSamplesheet()` will now set all missing values as `[]` instead of the type specific defaults (because of the changes in the previous point). This should not change that much as this will also result in `false` when used in conditions. ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) +- Removed the `skip_duplicate_check` option from the `samplesheetToList()` function and the `--validationSkipDuplicateCheck` parameter. You should now use the `uniqueEntries` or [`uniqueItems`](https://json-schema.org/understanding-json-schema/reference/array#uniqueItems) keywords in the schema instead ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) +- `samplesheetToList()` now does dynamic typecasting instead of using the `type` fields in the JSON schema. This is done due to the complexity of `draft-2020-12` JSON schemas. 
This should have little impact, but keep in mind that some types can differ between this version and older nf-validation versions ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) +- `samplesheetToList()` will now set all missing values as `[]` instead of the type-specific defaults (because of the changes in the previous point). In practice this changes little, as `[]` also evaluates to `false` when used in conditions. ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) ## Improvements @@ -25,133 +24,3 @@ To migrate your schemas please follow the [migration guide](https://nextflow-io. - The `schema` keyword will now work in all schemas. ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) - Improved the error messages ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) - `.fromSamplesheet()` now supports deeply nested samplesheets ([#141](https://github.com/nextflow-io/nf-validation/pull/141)) - -# Version 1.1.3 - Asahikawa - -## Improvements - -- Added support for double quotes (`"`) in CSV and TSV samplesheets ([#134](https://github.com/nextflow-io/nf-validation/pull/134)) - -# Version 1.1.2 - Wakayama - -## Bug fixes - -- Fixed an issue with inputs using `file-path-pattern` where only one file was found (`Path` casting to `ArrayList` error) ([#132](https://github.com/nextflow-io/nf-validation/pull/132)) - -# Version 1.1.1 - Shoyu - -## Bug fixes - -- Fixed an issue where samplesheet with a lot of null values would take forever to validate ([#120](https://github.com/nextflow-io/nf-validation/pull/120)) => Thanks @awgymer for fixing this! 
-- Now YAML files are actually validated instead of skipped ([#124](https://github.com/nextflow-io/nf-validation/pull/120)) - -# Version 1.1.0 - Miso - -## Features - -- Add support for samplesheets with no header ([#115](https://github.com/nextflow-io/nf-validation/pull/115)) - -## Bug fixes - -- Floats and doubles should now be created when using the `number` type in the schema ([#113](https://github.com/nextflow-io/nf-validation/pull/113/)) -- When `0` is used as a default value in the schema, a `0` will now be used as the value in the `.fromSamplesheet()` channel instead of `null` ([#114](https://github.com/nextflow-io/nf-validation/pull/114)) - -## New features - -- Added `file-path-pattern` format to check every file fetched using a glob pattern. Using a glob is now also possible in the samplesheet and will create a list of all files found using that glob pattern. ([#118](https://github.com/nextflow-io/nf-validation/pull/118)) - -# Version 1.0.0 - Tonkotsu - -The nf-validation plugin is now in production use across many pipelines and has (we hope) now reached a point of relative stability. The bump to major version v1.0.0 signifies that it is suitable for use in production pipelines. - -This version also introduces a small breaking change of syntax when providing optional arguments to the functions. You can now provide optional arguments such as the nextflow parameters schema path as: -`validateParameters(parameters_schema: 'my_file.json')` - -(previous syntax used positional arguments instead). - -## Bug fixes - -- The path to a custom parameters schema must be provided through a map '`parameters_schema: 'my_file.json'`' in `validateParameters()` and `paramsSummaryMap()` ([#108](https://github.com/nextflow-io/nf-validation/pull/108)) - -# Version 0.3.4 - -This version introduced a bug which made all pipeline runs using the function `validateParameters()` without providing any arguments fail. 
- -This bug causes Nextflow to exit with an error on launch for most pipelines. It should not be used. It was [removed](https://github.com/nextflow-io/plugins/pull/40) from the Nextflow Plugin registry to avoid breaking people's runs. - -### Bug fixes - -- Do not check S3 URL paths with `PathValidator` `FilePathValidator` and `DirectoryPathValidator` ([#106](https://github.com/nextflow-io/nf-validation/pull/106)) -- Make monochrome_logs an option in `paramsSummaryLog()`, `paramsSummaryMap()` and `paramsHelp()` instead of a global parameter ([#101](https://github.com/nextflow-io/nf-validation/pull/101)) - -# Version 0.3.3 - -### Bug fixes - -- Do not check if S3 URL paths exists to avoid AWS errors, and add a new parameter `validationS3PathCheck` ([#104](https://github.com/nextflow-io/nf-validation/pull/104)) - -# Version 0.3.2 - -### Bug fixes - -- Add parameters defined on the top level of the schema and within the definitions section as expected params ([#79](https://github.com/nextflow-io/nf-validation/pull/79)) -- Fix error when a parameter is not present in the schema and evaluates to false ([#89](https://github.com/nextflow-io/nf-validation/pull/89)) -- Changed the `schema_filename` option of `fromSamplesheet` to `parameters_schema` to make this option more clear to the user ([#91](https://github.com/nextflow-io/nf-validation/pull/91)) - -## Version 0.3.1 - -### Bug fixes - -- Don't check if path exists if param is not true ([#74](https://github.com/nextflow-io/nf-validation/pull/74)) -- Don't validate a file if the parameter evaluates to false ([#75](https://github.com/nextflow-io/nf-validation/pull/75)) - -## Version 0.3.0 - -### New features - -- Check that a sample sheet doesn't have duplicated entries by default. 
Can be disabled with `--validationSkipDuplicateCheck` ([#72](https://github.com/nextflow-io/nf-validation/pull/72)) - -### Bug fixes - -- Only validate a path if it is not null ([#50](https://github.com/nextflow-io/nf-validation/pull/50)) -- Only validate a file with a schema if the file path is provided ([#51](https://github.com/nextflow-io/nf-validation/pull/51)) -- Handle errors when sample sheet not provided or doesn't have a schema ([#56](https://github.com/nextflow-io/nf-validation/pull/56)) -- Silently ignore samplesheet fields that are not defined in samplesheet schema ([#59](https://github.com/nextflow-io/nf-validation/pull/59)) -- Correctly handle double-quoted fields containing commas in csv files by `.fromSamplesheet()` ([#63](https://github.com/nextflow-io/nf-validation/pull/63)) -- Print param name when path does not exist ([#65](https://github.com/nextflow-io/nf-validation/pull/65)) -- Fix file or directory does not exist error not printed when it was the only error in a samplesheet ([#65](https://github.com/nextflow-io/nf-validation/pull/65)) -- Do not return parameter in summary if it has no default in the schema and is set to 'false' ([#66](https://github.com/nextflow-io/nf-validation/pull/66)) -- Skip the validation of a file if the path is an empty string and improve error message when the path is invalid ([#69](https://github.com/nextflow-io/nf-validation/pull/69)) - -### Deprecated - -- The meta map of input channels is not an ImmutableMap anymore ([#68](https://github.com/nextflow-io/nf-validation/pull/68)). Reason: [Issue #52](https://github.com/nextflow-io/nf-validation/issues/52) - -## Version 0.2.1 - -### Bug fixes - -- Fixed a bug where `immutable_meta` option in `fromSamplesheet()` wasn't working when using `validateParameters()` first. (@nvnieuwk) - -## Version 0.2.0 - -### New features - -- Added a new [documentation site](https://nextflow-io.github.io/nf-validation/). 
(@ewels and @mashehu) -- Removed the `file-path-exists`, `directory-path-exists` and `path-exists` and added a [`exists`](https://nextflow-io.github.io/nf-validation/nextflow_schema/nextflow_schema_specification/#exists) parameter to the schema. (@mirpedrol) -- New [`errorMessage`](https://nextflow-io.github.io/nf-validation/nextflow_schema/nextflow_schema_specification/#errormessage) parameter for the schema which can be used to create custom error messages. (@mirpedrol) -- Samplesheet validation now happens in `validateParameters()` using the schema specified by the `schema` parameter in the parameters schema. (@mirpedrol) - -### Improvements - -- The `meta` maps are now immutable by default, see [`ImmutableMap`](https://nextflow-io.github.io/nf-validation/samplesheets/immutable_map/) for more info (@nvnieuwk) -- `validateAndConvertSamplesheet()` has been renamed to `fromSamplesheet()` -- Refactor `--schema_ignore_params` to `--validationSchemaIgnoreParams` - -### Bug fixes - -- Fixed a bug where an empty meta map would be created when no meta values are in the samplesheet schema. (@nvnieuwk) - -## Version 0.1.0 - -Initial release. 
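To illustrate the `uniqueEntries` keyword introduced in the changelog above, a samplesheet schema fragment using it might look like the following sketch (the `sample` and `replicate` field names are only illustrative, taken from the changelog's own example):

```json
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "sample": { "type": "string" },
      "replicate": { "type": "integer" }
    }
  },
  "uniqueEntries": ["sample", "replicate"]
}
```

With this schema, two rows may share a `sample` value or a `replicate` value, but the combination of both must be unique across the samplesheet.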
diff --git a/Makefile b/Makefile index 4fc1c914..7fad7429 100644 --- a/Makefile +++ b/Makefile @@ -1,5 +1,6 @@ config ?= compileClasspath +version ?= $(shell grep 'Plugin-Version' plugins/nf-schema/src/resources/META-INF/MANIFEST.MF | awk '{ print $$2 }') ifdef module mm = :${module}: @@ -45,6 +46,10 @@ else endif +install: + ./gradlew copyPluginZip + rm -rf ${HOME}/.nextflow/plugins/nf-schema-${version} + cp -r build/plugins/nf-schema-${version} ${HOME}/.nextflow/plugins/nf-schema-${version} # # Upload JAR artifacts to Maven Central diff --git a/README.md b/README.md index 8b6b13ac..b1871e8c 100644 --- a/README.md +++ b/README.md @@ -36,7 +36,7 @@ This is all that is needed - Nextflow will automatically fetch the plugin code a You can now include the plugin helper functions into your Nextflow pipeline: ```groovy title="main.nf" -include { validateParameters; paramsHelp; paramsSummaryLog; fromSamplesheet } from 'plugin/nf-schema' +include { validateParameters; paramsHelp; paramsSummaryLog; samplesheetToList } from 'plugin/nf-schema' // Print help message, supply typical command line usage for the pipeline if (params.help) { @@ -50,9 +50,8 @@ validateParameters() // Print summary of supplied parameters log.info paramsSummaryLog(workflow) -// Create a new channel of metadata from a sample sheet -// NB: `input` corresponds to `params.input` and associated sample sheet schema -ch_input = Channel.fromSamplesheet("input") +// Create a new channel of metadata from a sample sheet passed to the pipeline through the --input parameter +ch_input = Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json")) ``` ## Dependencies @@ -62,7 +61,7 @@ ch_input = Channel.fromSamplesheet("input") ## Slack channel -There is a dedicated [nf-validation Slack channel](https://nfcore.slack.com/archives/C056RQB10LU) in the [Nextflow Slack workspace](nextflow.slack.com). 
+There is a dedicated [nf-schema Slack channel](https://nfcore.slack.com/archives/C056RQB10LU) in the [Nextflow Slack workspace](https://nextflow.slack.com). ## Credits diff --git a/docs/background.md b/docs/background.md index c0091c6c..95dc61bf 100644 --- a/docs/background.md +++ b/docs/background.md @@ -15,3 +15,5 @@ In addition to config params, a common best-practice for pipelines is to use a " Nextflow itself does not provide functionality to validate config parameters or parsed sample sheets. To bridge this gap, we developed code within the [nf-core community](https://nf-co.re/) to allow pipelines to work with a standard `nextflow_schema.json` file, written using the [JSON Schema](https://json-schema.org/) format. The file allows strict typing of parameter variables and inclusion of validation rules. The nf-schema plugin moves this code out of the nf-core template into a stand-alone package, to make it easier to use for the wider Nextflow community. It also incorporates a number of new features, such as native Groovy sample sheet validation. + +Earlier versions of the plugin can be found in the [nf-validation](https://github.com/nextflow-io/nf-validation) repository and can still be used in pipelines. However, the nf-validation plugin is no longer supported and all development has been moved to nf-schema. diff --git a/docs/migration_guide.md b/docs/migration_guide.md index 09d74ac4..c5df2e8e 100644 --- a/docs/migration_guide.md +++ b/docs/migration_guide.md @@ -1,21 +1,22 @@ --- title: Migration guide -description: Guide to migrate pipelines using nf-schema pre v2.0.0 to after v2.0.0 +description: Guide to migrate pipelines from nf-validation to nf-schema hide: - toc --- # Migration guide -This guide is intended to help you migrate your pipeline from older versions of the plugin to version 2.0.0 and later. +This guide is intended to help you migrate your pipeline from [nf-validation](https://github.com/nextflow-io/nf-validation) to nf-schema. 
## Major changes in the plugin -Following list shows the major breaking changes introduced in version 2.0.0: +The following list shows the major breaking changes introduced in nf-schema: 1. The JSON schema draft has been updated from `draft-07` to `draft-2020-12`. See [JSON Schema draft 2020-12 release notes](https://json-schema.org/draft/2020-12/release-notes) and [JSON schema draft 2019-09 release notes](https://json-schema.org/draft/2019-09/release-notes) for more information. -2. The `unique` keyword for samplesheet schemas has been removed. Please use [`uniqueItems`](https://json-schema.org/understanding-json-schema/reference/array#uniqueItems) or [`uniqueEntries`](nextflow_schema/nextflow_schema_specification.md#uniqueentries) now instead. -3. The `dependentRequired` keyword now works as it's supposed to work in JSON schema. See [`dependentRequired`](https://json-schema.org/understanding-json-schema/reference/conditionals#dependentRequired) for more information +2. The `fromSamplesheet` channel factory has been converted to a function called `samplesheetToList`. See [updating `fromSamplesheet`](#updating-fromsamplesheet) for more information. +3. The `unique` keyword for samplesheet schemas has been removed. Please use [`uniqueItems`](https://json-schema.org/understanding-json-schema/reference/array#uniqueItems) or [`uniqueEntries`](nextflow_schema/nextflow_schema_specification.md#uniqueentries) instead. +4. The `dependentRequired` keyword now works as it is supposed to in JSON schema. See [`dependentRequired`](https://json-schema.org/understanding-json-schema/reference/conditionals#dependentRequired) for more information. A full list of changes can be found in the [changelog](https://github.com/nextflow-io/nf-schema/blob/master/CHANGELOG.md). @@ -31,9 +32,29 @@ This will replace the old schema draft specification (`draft-07`) by the new one !!! note - Repeat this command for every JSON schema you use in your pipeline. e.g. 
for the default samplesheet schema: + Repeat this command for every JSON schema you use in your pipeline. e.g. for the default samplesheet schema in nf-core pipelines: `bash sed -i -e 's/http:\/\/json-schema.org\/draft-07\/schema/https:\/\/json-schema.org\/draft\/2020-12\/schema/g' -e 's/definitions/defs/g' assets/schema_input.json ` +Next, replace the `.fromSamplesheet` channel factory with the `samplesheetToList` function. The following tabs show the difference between the two plugins: + +=== "nf-validation" + + ```groovy + include { fromSamplesheet } from 'plugin/nf-validation' + Channel.fromSamplesheet("input") + ``` + +=== "nf-schema" + + ```groovy + include { samplesheetToList } from 'plugin/nf-schema' + Channel.fromList(samplesheetToList(params.input, "path/to/samplesheet/schema")) + ``` + +!!! note + + This change was necessary to make it possible for pipelines to be used as pluggable workflows. This also enables the validation and conversion of files generated by the pipeline. + If you are using any special features in your schemas, you will need to update your schemas manually. Please refer to the [JSON Schema draft 2020-12 release notes](https://json-schema.org/draft/2020-12/release-notes) and [JSON schema draft 2019-09 release notes](https://json-schema.org/draft/2019-09/release-notes) for more information. 
However here are some guides to the more common migration patterns: @@ -44,7 +65,7 @@ When you use `unique` in your schemas, you should update it to use `uniqueItems` If you used the `unique:true` field, you should update it to use `uniqueItems` like this: -=== "Before v2.0" +=== "nf-validation" ```json hl_lines="9" { @@ -62,7 +83,7 @@ If you used the `unique:true` field, you should update it to use `uniqueItems` l } ``` -=== "After v2.0" +=== "nf-schema" ```json hl_lines="12" { @@ -82,7 +103,7 @@ If you used the `unique:true` field, you should update it to use `uniqueItems` l If you used the `unique: ["field1", "field2"]` field, you should update it to use `uniqueEntries` like this: -=== "Before v2.0" +=== "nf-validation" ```json hl_lines="9" { @@ -100,7 +121,7 @@ If you used the `unique: ["field1", "field2"]` field, you should update it to us } ``` -=== "After v2.0" +=== "nf-schema" ```json hl_lines="12" { @@ -122,7 +143,7 @@ If you used the `unique: ["field1", "field2"]` field, you should update it to us When you use `dependentRequired` in your schemas, you should update it like this: -=== "Before v2.0" +=== "nf-validation" ```json hl_lines="12" { @@ -142,7 +163,7 @@ When you use `dependentRequired` in your schemas, you should update it like this } ``` -=== "After v2.0" +=== "nf-schema" ```json hl_lines="14 15 16" { diff --git a/docs/nextflow_schema/create_schema.md b/docs/nextflow_schema/create_schema.md index 2b248e5e..5fd9ea57 100644 --- a/docs/nextflow_schema/create_schema.md +++ b/docs/nextflow_schema/create_schema.md @@ -76,4 +76,6 @@ This web interface is where you should add detail to your schema, customising th There is currently no tooling to help you write sample sheet schema :anguished: + You can find an example in [Example sample sheet schema](sample_sheet_schema_examples.md) + Watch this space.. 
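The `sed` one-liner from the migration guide above can be sanity-checked on a throwaway file before touching real schemas. A minimal sketch (the schema content is illustrative, and `sed -i` without a suffix assumes GNU sed; BSD/macOS sed needs `-i ''`):

```shell
# Create a scratch copy of a draft-07 style schema (contents are illustrative)
tmpdir=$(mktemp -d)
cat > "$tmpdir/schema_input.json" <<'EOF'
{
  "$schema": "http://json-schema.org/draft-07/schema",
  "definitions": {
    "options": {}
  }
}
EOF

# Apply the same two substitutions as the migration guide's command:
# draft-07 URL -> draft 2020-12 URL, and "definitions" -> "defs"
sed -i -e 's/http:\/\/json-schema.org\/draft-07\/schema/https:\/\/json-schema.org\/draft\/2020-12\/schema/g' \
       -e 's/definitions/defs/g' "$tmpdir/schema_input.json"

cat "$tmpdir/schema_input.json"
```

Once the output looks right, run the same command against the pipeline's real `nextflow_schema.json` and samplesheet schemas.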
diff --git a/docs/nextflow_schema/sample_sheet_schema_specification.md b/docs/nextflow_schema/sample_sheet_schema_specification.md index 3d2796c8..b27e6648 100644 --- a/docs/nextflow_schema/sample_sheet_schema_specification.md +++ b/docs/nextflow_schema/sample_sheet_schema_specification.md @@ -59,7 +59,7 @@ Fields that are present in the sample sheet, but not in the schema will be ignor !!! warning The order of properties in the _schema_ **is** important. - This order defines the order of output channel properties when using the `fromSamplesheet` channel factory. + This order defines the order of output channel properties when using the `samplesheetToList()` function. ## Common keys @@ -68,12 +68,6 @@ For example: `type`, `pattern`, `format`, `errorMessage`, `exists` and so on. Please refer to the [Nextflow schema specification](../nextflow_schema/nextflow_schema_specification.md) docs for details. -!!! tip - - Sample sheets are commonly used to define input file paths. - Be sure to set `"type": "string"`, `exists: true`, `"format": "file-path"` and `"schema":"path/to/samplesheet/schema.json"` for these properties, - so that samplesheets are correctly validated and `fromSamplesheet` does not result in any errors. - ## Sample sheet keys Below are the properties that are specific to sample sheet schema. diff --git a/docs/samplesheets/examples.md b/docs/samplesheets/examples.md index 170bea9a..bc00030b 100644 --- a/docs/samplesheets/examples.md +++ b/docs/samplesheets/examples.md @@ -7,7 +7,7 @@ description: Examples of advanced sample sheet creation techniques. ## Introduction -Understanding channel structure and manipulation is critical for getting the most out of Nextflow. nf-schema helps initialise your channels from the text inputs to get you started, but further work might be required to fit your exact use case. In this page we run through some common cases for transforming the output of `.fromSamplesheet`. 
+Understanding channel structure and manipulation is critical for getting the most out of Nextflow. nf-schema helps initialise your channels from the text inputs to get you started, but further work might be required to fit your exact use case. In this page we run through some common cases for transforming the output of `samplesheetToList()`. ### Glossary @@ -17,7 +17,7 @@ Understanding channel structure and manipulation is critical for getting the mos ## Default mode -Each item in the channel emitted by `.fromSamplesheet()` is a tuple, corresponding with each row of the sample sheet. Each item will be composed of a meta value (if present) and any additional elements from columns in the sample sheet, e.g.: +Each item in the list emitted by `samplesheetToList()` is a tuple, corresponding with each row of the sample sheet. Each item will be composed of a meta value (if present) and any additional elements from columns in the sample sheet, e.g.: ```csv sample,fastq_1,fastq_2,bed @@ -25,7 +25,7 @@ sample1,fastq1.R1.fq.gz,fastq1.R2.fq.gz,sample1.bed sample2,fastq2.R1.fq.gz,fastq2.R2.fq.gz, ``` -Might create a channel where each element consists of 4 items, a map value followed by three files: +Might create a list where each element consists of 4 items, a map value followed by three files: ```groovy // Columns: @@ -36,13 +36,13 @@ Might create a channel where each element consists of 4 items, a map value follo [ [ id: "sample2" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, [] ] // A missing value from the sample sheet is an empty list ``` -This channel can be used as input of a process where the input declaration is: +This list can be converted to a channel that can be used as input of a process where the input declaration is: ```nextflow tuple val(meta), path(fastq_1), path(fastq_2), path(bed) ``` -It may be necessary to manipulate this channel to fit your process inputs. 
For more documentation, check out the [Nextflow operator docs](https://www.nextflow.io/docs/latest/operator.html), however here are some common use cases with `.fromSamplesheet()`. +It may be necessary to manipulate this channel to fit your process inputs. For more documentation, check out the [Nextflow operator docs](https://www.nextflow.io/docs/latest/operator.html), but here are some common use cases with `samplesheetToList()`. ## Using a sample sheet with no headers @@ -73,7 +73,7 @@ or this YAML file: - test_2 ``` -The output of `.fromSamplesheet()` will look like this: +The output of `samplesheetToList()` will look like this: ```bash test_1 @@ -82,7 +82,7 @@ test_2 ## Changing the structure of channel items -Each item in the channel will be a tuple, but some processes will use multiple files as a list in their input channel, this is common in nf-core modules. For example, consider the following input declaration in a process, where FASTQ could be > 1 file: +Each item in the list will be a tuple, but some processes will use multiple files as a list in their input channel; this is common in nf-core modules. For example, consider the following input declaration in a process, where FASTQ could be > 1 file: ```groovy process ZCAT_FASTQS { @@ -95,7 +95,7 @@ } -The output of `.fromSamplesheet()` can be used by default with a process with the following input declaration: +The output of `samplesheetToList()` (converted to a channel) can be used by default by a process with the following input declaration: ```groovy val(meta), path(fastq_1), path(fastq_2) @@ -104,7 +104,7 @@ val(meta), path(fastq_1), path(fastq_2) To manipulate each item within a channel, you should use the [Nextflow `.map()` operator](https://www.nextflow.io/docs/latest/operator.html#map). This will apply a function to each element of the channel in turn. 
Here, we convert the flat tuple into a tuple composed of a meta and a list of FASTQ files: ```groovy -Channel.fromSamplesheet("input") +Channel.fromList(samplesheetToList(params.input, "path/to/json/schema")) .map { meta, fastq_1, fastq_2 -> tuple(meta, [ fastq_1, fastq_2 ]) } .set { input } @@ -122,7 +122,7 @@ ZCAT_FASTQS(input) For example, to remove the BED file from the channel created above, we could not return it from the map. Note the absence of the `bed` item in the return of the closure below: ```groovy -Channel.fromSamplesheet("input") +Channel.fromList(samplesheetToList(params.input, "path/to/json/schema")) .map { meta, fastq_1, fastq_2, bed -> tuple(meta, fastq_1, fastq_2) } .set { input } @@ -136,7 +136,7 @@ In this way you can drop items from a channel. We could perform this twice to create one channel containing the FASTQs and one containing the BED files, however Nextflow has a native operator to separate channels called [`.multiMap()`](https://www.nextflow.io/docs/latest/operator.html#multimap). Here, we separate the FASTQs and BEDs into two separate channels using `multiMap`. Note, the channels are both contained in `input` and accessed as an attribute using dot notation: ```groovy -Channel.fromSamplesheet("input") +Channel.fromList(samplesheetToList(params.input, "path/to/json/schema")) .multiMap { meta, fastq_1, fastq_2, bed -> fastq: tuple(meta, fastq_1, fastq_2) bed: tuple(meta, bed) @@ -163,7 +163,7 @@ This example shows a channel which can have entries for WES or WGS data. 
WES dat // Channel with four elements - see docs for examples params.input = "samplesheet.csv" -Channel.fromSamplesheet("input") +Channel.fromList(samplesheetToList(params.input, "path/to/json/schema")) .branch { meta, fastq_1, fastq_2, bed -> // If BED does not exist WGS: !bed @@ -178,13 +178,13 @@ input.WGS.view() // Channel has 3 elements: meta, fastq_1, fastq_2 input.WES.view() // Channel has 4 elements: meta, fastq_1, fastq_2, bed ``` -Unlike `multiMap`, the outputs of `.branch()`, the resulting channels will contain a different number of items. +Unlike `.multiMap()`, the outputs of `.branch()` will contain a different number of items. ## Combining a channel After splitting the channel, it may be necessary to rejoin the channel. There are many ways to join a channel, but here we will demonstrate the simplest which uses the [Nextflow join operator](https://www.nextflow.io/docs/latest/operator.html#join) to rejoin any of the channels from above based on the first element in each item, the `meta` value. -```nextflow +```groovy input.fastq.view() // Channel has 3 elements: meta, fastq_1, fastq_2 input.bed.view() // Channel has 2 elements: meta, bed @@ -204,14 +204,14 @@ It's useful to determine the count of channel entries with similar values when y This example contains a channel where multiple samples can be in the same family. Later on in the pipeline we want to merge the analyzed files so one file gets created for each family. The result will be a channel with an extra meta field containing the count of channel entries with the same family name. 
```groovy -// channel created by fromSamplesheet() previous to modification: +// channel created with samplesheetToList() previous to modification: // [[id:example1, family:family1], example1.txt] // [[id:example2, family:family1], example2.txt] // [[id:example3, family:family2], example3.txt] params.input = "sample sheet.csv" -Channel.fromSamplesheet("input") +Channel.fromList(samplesheetToList(params.input, "path/to/json/schema")) .tap { ch_raw } // Create a copy of the original channel .map { meta, txt -> [ meta.family ] } // Isolate the value to count on .reduce([:]) { counts, family -> // Creates a map like this: [family1:2, family2:1] diff --git a/docs/samplesheets/fromSamplesheet.md b/docs/samplesheets/fromSamplesheet.md deleted file mode 100644 index 6a4f56e3..00000000 --- a/docs/samplesheets/fromSamplesheet.md +++ /dev/null @@ -1,153 +0,0 @@ ---- -title: Create a channel -description: Channel factory to create a channel from a sample sheet. ---- - -# Create a channel from a sample sheet - -## `fromSamplesheet` - -This function validates and converts a sample sheet to a ready-to-use Nextflow channel. This is done using information encoded within a sample sheet schema (see the [docs](../nextflow_schema/sample_sheet_schema_specification.md)). - -The function has one mandatory argument: the name of the parameter which specifies the input sample sheet. The parameter specified must have the format `file-path` and include additional field `schema`: - -```json hl_lines="4" -{ - "type": "string", - "format": "file-path", - "schema": "assets/foo_schema.json" -} -``` - -The path specified in the `schema` key determines the JSON used for validation of the sample sheet. - -When using the `.fromSamplesheet` channel factory, one optional arguments can be used: - -- `parameters_schema`: File name for the pipeline parameters schema. 
(Default: `nextflow_schema.json`) - -```groovy -Channel.fromSamplesheet('input') -``` - -```groovy -Channel.fromSamplesheet('input', parameters_schema: 'custom_nextflow_schema.json') -``` - -## Basic example - -In [this example](https://github.com/nextflow-io/nf-schema/tree/master/examples/fromSamplesheetBasic), we create a simple channel from a CSV sample sheet. - -``` ---8<-- "examples/fromSamplesheetBasic/log.txt" -``` - -=== "main.nf" - - ```groovy - --8<-- "examples/fromSamplesheetBasic/pipeline/main.nf" - ``` - -=== "samplesheet.csv" - - ```csv - --8<-- "examples/fromSamplesheetBasic/samplesheet.csv" - ``` - -=== "nextflow.config" - - ```groovy - --8<-- "examples/fromSamplesheetBasic/pipeline/nextflow.config" - ``` - -=== "nextflow_schema.json" - - ```json hl_lines="19" - --8<-- "examples/fromSamplesheetBasic/pipeline/nextflow_schema.json" - ``` - -=== "assets/schema_input.json" - - ```json - --8<-- "examples/fromSamplesheetBasic/pipeline/assets/schema_input.json" - ``` - -## Order of fields - -[This example](https://github.com/nextflow-io/nf-schema/tree/master/examples/fromSamplesheetOrder) demonstrates that the order of columns in the sample sheet file has no effect. - -!!! danger - - It is the order of fields **in the sample sheet JSON schema** which defines the order of items in the channel returned by `fromSamplesheet()`, _not_ the order of fields in the sample sheet file. 
- -``` ---8<-- "examples/fromSamplesheetOrder/log.txt" -``` - -=== "samplesheet.csv" - - ```csv - --8<-- "examples/fromSamplesheetOrder/samplesheet.csv" - ``` - -=== "assets/schema_input.json" - - ```json hl_lines="10 15 20 33" - --8<-- "examples/fromSamplesheetOrder/pipeline/assets/schema_input.json" - ``` - -=== "main.nf" - - ```groovy - --8<-- "examples/fromSamplesheetOrder/pipeline/main.nf" - ``` - -=== "nextflow.config" - - ```groovy - --8<-- "examples/fromSamplesheetOrder/pipeline/nextflow.config" - ``` - -=== "nextflow_schema.json" - - ```json - --8<-- "examples/fromSamplesheetOrder/pipeline/nextflow_schema.json" - ``` - -## Channel with meta map - -In [this example](https://github.com/nextflow-io/nf-schema/tree/master/examples/fromSamplesheetMeta), we use the schema to mark two columns as meta fields. -This returns a channel with a meta map. - -``` ---8<-- "examples/fromSamplesheetMeta/log.txt" -``` - -=== "assets/schema_input.json" - - ```json hl_lines="14 38" - --8<-- "examples/fromSamplesheetMeta/pipeline/assets/schema_input.json" - ``` - -=== "main.nf" - - ```groovy - --8<-- "examples/fromSamplesheetMeta/pipeline/main.nf" - ``` - -=== "samplesheet.csv" - - ```csv - --8<-- "examples/fromSamplesheetMeta/samplesheet.csv" - ``` - -=== "nextflow.config" - - ```groovy - --8<-- "examples/fromSamplesheetMeta/pipeline/nextflow.config" - ``` - -=== "nextflow_schema.json" - - ```json - --8<-- "examples/fromSamplesheetMeta/pipeline/nextflow_schema.json" - ``` diff --git a/docs/samplesheets/samplesheetToList.md b/docs/samplesheets/samplesheetToList.md new file mode 100644 index 00000000..bd54cb9d --- /dev/null +++ b/docs/samplesheets/samplesheetToList.md @@ -0,0 +1,148 @@ +--- +title: Create a list +description: Function to create a list from a sample sheet. +--- + +# Create a list from a sample sheet + +## `samplesheetToList` + +This function validates and converts a sample sheet to a list. 
This is done using information encoded within a sample sheet schema (see the [docs](../nextflow_schema/sample_sheet_schema_specification.md)). + +The function has two mandatory arguments: + +1. The path to the samplesheet +2. The path to the JSON schema file corresponding to the samplesheet. + +Both arguments can be either a string containing the relative path (from the root of the pipeline) or a file object. + +```groovy +samplesheetToList("path/to/samplesheet", "path/to/json/schema") +``` + +!!! note + + All data points in the CSV and TSV samplesheets will be converted to their inferred type (e.g. `"true"` will be converted to the Boolean `true` and `"2"` will be converted to the Integer `2`). If this is not the expected behaviour, you can convert these values back to a String with `.map { val -> val.toString() }` + +This function can be used together with existing channel factories/operators to create one channel entry per samplesheet entry. + +### Use as a channel factory + +The output of the function can be passed to the `Channel.fromList` channel factory to mimic a dedicated channel factory: + +```groovy +Channel.fromList(samplesheetToList("path/to/samplesheet", "path/to/json/schema")) +``` + +!!! note + + This mimics the `fromSamplesheet` channel factory as it was in [nf-validation](https://github.com/nextflow-io/nf-validation). + +### Use as a channel operator + +The function can be used with the `.flatMap` channel operator to create a channel from samplesheets that are already in a channel: + +```groovy +Channel.of("path/to/samplesheet").flatMap { samplesheetToList(it, "path/to/json/schema") } +``` + +## Basic example + +In [this example](https://github.com/nextflow-io/nf-schema/tree/master/examples/samplesheetToListBasic), we create a simple channel from a CSV sample sheet.
+ +``` +--8<-- "examples/samplesheetToListBasic/log.txt" +``` + +=== "main.nf" + + ```groovy + --8<-- "examples/samplesheetToListBasic/pipeline/main.nf" + ``` + +=== "samplesheet.csv" + + ```csv + --8<-- "examples/samplesheetToListBasic/samplesheet.csv" + ``` + +=== "nextflow.config" + + ```groovy + --8<-- "examples/samplesheetToListBasic/pipeline/nextflow.config" + ``` + +=== "assets/schema_input.json" + + ```json + --8<-- "examples/samplesheetToListBasic/pipeline/assets/schema_input.json" + ``` + +## Order of fields + +[This example](https://github.com/nextflow-io/nf-schema/tree/master/examples/samplesheetToListOrder) demonstrates that the order of columns in the sample sheet file has no effect. + +!!! danger + + It is the order of fields **in the sample sheet JSON schema** which defines the order of items in the channel returned by `samplesheetToList()`, _not_ the order of fields in the sample sheet file. + +``` +--8<-- "examples/samplesheetToListOrder/log.txt" +``` + +=== "samplesheet.csv" + + ```csv + --8<-- "examples/samplesheetToListOrder/samplesheet.csv" + ``` + +=== "assets/schema_input.json" + + ```json hl_lines="10 15 20 25" + --8<-- "examples/samplesheetToListOrder/pipeline/assets/schema_input.json" + ``` + +=== "main.nf" + + ```groovy + --8<-- "examples/samplesheetToListOrder/pipeline/main.nf" + ``` + +=== "nextflow.config" + + ```groovy + --8<-- "examples/samplesheetToListOrder/pipeline/nextflow.config" + ``` + +## Channel with meta map + +In [this example](https://github.com/nextflow-io/nf-schema/tree/master/examples/samplesheetToListMeta), we use the schema to mark two columns as meta fields. +This returns a channel with a meta map. 
+ +``` +--8<-- "examples/samplesheetToListMeta/log.txt" +``` + +=== "assets/schema_input.json" + + ```json hl_lines="14 30" + --8<-- "examples/samplesheetToListMeta/pipeline/assets/schema_input.json" + ``` + +=== "main.nf" + + ```groovy + --8<-- "examples/samplesheetToListMeta/pipeline/main.nf" + ``` + +=== "samplesheet.csv" + + ```csv + --8<-- "examples/samplesheetToListMeta/samplesheet.csv" + ``` + +=== "nextflow.config" + + ```groovy + --8<-- "examples/samplesheetToListMeta/pipeline/nextflow.config" + ``` diff --git a/docs/samplesheets/validate_sample_sheet.md b/docs/samplesheets/validate_sample_sheet.md index 9fef1eaf..f9c608e2 100644 --- a/docs/samplesheets/validate_sample_sheet.md +++ b/docs/samplesheets/validate_sample_sheet.md @@ -26,4 +26,8 @@ See an example in the `input` field from the [example schema.json](https://raw.g } ``` +!!! note + + The `samplesheetToList` function also validates the samplesheet before converting it. If you use this function to convert a samplesheet, it is not necessary to add a schema to the parameter corresponding to that samplesheet. + For more information about the sample sheet JSON schema refer to [sample sheet docs](../nextflow_schema/nextflow_schema_specification.md).
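The new `samplesheetToList` docs note that CSV/TSV cells are dynamically typecast (e.g. `"true"` becomes a Boolean, `"2"` becomes an Integer) rather than cast via the schema's `type` fields. A minimal Python sketch of that idea, for illustration only — `infer_type` is a hypothetical helper, not part of the nf-schema plugin, and the real implementation may differ:

```python
def infer_type(value: str):
    """Infer a typed value from a raw CSV/TSV cell, mimicking the
    dynamic typecasting described for samplesheetToList()."""
    # Booleans first, so "true"/"false" are not left as plain strings
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    # Then integers, then floats
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value  # fall back to the original string


row = {"sample": "sample1", "replicate": "2", "paired": "true"}
typed = {key: infer_type(cell) for key, cell in row.items()}
# typed == {"sample": "sample1", "replicate": 2, "paired": True}
```

If the typed values are not what a downstream process expects, the docs suggest mapping them back to strings in the pipeline, e.g. `.map { val -> val.toString() }`.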
diff --git a/examples/fromSamplesheetBasic/pipeline/main.nf b/examples/fromSamplesheetBasic/pipeline/main.nf deleted file mode 100644 index a02f1ac8..00000000 --- a/examples/fromSamplesheetBasic/pipeline/main.nf +++ /dev/null @@ -1,5 +0,0 @@ -include { fromSamplesheet } from 'plugin/nf-schema' - -ch_input = Channel.fromSamplesheet("input") - -ch_input.view() diff --git a/examples/fromSamplesheetBasic/pipeline/nextflow_schema.json b/examples/fromSamplesheetBasic/pipeline/nextflow_schema.json deleted file mode 100644 index 6096ceb9..00000000 --- a/examples/fromSamplesheetBasic/pipeline/nextflow_schema.json +++ /dev/null @@ -1,39 +0,0 @@ -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json", - "title": "nf-core/testpipeline pipeline parameters", - "description": "this is a test", - "type": "object", - "defs": { - "input_output_options": { - "title": "Input/output options", - "type": "object", - "fa_icon": "fas fa-terminal", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { - "input": { - "type": "string", - "format": "file-path", - "mimetype": "text/csv", - "schema": "assets/schema_input.json", - "pattern": "^\\S+\\.(csv|tsv|yaml|json)$", - "description": "Path to comma-separated file containing information about the samples in the experiment.", - "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).", - "fa_icon": "fas fa-file-csv" - }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.", - "fa_icon": "fas fa-folder-open" - } - } - } - }, - "allOf": [ - { - "$ref": "#/defs/input_output_options" - } - ] -} diff --git a/examples/fromSamplesheetMeta/pipeline/main.nf b/examples/fromSamplesheetMeta/pipeline/main.nf deleted file mode 100644 index a02f1ac8..00000000 --- a/examples/fromSamplesheetMeta/pipeline/main.nf +++ /dev/null @@ -1,5 +0,0 @@ -include { fromSamplesheet } from 'plugin/nf-schema' - -ch_input = Channel.fromSamplesheet("input") - -ch_input.view() diff --git a/examples/fromSamplesheetMeta/pipeline/nextflow_schema.json b/examples/fromSamplesheetMeta/pipeline/nextflow_schema.json deleted file mode 100644 index 6096ceb9..00000000 --- a/examples/fromSamplesheetMeta/pipeline/nextflow_schema.json +++ /dev/null @@ -1,39 +0,0 @@ -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json", - "title": "nf-core/testpipeline pipeline parameters", - "description": "this is a test", - "type": "object", - "defs": { - "input_output_options": { - "title": "Input/output options", - "type": "object", - "fa_icon": "fas fa-terminal", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { - "input": { - "type": "string", - "format": "file-path", - "mimetype": "text/csv", - "schema": "assets/schema_input.json", - "pattern": "^\\S+\\.(csv|tsv|yaml|json)$", - "description": "Path to comma-separated file containing information about the samples in the experiment.", - "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. 
See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).", - "fa_icon": "fas fa-file-csv" - }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", - "fa_icon": "fas fa-folder-open" - } - } - } - }, - "allOf": [ - { - "$ref": "#/defs/input_output_options" - } - ] -} diff --git a/examples/fromSamplesheetOrder/pipeline/main.nf b/examples/fromSamplesheetOrder/pipeline/main.nf deleted file mode 100644 index a02f1ac8..00000000 --- a/examples/fromSamplesheetOrder/pipeline/main.nf +++ /dev/null @@ -1,5 +0,0 @@ -include { fromSamplesheet } from 'plugin/nf-schema' - -ch_input = Channel.fromSamplesheet("input") - -ch_input.view() diff --git a/examples/fromSamplesheetOrder/pipeline/nextflow_schema.json b/examples/fromSamplesheetOrder/pipeline/nextflow_schema.json deleted file mode 100644 index 6096ceb9..00000000 --- a/examples/fromSamplesheetOrder/pipeline/nextflow_schema.json +++ /dev/null @@ -1,39 +0,0 @@ -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json", - "title": "nf-core/testpipeline pipeline parameters", - "description": "this is a test", - "type": "object", - "defs": { - "input_output_options": { - "title": "Input/output options", - "type": "object", - "fa_icon": "fas fa-terminal", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { - "input": { - "type": "string", - "format": "file-path", - "mimetype": "text/csv", - "schema": "assets/schema_input.json", - "pattern": "^\\S+\\.(csv|tsv|yaml|json)$", - "description": "Path to comma-separated file containing information about the samples in the experiment.", - "help_text": "You will need to create a design file with information about the samples in your 
experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).", - "fa_icon": "fas fa-file-csv" - }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", - "fa_icon": "fas fa-folder-open" - } - } - } - }, - "allOf": [ - { - "$ref": "#/defs/input_output_options" - } - ] -} diff --git a/examples/fromSamplesheetBasic/launch.sh b/examples/samplesheetToListBasic/launch.sh similarity index 100% rename from examples/fromSamplesheetBasic/launch.sh rename to examples/samplesheetToListBasic/launch.sh diff --git a/examples/fromSamplesheetBasic/log.txt b/examples/samplesheetToListBasic/log.txt similarity index 100% rename from examples/fromSamplesheetBasic/log.txt rename to examples/samplesheetToListBasic/log.txt diff --git a/examples/fromSamplesheetBasic/pipeline/assets/schema_input.json b/examples/samplesheetToListBasic/pipeline/assets/schema_input.json similarity index 84% rename from examples/fromSamplesheetBasic/pipeline/assets/schema_input.json rename to examples/samplesheetToListBasic/pipeline/assets/schema_input.json index aa527ed5..56f6a959 100644 --- a/examples/fromSamplesheetBasic/pipeline/assets/schema_input.json +++ b/examples/samplesheetToListBasic/pipeline/assets/schema_input.json @@ -19,16 +19,8 @@ }, "fastq_2": { "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz$" - }, - { - "type": "string", - "maxLength": 0 - } - ] + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz$" }, "strandedness": { "type": "string", diff --git a/examples/samplesheetToListBasic/pipeline/main.nf 
b/examples/samplesheetToListBasic/pipeline/main.nf new file mode 100644 index 00000000..ea045517 --- /dev/null +++ b/examples/samplesheetToListBasic/pipeline/main.nf @@ -0,0 +1,5 @@ +include { samplesheetToList } from 'plugin/nf-schema' + +ch_input = Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json")) + +ch_input.view() diff --git a/examples/fromSamplesheetBasic/pipeline/nextflow.config b/examples/samplesheetToListBasic/pipeline/nextflow.config similarity index 100% rename from examples/fromSamplesheetBasic/pipeline/nextflow.config rename to examples/samplesheetToListBasic/pipeline/nextflow.config diff --git a/examples/fromSamplesheetBasic/samplesheet.csv b/examples/samplesheetToListBasic/samplesheet.csv similarity index 100% rename from examples/fromSamplesheetBasic/samplesheet.csv rename to examples/samplesheetToListBasic/samplesheet.csv diff --git a/examples/fromSamplesheetMeta/launch.sh b/examples/samplesheetToListMeta/launch.sh similarity index 100% rename from examples/fromSamplesheetMeta/launch.sh rename to examples/samplesheetToListMeta/launch.sh diff --git a/examples/fromSamplesheetMeta/log.txt b/examples/samplesheetToListMeta/log.txt similarity index 100% rename from examples/fromSamplesheetMeta/log.txt rename to examples/samplesheetToListMeta/log.txt diff --git a/examples/fromSamplesheetMeta/pipeline/assets/schema_input.json b/examples/samplesheetToListMeta/pipeline/assets/schema_input.json similarity index 85% rename from examples/fromSamplesheetMeta/pipeline/assets/schema_input.json rename to examples/samplesheetToListMeta/pipeline/assets/schema_input.json index 7a931a25..ab42363a 100644 --- a/examples/fromSamplesheetMeta/pipeline/assets/schema_input.json +++ b/examples/samplesheetToListMeta/pipeline/assets/schema_input.json @@ -20,16 +20,8 @@ }, "fastq_2": { "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", - "anyOf": [ - { - "type": "string", - "pattern": 
"^\\S+\\.f(ast)?q\\.gz$" - }, - { - "type": "string", - "maxLength": 0 - } - ] + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz$" }, "strandedness": { "type": "string", diff --git a/examples/samplesheetToListMeta/pipeline/main.nf b/examples/samplesheetToListMeta/pipeline/main.nf new file mode 100644 index 00000000..ea045517 --- /dev/null +++ b/examples/samplesheetToListMeta/pipeline/main.nf @@ -0,0 +1,5 @@ +include { samplesheetToList } from 'plugin/nf-schema' + +ch_input = Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json")) + +ch_input.view() diff --git a/examples/fromSamplesheetMeta/pipeline/nextflow.config b/examples/samplesheetToListMeta/pipeline/nextflow.config similarity index 100% rename from examples/fromSamplesheetMeta/pipeline/nextflow.config rename to examples/samplesheetToListMeta/pipeline/nextflow.config diff --git a/examples/fromSamplesheetMeta/samplesheet.csv b/examples/samplesheetToListMeta/samplesheet.csv similarity index 100% rename from examples/fromSamplesheetMeta/samplesheet.csv rename to examples/samplesheetToListMeta/samplesheet.csv diff --git a/examples/fromSamplesheetOrder/launch.sh b/examples/samplesheetToListOrder/launch.sh similarity index 100% rename from examples/fromSamplesheetOrder/launch.sh rename to examples/samplesheetToListOrder/launch.sh diff --git a/examples/fromSamplesheetOrder/log.txt b/examples/samplesheetToListOrder/log.txt similarity index 100% rename from examples/fromSamplesheetOrder/log.txt rename to examples/samplesheetToListOrder/log.txt diff --git a/examples/fromSamplesheetOrder/pipeline/assets/schema_input.json b/examples/samplesheetToListOrder/pipeline/assets/schema_input.json similarity index 84% rename from examples/fromSamplesheetOrder/pipeline/assets/schema_input.json rename to examples/samplesheetToListOrder/pipeline/assets/schema_input.json index a51e24f6..fbbd703e 100644 --- a/examples/fromSamplesheetOrder/pipeline/assets/schema_input.json +++ 
b/examples/samplesheetToListOrder/pipeline/assets/schema_input.json @@ -19,16 +19,8 @@ }, "fastq_2": { "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz$" - }, - { - "type": "string", - "maxLength": 0 - } - ] + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz$" }, "fastq_1": { "type": "string", diff --git a/examples/samplesheetToListOrder/pipeline/main.nf b/examples/samplesheetToListOrder/pipeline/main.nf new file mode 100644 index 00000000..ea045517 --- /dev/null +++ b/examples/samplesheetToListOrder/pipeline/main.nf @@ -0,0 +1,5 @@ +include { samplesheetToList } from 'plugin/nf-schema' + +ch_input = Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json")) + +ch_input.view() diff --git a/examples/fromSamplesheetOrder/pipeline/nextflow.config b/examples/samplesheetToListOrder/pipeline/nextflow.config similarity index 100% rename from examples/fromSamplesheetOrder/pipeline/nextflow.config rename to examples/samplesheetToListOrder/pipeline/nextflow.config diff --git a/examples/fromSamplesheetOrder/samplesheet.csv b/examples/samplesheetToListOrder/samplesheet.csv similarity index 100% rename from examples/fromSamplesheetOrder/samplesheet.csv rename to examples/samplesheetToListOrder/samplesheet.csv diff --git a/plugins/nf-schema/src/main/nextflow/validation/SamplesheetConverter.groovy b/plugins/nf-schema/src/main/nextflow/validation/SamplesheetConverter.groovy index 5fe8b21b..3fbe3c29 100644 --- a/plugins/nf-schema/src/main/nextflow/validation/SamplesheetConverter.groovy +++ b/plugins/nf-schema/src/main/nextflow/validation/SamplesheetConverter.groovy @@ -3,9 +3,10 @@ package nextflow.validation import groovy.json.JsonSlurper import groovy.transform.CompileStatic import groovy.util.logging.Slf4j - import java.nio.file.Path +import org.json.JSONArray + import nextflow.Nextflow /** @@ -20,10 +21,14 @@ class 
SamplesheetConverter { private static Path samplesheetFile private static Path schemaFile + private static nextflow.script.ScriptBinding$ParamsMap params + private static Map options - SamplesheetConverter(Path samplesheetFile, Path schemaFile) { + SamplesheetConverter(Path samplesheetFile, Path schemaFile, nextflow.script.ScriptBinding$ParamsMap params, Map options) { this.samplesheetFile = samplesheetFile this.schemaFile = schemaFile + this.params = params + this.options = options } private static List rows = [] @@ -62,8 +67,37 @@ class SamplesheetConverter { /* Convert the samplesheet to a list of entries based on a schema */ - public static List convertToList() { + public static List validateAndConvertToList() { + + // Logging + def Boolean useMonochromeLogs = this.options?.containsKey('monochrome_logs') ? this.options.monochrome_logs as Boolean : + this.params.monochrome_logs ? this.params.monochrome_logs as Boolean : + this.params.monochromeLogs ? this.params.monochromeLogs as Boolean : + false + def colors = Utils.logColours(useMonochromeLogs) + + // Some checks before validating + if(!this.schemaFile.exists()) { + def msg = "${colors.red}JSON schema file ${this.schemaFile.toString()} does not exist\n${colors.reset}\n" + throw new SchemaValidationException(msg) + } + + if(!this.samplesheetFile.exists()) { + def msg = "${colors.red}Samplesheet file ${this.samplesheetFile.toString()} does not exist\n${colors.reset}\n" + throw new SchemaValidationException(msg) + } + + // Validate + final validator = new JsonSchemaValidator() + def JSONArray samplesheet = Utils.fileToJsonArray(this.samplesheetFile, this.schemaFile) + def List validationErrors = validator.validate(samplesheet, this.schemaFile.text) + if (validationErrors) { + def msg = "${colors.red}The following errors have been detected in ${this.samplesheetFile.toString()}:\n\n" + validationErrors.join('\n').trim() + "\n${colors.reset}\n" + log.error("Validation of samplesheet failed!") + throw new 
SchemaValidationException(msg, validationErrors) + } + // Convert def LinkedHashMap schemaMap = new JsonSlurper().parseText(this.schemaFile.text) as LinkedHashMap def List samplesheetList = Utils.fileToList(this.samplesheetFile, this.schemaFile) diff --git a/plugins/nf-schema/src/main/nextflow/validation/SchemaValidationException.groovy b/plugins/nf-schema/src/main/nextflow/validation/SchemaValidationException.groovy index f772ec68..987c2594 100644 --- a/plugins/nf-schema/src/main/nextflow/validation/SchemaValidationException.groovy +++ b/plugins/nf-schema/src/main/nextflow/validation/SchemaValidationException.groovy @@ -14,7 +14,7 @@ class SchemaValidationException extends AbortOperationException { List getErrors() { errors } - SchemaValidationException(String message, List errors) { + SchemaValidationException(String message, List errors=[]) { super(message) this.errors = new ArrayList<>(errors) } diff --git a/plugins/nf-schema/src/main/nextflow/validation/SchemaValidator.groovy b/plugins/nf-schema/src/main/nextflow/validation/SchemaValidator.groovy index 6d755574..f8fbac12 100644 --- a/plugins/nf-schema/src/main/nextflow/validation/SchemaValidator.groovy +++ b/plugins/nf-schema/src/main/nextflow/validation/SchemaValidator.groovy @@ -6,15 +6,17 @@ import groovy.json.JsonGenerator import groovy.transform.CompileStatic import groovy.util.logging.Slf4j import groovyx.gpars.dataflow.DataflowWriteChannel +import groovyx.gpars.dataflow.DataflowReadChannel import java.nio.file.Files import java.nio.file.Path import java.util.regex.Matcher import java.util.regex.Pattern import nextflow.extension.CH +import nextflow.extension.DataflowHelper import nextflow.Channel import nextflow.Global import nextflow.Nextflow -import nextflow.plugin.extension.Factory +import nextflow.plugin.extension.Operator import nextflow.plugin.extension.Function import nextflow.plugin.extension.PluginExtensionPoint import nextflow.script.WorkflowMetadata @@ -133,78 +135,46 @@ class SchemaValidator 
extends PluginExtensionPoint { m.findResult { k, v -> v instanceof Map ? findDeep(v, key) : null } } - @Factory - public DataflowWriteChannel fromSamplesheet( - Map options = null, - String samplesheetParam + @Function + public List samplesheetToList( + final CharSequence samplesheet, + final CharSequence schema, + final Map options = null ) { - def Map params = session.params - - // Set defaults for optional inputs - def String schemaFilename = options?.containsKey('parameters_schema') ? options.parameters_schema as String : 'nextflow_schema.json' - def String baseDir = session.baseDir.toString() - - // Get the samplesheet schema from the parameters schema - def slurper = new JsonSlurper() - def Map parsed = (Map) slurper.parse( Path.of(Utils.getSchemaPath(baseDir, schemaFilename)) ) - def Map samplesheetValue = (Map) findDeep(parsed, samplesheetParam) - def Path samplesheetFile = params[samplesheetParam] as Path - - // Some safeguard to make sure the channel factory runs correctly - if (samplesheetValue == null) { - log.error """ -Parameter '--$samplesheetParam' was not found in the schema ($schemaFilename). -Unable to create a channel from it. - -Please make sure you correctly specified the inputs to `.fromSamplesheet`: - --------------------------------------------------------------------------------------- -Channel.fromSamplesheet("input") --------------------------------------------------------------------------------------- - -This would create a channel from params.input using the schema specified in the parameters JSON schema for this parameter. -""" - throw new SchemaValidationException("", []) - } - else if (samplesheetFile == null) { - log.error "Parameter '--$samplesheetParam' was not provided. Unable to create a channel from it." - throw new SchemaValidationException("", []) - } - else if (!samplesheetValue.containsKey('schema')) { - log.error "Parameter '--$samplesheetParam' does not contain a schema in the parameter schema ($schemaFilename). 
Unable to create a channel from it." - throw new SchemaValidationException("", []) - } - - // Convert to channel - final channel = CH.create() - def List arrayChannel = [] - try { - def Path schemaFile = Path.of(Utils.getSchemaPath(baseDir, samplesheetValue['schema'].toString())) - def SamplesheetConverter converter = new SamplesheetConverter(samplesheetFile, schemaFile) - arrayChannel = converter.convertToList() - } catch (Exception e) { - log.error( - """ Following error has been found during samplesheet conversion: - ${e} - ${e.getStackTrace().join("\n\t")} - -Please run validateParameters() first before trying to convert a samplesheet to a channel. -Reference: https://nextflow-io.github.io/nf-schema/parameters/validation/ + def Path samplesheetFile = Nextflow.file(samplesheet) as Path + return samplesheetToList(samplesheetFile, schema, options) + } -Also make sure that the same schema is used for validation and conversion of the samplesheet -""" as String - ) - } + @Function + public List samplesheetToList( + final Path samplesheet, + final CharSequence schema, + final Map options = null + ) { + def String fullPathSchema = Utils.getSchemaPath(session.baseDir.toString(), schema as String) + def Path schemaFile = Nextflow.file(fullPathSchema) as Path + return samplesheetToList(samplesheet, schemaFile, options) + } - session.addIgniter { - arrayChannel.each { - channel.bind(it) - } - channel.bind(Channel.STOP) - } - return channel + @Function + public List samplesheetToList( + final CharSequence samplesheet, + final Path schema, + final Map options = null + ) { + def Path samplesheetFile = Nextflow.file(samplesheet) as Path + return samplesheetToList(samplesheetFile, schema, options) } + @Function + public List samplesheetToList( + final Path samplesheet, + final Path schema, + final Map options = null + ) { + def SamplesheetConverter converter = new SamplesheetConverter(samplesheet, schema, session.params, options) + return converter.validateAndConvertToList() + 
     }

     //
     // Initialise expected params if not present
@@ -338,7 +308,7 @@ Also make sure that the same schema is used for validation and conversion of the
         }

         // Colors
-        def colors = logColours(useMonochromeLogs)
+        def colors = Utils.logColours(useMonochromeLogs)

         // Validate
         List validationErrors = validator.validate(paramsJSON, schema_string)
@@ -407,7 +377,7 @@ Also make sure that the same schema is used for validation and conversion of the
             params.monochromeLogs ? params.monochromeLogs as Boolean : false

-        def colors = logColours(useMonochromeLogs)
+        def colors = Utils.logColours(useMonochromeLogs)
         Integer num_hidden = 0
         String output = ''
         output += 'Typical pipeline command:\n\n'
@@ -587,7 +557,7 @@ Also make sure that the same schema is used for validation and conversion of the
             params.monochromeLogs ? params.monochromeLogs as Boolean : false

-        def colors = logColours(useMonochromeLogs)
+        def colors = Utils.logColours(useMonochromeLogs)
         String output = ''
         def LinkedHashMap params_map = paramsSummaryMap(workflow, parameters_schema: schemaFilename)
         def max_chars = paramsMaxChars(params_map)
@@ -725,72 +695,4 @@ Also make sure that the same schema is used for validation and conversion of the
         }
         return max_chars
     }
-
-    //
-    // ANSII Colours used for terminal logging
-    //
-    private static Map logColours(Boolean monochrome_logs) {
-        Map colorcodes = [:]
-
-        // Reset / Meta
-        colorcodes['reset'] = monochrome_logs ? '' : "\033[0m"
-        colorcodes['bold'] = monochrome_logs ? '' : "\033[1m"
-        colorcodes['dim'] = monochrome_logs ? '' : "\033[2m"
-        colorcodes['underlined'] = monochrome_logs ? '' : "\033[4m"
-        colorcodes['blink'] = monochrome_logs ? '' : "\033[5m"
-        colorcodes['reverse'] = monochrome_logs ? '' : "\033[7m"
-        colorcodes['hidden'] = monochrome_logs ? '' : "\033[8m"
-
-        // Regular Colors
-        colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
-        colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
-        colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
-        colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
-        colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
-        colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
-        colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
-        colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"
-
-        // Bold
-        colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
-        colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
-        colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
-        colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
-        colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
-        colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
-        colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
-        colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"
-
-        // Underline
-        colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m"
-        colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m"
-        colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m"
-        colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m"
-        colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m"
-        colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m"
-        colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m"
-        colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m"
-
-        // High Intensity
-        colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m"
-        colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m"
-        colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m"
-        colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m"
-        colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m"
-        colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m"
-        colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m"
-        colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m"
-
-        // Bold High Intensity
-        colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m"
-        colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m"
-        colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m"
-        colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m"
-        colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m"
-        colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m"
-        colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m"
-        colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m"
-
-        return colorcodes
-    }
 }
diff --git a/plugins/nf-schema/src/main/nextflow/validation/Utils.groovy b/plugins/nf-schema/src/main/nextflow/validation/Utils.groovy
index eea33003..79f2c4e6 100644
--- a/plugins/nf-schema/src/main/nextflow/validation/Utils.groovy
+++ b/plugins/nf-schema/src/main/nextflow/validation/Utils.groovy
@@ -201,4 +201,72 @@ public class Utils {
             return ""
         }
     }
+
+    //
+    // ANSI Colours used for terminal logging
+    //
+    public static Map logColours(Boolean monochrome_logs) {
+        Map colorcodes = [:]
+
+        // Reset / Meta
+        colorcodes['reset'] = monochrome_logs ? '' : "\033[0m"
+        colorcodes['bold'] = monochrome_logs ? '' : "\033[1m"
+        colorcodes['dim'] = monochrome_logs ? '' : "\033[2m"
+        colorcodes['underlined'] = monochrome_logs ? '' : "\033[4m"
+        colorcodes['blink'] = monochrome_logs ? '' : "\033[5m"
+        colorcodes['reverse'] = monochrome_logs ? '' : "\033[7m"
+        colorcodes['hidden'] = monochrome_logs ? '' : "\033[8m"
+
+        // Regular Colors
+        colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
+        colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
+        colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
+        colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
+        colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
+        colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
+        colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
+        colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"
+
+        // Bold
+        colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
+        colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
+        colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
+        colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
+        colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
+        colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
+        colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
+        colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"
+
+        // Underline
+        colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m"
+        colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m"
+        colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m"
+        colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m"
+        colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m"
+        colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m"
+        colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m"
+        colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m"
+
+        // High Intensity
+        colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m"
+        colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m"
+        colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m"
+        colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m"
+        colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m"
+        colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m"
+        colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m"
+        colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m"
+
+        // Bold High Intensity
+        colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m"
+        colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m"
+        colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m"
+        colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m"
+        colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m"
+        colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m"
+        colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m"
+        colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m"
+
+        return colorcodes
+    }
 }
\ No newline at end of file
diff --git a/plugins/nf-schema/src/test/nextflow/validation/SamplesheetConverterTest.groovy b/plugins/nf-schema/src/test/nextflow/validation/SamplesheetConverterTest.groovy
index 1c7fd250..2be61836 100644
--- a/plugins/nf-schema/src/test/nextflow/validation/SamplesheetConverterTest.groovy
+++ b/plugins/nf-schema/src/test/nextflow/validation/SamplesheetConverterTest.groovy
@@ -59,12 +59,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'should work fine - CSV' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct.csv'
+            params.input = "src/testResources/correct.csv"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view().first().map {println(it[0].getClass())}
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -86,12 +88,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'should work fine - quoted CSV' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct_quoted.csv'
+            params.input = "src/testResources/correct_quoted.csv"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view().first().map {println(it[0].getClass())}
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -113,12 +117,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'should work fine - TSV' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct.tsv'
+            params.input = "src/testResources/correct.tsv"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -140,12 +146,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'should work fine - YAML' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct.yaml'
+            params.input = "src/testResources/correct.yaml"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -167,12 +175,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'should work fine - JSON' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct.json'
+            params.input = "src/testResources/correct.json"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -194,12 +204,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'arrays should work fine - YAML' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct_arrays.yaml'
+            params.input = "src/testResources/correct_arrays.yaml"
+            params.schema = "src/testResources/schema_input_with_arrays.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter_arrays.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -220,12 +232,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'arrays should work fine - JSON' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/correct_arrays.json'
+            params.input = "src/testResources/correct_arrays.json"
+            params.schema = "src/testResources/schema_input_with_arrays.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter_arrays.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -246,12 +260,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'no header - CSV' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/no_header.csv'
+            params.input = "src/testResources/no_header.csv"
+            params.schema = "src/testResources/no_header_schema.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_header.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -270,12 +286,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'no header - YAML' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/no_header.yaml'
+            params.input = "src/testResources/no_header.yaml"
+            params.schema = "src/testResources/no_header_schema.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_header.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -294,12 +312,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'no header - JSON' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/no_header.json'
+            params.input = "src/testResources/no_header.json"
+            params.schema = "src/testResources/no_header_schema.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_header.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -318,12 +338,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'extra field' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/extraFields.csv'
+            params.input = "src/testResources/extraFields.csv"
+            params.schema = "src/testResources/schema_input.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_converter.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -338,7 +360,7 @@ class SamplesheetConverterTest extends Dsl2Spec{
         then:
         noExceptionThrown()
-        stdout.contains("Found the following unidentified headers in src/testResources/extraFields.csv:")
+        stdout.contains("Found the following unidentified headers in ${getRootString()}/src/testResources/extraFields.csv:" as String)
         stdout.contains("\t- extraField")
         stdout.contains("[[string1:fullField, string2:fullField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, [], unique1, 1, itDoesExist]" as String)
         stdout.contains("[[string1:value, string2:value, integer1:0, integer2:0, boolean1:true, boolean2:true], string1, 25, false, [], [], [], [], [], itDoesExist]")
@@ -349,12 +371,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'no meta' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/no_meta.csv'
+            params.input = "src/testResources/no_meta.csv"
+            params.schema = "src/testResources/no_meta_schema.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_meta.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -373,12 +397,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'deeply nested samplesheet - YAML' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/deeply_nested.yaml'
+            params.input = "src/testResources/deeply_nested.yaml"
+            params.schema = "src/testResources/samplesheet_schema_deeply_nested.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_deeply_nested_samplesheet.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -397,12 +423,14 @@ class SamplesheetConverterTest extends Dsl2Spec{
     def 'deeply nested samplesheet - JSON' () {
         given:
         def SCRIPT_TEXT = '''
-            include { fromSamplesheet } from 'plugin/nf-schema'
+            include { samplesheetToList } from 'plugin/nf-schema'

-            params.input = 'src/testResources/deeply_nested.json'
+            params.input = "src/testResources/deeply_nested.json"
+            params.schema = "src/testResources/samplesheet_schema_deeply_nested.json"

             workflow {
-                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_deeply_nested_samplesheet.json").view()
+                Channel.fromList(samplesheetToList(params.input, params.schema))
+                    .view()
             }
         '''
@@ -417,4 +445,96 @@ class SamplesheetConverterTest extends Dsl2Spec{
         noExceptionThrown()
         stdout.contains("[[mapMeta:this is in a map, arrayMeta:[metaString45, metaString478], otherArrayMeta:[metaString45, metaString478], meta:metaValue, metaMap:[entry1:entry1String, entry2:12.56]], [[string1, string2], string3, 1, 1, ${getRootString()}/file1.txt], [string4, string5, string6], [[string7, string8], [string9, string10]], test]" as String)
     }
+
+    def 'samplesheetToList - String, String' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { samplesheetToList } from 'plugin/nf-schema'
+
+            println(samplesheetToList("src/testResources/correct.csv", "src/testResources/schema_input.json").join("\\n"))
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+            .toString()
+            .readLines()
+            .findResults {it.startsWith('[[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[[string1:fullField, string2:fullField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25.12, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/test.txt, unique1, 1, itDoesExist]" as String)
+        stdout.contains("[[string1:value, string2:value, integer1:0, integer2:0, boolean1:true, boolean2:true], string1, 25.08, false, [], [], [], [], [], itDoesExist]")
+        stdout.contains("[[string1:dependentRequired, string2:dependentRequired, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, [], [], [], unique2, 1, itDoesExist]")
+        stdout.contains("[[string1:extraField, string2:extraField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/testDir, unique3, 1, itDoesExist]" as String)
+    }
+
+    def 'samplesheetToList - Path, String' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { samplesheetToList } from 'plugin/nf-schema'
+
+            println(samplesheetToList(file("src/testResources/correct.csv", checkIfExists:true), "src/testResources/schema_input.json").join("\\n"))
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+            .toString()
+            .readLines()
+            .findResults {it.startsWith('[[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[[string1:fullField, string2:fullField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25.12, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/test.txt, unique1, 1, itDoesExist]" as String)
+        stdout.contains("[[string1:value, string2:value, integer1:0, integer2:0, boolean1:true, boolean2:true], string1, 25.08, false, [], [], [], [], [], itDoesExist]")
+        stdout.contains("[[string1:dependentRequired, string2:dependentRequired, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, [], [], [], unique2, 1, itDoesExist]")
+        stdout.contains("[[string1:extraField, string2:extraField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/testDir, unique3, 1, itDoesExist]" as String)
+    }
+
+    def 'samplesheetToList - String, Path' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { samplesheetToList } from 'plugin/nf-schema'
+
+            println(samplesheetToList("src/testResources/correct.csv", file("src/testResources/schema_input.json", checkIfExists:true)).join("\\n"))
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+            .toString()
+            .readLines()
+            .findResults {it.startsWith('[[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[[string1:fullField, string2:fullField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25.12, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/test.txt, unique1, 1, itDoesExist]" as String)
+        stdout.contains("[[string1:value, string2:value, integer1:0, integer2:0, boolean1:true, boolean2:true], string1, 25.08, false, [], [], [], [], [], itDoesExist]")
+        stdout.contains("[[string1:dependentRequired, string2:dependentRequired, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, [], [], [], unique2, 1, itDoesExist]")
+        stdout.contains("[[string1:extraField, string2:extraField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/testDir, unique3, 1, itDoesExist]" as String)
+    }
+
+    def 'samplesheetToList - Path, Path' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { samplesheetToList } from 'plugin/nf-schema'
+
+            println(samplesheetToList(file("src/testResources/correct.csv", checkIfExists:true), file("src/testResources/schema_input.json", checkIfExists:true)).join("\\n"))
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+            .toString()
+            .readLines()
+            .findResults {it.startsWith('[[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[[string1:fullField, string2:fullField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25.12, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/test.txt, unique1, 1, itDoesExist]" as String)
+        stdout.contains("[[string1:value, string2:value, integer1:0, integer2:0, boolean1:true, boolean2:true], string1, 25.08, false, [], [], [], [], [], itDoesExist]")
+        stdout.contains("[[string1:dependentRequired, string2:dependentRequired, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, [], [], [], unique2, 1, itDoesExist]")
+        stdout.contains("[[string1:extraField, string2:extraField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${getRootString()}/src/testResources/test.txt, ${getRootString()}/src/testResources/testDir, ${getRootString()}/src/testResources/testDir, unique3, 1, itDoesExist]" as String)
+    }
 }
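The test diff above exercises the new `samplesheetToList` function with every combination of `String` and `Path` arguments. For pipeline authors migrating off the removed `fromSamplesheet` channel factory, a minimal usage sketch follows (the file paths are placeholders for illustration, not part of this diff):

```nextflow
include { samplesheetToList } from 'plugin/nf-schema'

// Placeholder inputs; substitute your own samplesheet and JSON schema
params.input  = "samplesheet.csv"
params.schema = "assets/schema_input.json"

workflow {
    // samplesheetToList validates the samplesheet against the schema and
    // returns a List; wrapping it in Channel.fromList restores the
    // one-item-per-row channel that fromSamplesheet used to emit.
    Channel
        .fromList(samplesheetToList(params.input, params.schema))
        .view()
}
```

Because `samplesheetToList` is a plain function rather than a channel factory, its output can also be inspected or transformed directly (e.g. with `println` or `.collect`) before any channel is created, as the added `samplesheetToList - String, String` test does.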