From b901cc618be8eb17284ccb0cf6ef9ee428d909c3 Mon Sep 17 00:00:00 2001
From: Robert Petit
Date: Sun, 25 Aug 2024 12:37:52 -0600
Subject: [PATCH] add params into the YAML update readme
---
 CHANGELOG  |  27 +----
 README.md  | 327 ++++++++++++++++++++++++++++++++++++++---------------
 bin/sccmec |   8 +-
 3 files changed, 240 insertions(+), 122 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index 6f2ffc2..16842e6 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,6 +1,6 @@
 # Changelog

-## v1.2.0 rpetit3/sccmec "" 2024/08/25
+## v1.2.0 rpetit3/sccmec "Newman" 2024/08/25

 - Utilize both targets and full cassettes for classification
 - update default thresholds based on `camlhmp-blast-thresholds`
@@ -17,28 +17,3 @@
 ## v1.0.0 rpetit3/sccmec "MRSA" 2024/04/30

 - Initial release
-
-@click.option(
-    "--min-targets-pident",
-    default=90,
-    show_default=True,
-    help="Minimum percent identity of targets to count a hit",
-)
-@click.option(
-    "--min-targets-coverage",
-    default=80,
-    show_default=True,
-    help="Minimum percent coverage of targets to count a hit",
-)
-@click.option(
-    "--min-regions-pident",
-    default=85,
-    show_default=True,
-    help="Minimum percent identity of regions to count a hit",
-)
-@click.option(
-    "--min-regions-coverage",
-    default=83,
-    show_default=True,
-    help="Minimum percent coverage of regions to count a hit",
-)
diff --git a/README.md b/README.md
index f019aa7..9e3cf8a 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,33 @@ The following SCCmec types are supported by `sccmec`.
 | XIV | [Urushibara et al. 2020](https://doi.org/10.1093/jac/dkz406) |
 | XV | [Wang et al. 2022](https://doi.org/10.1093/jac/dkab500) |

+The following SCCmec subtypes are supported by `sccmec`.
+
+| SubType | Citation |
+|------|----------|
+| Ia | [Ito et al. 2001](https://doi.org/10.1128/AAC.45.5.1323-1336.2001) |
+| Ib | [Han et al. 2009](https://doi.org/10.1128/AAC.00772-08), [Oliveira et al. 2006](https://doi.org/10.1093/jac/dkl208) |
+| IIa | [Katayama et al. 2000](https://doi.org/10.1128/aac.44.6.1549-1555.2000), [Ito et al. 2001](https://doi.org/10.1128/AAC.45.5.1323-1336.2001) |
+| IIb | [Hisata et al. 2005](https://doi.org/10.1128/JCM.43.7.3364-3372.2005) |
+| IIc | [Shore et al. 2005](https://doi.org/10.1128/AAC.49.5.2070-2083.2005) |
+| IId | [Kondo et al. 2007](https://doi.org/10.1128/AAC.00165-06) |
+| IIe | [Han et al. 2009](https://doi.org/10.1128/AAC.00772-08) |
+| IVa | [Ma et al. 2002](https://doi.org/10.1128/AAC.46.4.1147-1152.2002) |
+| IVb | [Ma et al. 2002](https://doi.org/10.1128/AAC.46.4.1147-1152.2002) |
+| IVc | [Ma et al. 2006](https://doi.org/10.1128/JCM.00985-06) |
+| IVd | [Ma et al. 2006](https://doi.org/10.1128/JCM.00985-06) |
+| IVg | [Kwon et al. 2005](https://doi.org/10.1093/jac/dki306) |
+| IVh | [Milheirico et al. 2007](https://doi.org/10.1093/jac/dkm112) |
+| IVi | [Berglund et al. 2009](https://doi.org/10.1093/jac/dkn435) |
+| IVj | [Berglund et al. 2009](https://doi.org/10.1093/jac/dkn435) |
+| IVk | - |
+| IVl | [Iwao et al. 2012](https://doi.org/10.1007/s10156-012-0379-6) |
+| IVm | [Hosoya et al. 2014](https://doi.org/10.11150/kansenshogakuzasshi.88.840) |
+| IVn | - |
+| Va | [Ito et al. 2004](https://doi.org/10.1128/AAC.48.7.2637-2651.2004) |
+| Vb | [Hisata et al. 2011](https://doi.org/10.1007/s10156-011-0223-4) |
+| Vc | [Li et al. 2011](https://doi.org/10.1128/AAC.01475-10) |
+
 ## Installation

 You can install `sccmec` using `conda`:
@@ -47,38 +74,49 @@ conda activate sccmec
 sccmec --help
 ```

-__Note:__ `sccmec` is just a wrapper around [camlhmp](https://github.com/rpetit3/camlhmp)
-with the defaults for `--yaml` and `--targets` already set. Please don't let this confuse you
-when you see all the camels!
+__Note:__ `sccmec` utilizes the API from [camlhmp](https://github.com/rpetit3/camlhmp)
+with the defaults for `--yaml-targets`, `--yaml-regions`, `--regions` and `--targets`
+already set. Please don't let this confuse you when you see all the camels!

 ## Usage

 ```bash
- Usage: camlhmp-blast [OPTIONS]
-
- 🐪 camlhmp-blast-targets 🐪 - Classify assemblies with a camlhmp schema using BLAST
-
-╭─ Options ────────────────────────────────────────────────────────────────────────────────╮
-│    --version       -V           Show the version and exit.                               │
-│ *  --input         -i  TEXT     Input file in FASTA format to classify [required]        │
-│ *  --yaml          -y  TEXT     YAML file documenting the targets and types              |
-|                                 [default: bin/../data/sccmec.yaml] [required]            │
-│ *  --targets       -t  TEXT     Query targets in FASTA format                            |
-|                                 [default: bin/../data/sccmec.fasta] [required]           │
-│    --outdir        -o  PATH     Directory to write output [default: ./]                  │
-│    --prefix        -p  TEXT     Prefix to use for output files [default: camlhmp]        │
-│    --min-pident        INTEGER  Minimum percent identity to count a hit [default: 95]   │
-│    --min-coverage      INTEGER  Minimum percent coverage to count a hit [default: 95]   │
-│    --force                      Overwrite existing reports                               │
-│    --verbose                    Increase the verbosity of output                         │
-│    --silent                     Only critical errors will be printed                     │
-│    --help                       Show this message and exit.                              │
-╰──────────────────────────────────────────────────────────────────────────────────────────╯
+ Usage: sccmec [OPTIONS]
+
+ sccmec - typing SCCmec cassettes in assemblies
+
+╭─ Required Options ───────────────────────────────────────────────────────────────────────╮
+│ *  --input         -i   TEXT  Input file in FASTA format to classify [required]          │
+│ *  --yaml-targets  -yt  TEXT  YAML file documenting the targets and types [required]     │
+│ *  --yaml-regions  -yr  TEXT  YAML file documenting the regions and types [required]     │
+│ *  --targets       -t   TEXT  Query targets in FASTA format [required]                   │
+│ *  --regions       -r   TEXT  Query regions in FASTA format [required]                   │
+╰──────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Filtering Options ──────────────────────────────────────────────────────────────────────╮
+│ --min-targets-pident    INTEGER  Minimum percent identity of targets to count a hit      │
+│                                  [default: 90]                                           │
+│ --min-targets-coverage  INTEGER  Minimum percent coverage of targets to count a hit      │
+│                                  [default: 80]                                           │
+│ --min-regions-pident    INTEGER  Minimum percent identity of regions to count a hit      │
+│                                  [default: 85]                                           │
+│ --min-regions-coverage  INTEGER  Minimum percent coverage of regions to count a hit      │
+│                                  [default: 83]                                           │
+╰──────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Additional Options ─────────────────────────────────────────────────────────────────────╮
+│ --prefix   -p  TEXT  Prefix to use for output files [default: sccmec]                    │
+│ --outdir   -o  PATH  Directory to write output [default: ./]                             │
+│ --force              Overwrite existing reports                                          │
+│ --verbose            Increase the verbosity of output                                    │
+│ --silent             Only critical errors will be printed                                │
+│ --version            Print schema and camlhmp version                                    │
+│ --help               Show this message and exit.                                         │
+╰──────────────────────────────────────────────────────────────────────────────────────────╯
 ```

-As mentioned above, `sccmec` is just a wrapper around `camlhmp-blast`. Except, please note that
-the `--yaml` and `--targets` options are already set to the SCCmec defaults. This means you only
-need to provide the `--input` option with your assembly file.
+As mentioned above, `sccmec` utilizes the `camlhmp` API. Note, however, that the
+`--yaml-targets`, `--yaml-regions`, `--regions` and `--targets` options are already set to
+the SCCmec defaults. This means you only need to provide the `--input` option with your
+assembly file.

 ### Example Usage

@@ -86,110 +124,215 @@ Here's an example of how to use `sccmec` using an assembly file (both uncompress
 compressed are supported):

 ```bash
-sccmec --input tests/fasta/type-Va-AB121219.fasta.gz --prefix type-v --force
-Running camlhmp with following parameters:
+sccmec --input tests/fasta/type-Va-AB121219.fasta.gz --prefix type-v
+
+Running sccmec (via camlhmp) with following parameters:
     --input tests/fasta/type-Va-AB121219.fasta.gz
-    --yaml bin/../data/sccmec.yaml
-    --targets bin/../data/sccmec.fasta
+    --yaml-targets /home/rpetit3/repos/sccmec/data/sccmec-targets.yaml
+    --yaml-regions /home/rpetit3/repos/sccmec/data/sccmec-regions.yaml
+    --targets /home/rpetit3/repos/sccmec/data/sccmec-targets.fasta
+    --regions /home/rpetit3/repos/sccmec/data/sccmec-regions.fasta
     --outdir ./
     --prefix type-v
-    --min-pident 95
-    --min-coverage 95
-
-Starting camlhmp for SCCmec Typing...
+    --min-targets-pident 90
+    --min-targets-coverage 80
+    --min-regions-pident 85
+    --min-regions-coverage 83
+Starting camlhmp for SCCmec Typing (targets)...
+Running blastn...
+Processing target hits...
+Starting camlhmp for SCCmec Typing (regions)...
 Running blastn...
-Processing hits...
+Processing region hits...
 Final Results...
-                                     SCCmec Typing
-┏━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
-┃ sample ┃ type ┃ targets                                ┃ schema ┃ version ┃ comment ┃
-┡━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
-│ type-v │ V    │ ccrC1,IS431,IS431_1,IS431_2,mecA,mecR1 │ sccmec │ 1.0.0   │         │
-└────────┴──────┴────────────────────────────────────────┴────────┴─────────┴─────────┘
-Writing outputs...
+                                           SCCmec Typing
+┏━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━┳━━━━━┳━━━━┳━━━━━┳━━━━┳━━━━━┳━━━━┳━━━━━┓
+┃ sa… ┃ ty… ┃ su… ┃ me… ┃ ta… ┃ re… ┃ co… ┃ hi… ┃ ta… ┃ t… ┃ re… ┃ r… ┃ ca… ┃ p… ┃ ta… ┃ r… ┃ co… ┃
+┡━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━╇━━━━━╇━━━━╇━━━━━╇━━━━╇━━━━━╇━━━━╇━━━━━┩
+│ ty… │ V   │ Va  │ +   │ cc… │ Va  │ 10… │ 12  │ sc… │ 1… │ sc… │ 1… │ 1.… │ m… │     │ C… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ b… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ on │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ 12 │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ h… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ w… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ o… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ or │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ m… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ o… │     │
+│     │     │     │     │     │     │     │     │     │    │     │    │     │    │     │ h… │     │
+└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴────┴─────┴────┴─────┴────┴─────┴────┴─────┘
 Final predicted type written to ./type-v.tsv
-Results against each type written to ./type-v.details.tsv
-blastn results written to ./type-v.blastn.tsv
+Target-based results against each type written to ./type-v.targets.details.tsv
+Target-based blastn results written to ./type-v.targets.blastn.tsv
+Region-based results against each type written to ./type-v.regions.details.tsv
+Region-based blastn results written to ./type-v.regions.blastn.tsv
 ```

-If needed you could adjust the `--min-pident` and `--min-coverage` options to be more or less
-depending on your needs.
+If needed, you could adjust the `--min-targets-pident`, `--min-targets-coverage`,
+`--min-regions-pident` and/or `--min-regions-coverage` options to be more or less strict,
+depending on your needs. Note that the defaults are set to the recommended values.

-Once the tool has completed, you will find three output files in the current directory which
+Once the tool has completed, you will find five output files in the current directory which are
 described below.

 ### Output Files

 `camlhmp-blast` will generate three output files:

-| File Name | Description |
-|------------------------|-------------------------------------------------|
-| `{PREFIX}.tsv` | A tab-delimited file with the predicted type |
-| `{PREFIX}.blast.tsv` | A tab-delimited file of all blast hits |
-| `{PREFIX}.details.tsv` | A tab-delimited file with details for each type |
+| File Name | Description |
+|--------------------------------|-------------------------------------------------------------------------|
+| `{PREFIX}.tsv` | A tab-delimited file with the predicted type |
+| `{PREFIX}.targets.blastn.tsv` | A tab-delimited file of all target-specific blast hits |
+| `{PREFIX}.targets.details.tsv` | A tab-delimited file with details for each type based on targets |
+| `{PREFIX}.regions.blastn.tsv` | A tab-delimited file of all full cassette blast hits |
+| `{PREFIX}.regions.details.tsv` | A tab-delimited file with details for each type based on full cassettes |

 #### Example {PREFIX}.tsv

 ```tsv
-sample type targets schema version comment
-saureus V ccrC1,IS431,IS431_1,IS431_2,mecA,mecR1 sccmec 1.0.0
+sample type subtype mecA targets regions coverage hits target_schema target_schema_version region_schema region_schema_version camlhmp_version params target_comment region_comment comment
+type-v V Va + ccrC1,IS431,IS431_1,IS431_2,mecA,mecR1 Va 100.00 12 sccmec_targets 1.2.0 sccmec_regions 1.2.0 1.0.1 min-targets-coverage=80;min-targets-pident=90;min-regions-coverage=83;min-regions-pident=85 Coverage based on 12 hits;There were one or more overlapping hits
 ```

-| Column | Description |
-|---------|--------------------------------------------------|
-| sample | The sample name as determined by `--prefix` |
-| type | The predicted type |
-| targets | The targets for the given type that had a hit |
-| schema | The schema used to determine the type |
-| version | The version of the schema used |
-| comment | A small comment about the result |
-
-#### Example {PREFIX}.blast.tsv
+| Column | Description |
+|-----------------------|------------------------------------------------------------------------------|
+| sample | The sample name as determined by `--prefix` |
+| type | The predicted type (based on targets and full cassettes) |
+| subtype | The predicted subtype (based on full cassettes) |
+| mecA | The mecA gene status (+=present or -=absent or not a significant hit) |
+| targets | The targets for the given type that had a hit |
+| regions | The regions for the given type that had a hit |
+| coverage | The coverage of the full cassette in the regions column |
+| hits | The number of hits that made up the full cassette coverage |
+| target_schema | The schema used to determine the type based on targets |
+| target_schema_version | The version of the schema used to determine the type based on targets |
+| region_schema | The schema used to determine the type based on full cassettes |
+| region_schema_version | The version of the schema used to determine the type based on full cassettes |
+| camlhmp_version | The version of camlhmp used to determine the type |
+| params | The parameters used to determine the type |
+| target_comment | A small comment about the target results |
+| region_comment | A small comment about the region results |
+| comment | A small comment about the final result |
+
+#### Example {PREFIX}.targets.blastn.tsv

 ```tsv
 qseqid sseqid pident qcovs qlen slen length nident mismatch gapopen qstart qend sstart send evalue bitscore
 ccrC1 AB121219.1 100.000 100 1623 28612 1623 1623 0 0 1 1623 16132 17754 0.0 2998
+ccrC1 AB121219.1 90.439 100 1677 28612 1684 1523 148 12 1 1677 16132 17809 0.0 2206
 IS431_1 AB121219.1 100.000 100 791 28612 791 791 0 0 1 791 8221 9011 0.0 1461
+IS431_1 AB121219.1 98.085 100 791 28612 731 717 14 0 1 731 3423 2693 0.0 1273
 IS431_1 AB121219.1 99.704 100 675 28612 675 673 2 0 1 675 2693 3367 0.0 1236
-IS431_1 AB121219.1 98.519 100 675 28612 675 665 10 0 1 675 8951 8277 0.0 1192
 ...
 ```

 This is the standard BLAST output with `-outfmt 6`

-#### Example {PREFIX}.details.tsv
+#### Example {PREFIX}.targets.details.tsv
+
+```tsv
+sample type status targets missing schema schema_version camlhmp_version params comment
+type-v I False IS431,mecA,mecR1 ccrA1,ccrB1,IS1272 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v II False IS431,mecA,mecR1 ccrA2,ccrB2,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v III False IS431,mecA,mecR1 ccrA3,ccrB3,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v IV False IS431,mecA,mecR1 ccrA2,ccrB2,IS1272 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v V True ccrC1,IS431_1,mecA,mecR1,IS431_2 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v VI False IS431,mecA,mecR1 ccrA4,ccrB4,IS1272 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v VII False ccrC1,IS431_1,mecA,mecR1,IS431_2 IS12960D sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v VIII False IS431,mecA,mecR1 ccrA4,ccrB4,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80 Excluded target ccrC1 found, failing type VIII
+type-v IX False IS431_1,mecA,mecR1,IS431_2 ccrA1,ccrB1 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v X False IS431_1,mecA,mecR1,IS431_2 ccrA1,ccrB6 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v XI False mecA,mecR1 ccrA1,ccrB3,blaZ,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v XII False IS431_1,mecA,mecR1,IS431_2 ccrC2 sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v XIII False IS431,mecA,mecR1 ccrC2,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v XIV False ccrC1,IS431,mecA,mecR1 mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+type-v XV False IS431,mecA,mecR1 ccrA1,ccrB6,mecI sccmec_targets 1.2.0 1.0.1 min-coverage=90;min-pident=80
+
+```
+
+This file provides a detailed view of the results. The columns are:
+
+| Column | Description |
+|-----------------|------------------------------------------------------|
+| sample | The sample name as determined by `--prefix` |
+| type | The type being tested |
+| status | The status of the type (True if the type passed) |
+| targets | The targets for the given type that had a match |
+| missing | The targets for the given type that were not found |
+| schema | The schema used to determine the type |
+| schema_version | The version of the schema used to determine the type |
+| camlhmp_version | The version of camlhmp used to determine the type |
+| params | The parameters used to determine the type |
+| comment | A small comment about the result |
+
+#### Example {PREFIX}.regions.blastn.tsv
+
+```tsv
+qseqid sseqid pident qcovs qlen slen length nident mismatch gapopen qstart qend sstart send evalue bitscore
+III AB121219.1 99.371 25 68256 28612 4132 4106 26 0 24230 28361 8220 4089 0.0 7487
+III AB121219.1 86.738 25 68256 28612 5067 4395 628 42 59204 64248 17954 12910 0.0 5594
+III AB121219.1 94.259 25 68256 28612 3240 3054 172 11 44582 47815 22419 19188 0.0 4940
+III AB121219.1 98.421 25 68256 28612 1837 1808 25 4 27952 29787 4458 2625 0.0 3229
+III AB121219.1 99.494 25 68256 28612 791 787 3 1 34225 35015 3423 2634 0.0 1437
+...
+```
+
+This is the standard BLAST output with `-outfmt 6`
+
+#### Example {PREFIX}.regions.details.tsv

 ```tsv
-sample type status targets missing schema version comment
-type-v I False IS431,mecA,mecR1 ccrA1,ccrB1,IS1272 sccmec 1.0.0
-type-v II False IS431,mecA,mecR1 ccrA2,ccrB2,mecI sccmec 1.0.0
-type-v III False IS431,mecA,mecR1 ccrA3,ccrB3,mecI sccmec 1.0.0
-type-v IV False IS431,mecA,mecR1 ccrA2,ccrB2,IS1272 sccmec 1.0.0
-type-v V True ccrC1,IS431_1,mecA,mecR1,IS431_2 sccmec 1.0.0
-type-v VI False IS431,mecA,mecR1 ccrA4,ccrB4,IS1272 sccmec 1.0.0
-type-v VII False ccrC1,IS431_1,mecA,mecR1,IS431_2 IS12960D sccmec 1.0.0
-type-v VIII False IS431,mecA,mecR1 ccrA4,ccrB4,mecI sccmec 1.0.0 Excluded target ccrC1 found, failing type VIII
-type-v IX False IS431_1,mecA,mecR1,IS431_2 ccrA1,ccrB1 sccmec 1.0.0
-type-v X False IS431_1,mecA,mecR1,IS431_2 ccrA1,ccrB6 sccmec 1.0.0
-type-v XI False mecA,mecR1 ccrA1,ccrB3,blaZ,mecI sccmec 1.0.0
-type-v XII False IS431_1,mecA,mecR1,IS431_2 ccrC2 sccmec 1.0.0
-type-v XIII False IS431,mecA,mecR1 ccrC2,mecI sccmec 1.0.0
-type-v XIV False ccrC1,IS431,mecA,mecR1 mecI sccmec 1.0.0
-type-v XV False IS431,mecA,mecR1 ccrA1,ccrB6,mecI sccmec 1.0.0
+sample type status targets missing coverage hits schema schema_version camlhmp_version params comment
+type-v Ia False Ia 17.67 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v Ib False Ib 16.61 2 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 2 hits
+type-v IIa False IIa 11.85 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
+type-v IIb False IIb 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
+type-v IIc False IIc 17.39 4 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 4 hits;There were one or more overlapping hits
+type-v IId False IId 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
+type-v IIe False IIe 1.54 1 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
+type-v III False III 24.50 18 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 18 hits;There were one or more overlapping hits
+type-v IVa False IVa 29.35 13 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 13 hits;There were one or more overlapping hits
+type-v IVb False IVb 33.19 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v IVc False IVc 23.56 14 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 14 hits;There were one or more overlapping hits
+type-v IVd False IVd 7.78 1 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
+type-v IVg False IVg 30.66 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v IVi False IVi 30.85 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v IVj False IVj 30.58 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v IVk False IVk 16.00 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v IVl False IVl 19.79 13 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 13 hits;There were one or more overlapping hits
+type-v IVm False IVm 25.73 14 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 14 hits;There were one or more overlapping hits
+type-v IVn False IVn 28.15 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v Va True Va 100.00 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v Vb False Vb 64.55 17 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 17 hits;There were one or more overlapping hits
+type-v Vc False Vc 50.14 17 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 17 hits;There were one or more overlapping hits
+type-v VI False VI 29.79 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v VII False VII 45.86 15 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 15 hits;There were one or more overlapping hits
+type-v VIII False VIII 16.95 9 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 9 hits;There were one or more overlapping hits
+type-v IX False IX 15.33 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
+type-v X False X 13.68 16 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 16 hits;There were one or more overlapping hits
+type-v XI False XI 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
+type-v XII False XII 19.37 15 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 15 hits;There were one or more overlapping hits
+type-v XIII False XIII 28.39 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
+type-v XIV False XIV 14.50 16 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 16 hits;There were one or more overlapping hits
+type-v XV False XV 17.21 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
 ```

 This file provides a detailed view of the results. The columns are:

-| Column | Description |
-|---------|----------------------------------------------------|
-| sample | The sample name as determined by `--prefix` |
-| type | The predicted type |
-| status | The status of the type (True if failed) |
-| targets | The targets for the given type that had a match |
-| missing | The targets for the given type that were not found |
-| schema | The schema used to determine the type |
-| version | The version of the schema used |
-| comment | A small comment about the result |
+| Column | Description |
+|-----------------|------------------------------------------------------------|
+| sample | The sample name as determined by `--prefix` |
+| type | The type being tested |
+| status | The status of the type (True if the type passed) |
+| targets | The targets for the given type that had a match |
+| missing | The targets for the given type that were not found |
+| coverage | The coverage of the full cassette |
+| hits | The number of hits that made up the full cassette coverage |
+| schema | The schema used to determine the type |
+| schema_version | The version of the schema used to determine the type |
+| camlhmp_version | The version of camlhmp used to determine the type |
+| params | The parameters used to determine the type |
+| comment | A small comment about the result |

 ## Citations

diff --git a/bin/sccmec b/bin/sccmec
index 6cdcb30..a281ef6 100755
--- a/bin/sccmec
+++ b/bin/sccmec
@@ -222,17 +222,17 @@ def sccmec(
     # Check if params are set in the YAML (only change if not set on the command line)
     if "--min-targets-pident" not in sys.argv:
         if "min_pident" in targets_framework["engine"]["params"]:
-            min_pident = targets_framework["engine"]["params"]["min_pident"]
+            min_targets_pident = targets_framework["engine"]["params"]["min_pident"]
     if "--min-targets-coverage" not in sys.argv:
         if "min_coverage" in targets_framework["engine"]["params"]:
-            min_coverage = targets_framework["engine"]["params"]["min_coverage"]
+            min_targets_coverage = targets_framework["engine"]["params"]["min_coverage"]
     if "--min-regions-pident" not in sys.argv:
         if "min_pident" in regions_framework["engine"]["params"]:
-            min_pident = regions_framework["engine"]["params"]["min_pident"]
+            min_regions_pident = regions_framework["engine"]["params"]["min_pident"]
     if "--min-regions-coverage" not in sys.argv:
         if "min_coverage" in regions_framework["engine"]["params"]:
-            min_coverage = regions_framework["engine"]["params"]["min_coverage"]
+            min_regions_coverage = regions_framework["engine"]["params"]["min_coverage"]

     # Describe the command line arguments
     console = rich.console.Console(stderr=True)
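
The precedence that the `bin/sccmec` hunk implements (a value given on the command line wins, otherwise the value from the schema YAML, otherwise the built-in default) can be sketched in isolation as below. This is a minimal illustration, not the code in `bin/sccmec`: `flag_given` and `resolve_threshold` are hypothetical helper names, and the dictionaries only mimic the shape of `framework["engine"]["params"]` seen in the patch. The sketch also assumes the option may be written as `--flag=value`, a form Click accepts.

```python
import sys


def flag_given(flag, argv):
    """Return True if `flag` was passed as `--flag value` or `--flag=value`."""
    return any(arg == flag or arg.startswith(flag + "=") for arg in argv)


def resolve_threshold(flag, cli_value, yaml_params, yaml_key, argv=None):
    """Pick the threshold to use: explicit CLI flag, then YAML value, then CLI default.

    `cli_value` is whatever the option parser produced (the user's value if the
    flag was given, otherwise the built-in default).
    """
    argv = sys.argv[1:] if argv is None else argv
    if not flag_given(flag, argv) and yaml_key in yaml_params:
        return yaml_params[yaml_key]
    return cli_value


# Hypothetical schema parameters, shaped like framework["engine"]["params"]
targets_params = {"min_pident": 92, "min_coverage": 80}
regions_params = {"min_pident": 85}

# Simulated command line: only the targets identity threshold is set explicitly
argv = ["--input", "assembly.fasta", "--min-targets-pident=95"]

min_targets_pident = resolve_threshold(
    "--min-targets-pident", 95, targets_params, "min_pident", argv
)
min_targets_coverage = resolve_threshold(
    "--min-targets-coverage", 80, targets_params, "min_coverage", argv
)
min_regions_coverage = resolve_threshold(
    "--min-regions-coverage", 83, regions_params, "min_coverage", argv
)
print(min_targets_pident, min_targets_coverage, min_regions_coverage)
```

Each threshold gets its own variable, which is the point of the fix in the hunk above: the earlier code wrote every YAML value into `min_pident` or `min_coverage`, so the four per-target and per-region settings were never updated. Note also that a plain `"--min-targets-pident" not in sys.argv` test would not see the `--min-targets-pident=95` spelling, which is why the sketch matches on the prefix as well.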