Execution failed when --kmerfinder_db and --ncbi_assembly_metadata are provided #188

Open
IBEXCluster opened this issue Dec 5, 2024 · 5 comments

@IBEXCluster

Dear developers,
We are trying to use the additional parameters --kmerfinder_db and --ncbi_assembly_metadata with the bacass workflow, using either release 2.4.0 or the development version. Both runs fail with the following error:

WARN: The following invalid input values have been detected:
 
* --kmerfinder_db: DATABASES/bacteria.tar.gz
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Kmerfinder database and NCBI assembly metadata not provided.

  Please specify the '--kmerfinderdb' and '--ncbi_assembly_metadata' parameters.

  Both are required to run Kmerfinder.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Any advice?

Here is the run script:

rm -rf ~/.nextflow/assets/nf-core/bacass ;

nextflow run nf-core/bacass -r dev -c nextflow.config -profile singularity --input samplesheet.tsv --kraken2db /ibex/ai/reference/KSL/kraken2/kraken2_dbs/scripts_download/k2_nt_20230502.tar.gz --kmerfinder_db DATABASES/bacteria.tar.gz --ncbi_assembly_metadata ASSEMBLY-REPORTS/assembly_summary_refseq.txt --outdir outputs/2024-12-04_15-KSA-samples__local-downloads__dev

Here is the complete log file:

Nextflow 24.10.2 is available - Please consider updating your version to it

 

N E X T F L O W   ~  version 24.04.4

 

Pulling nf-core/bacass ...

downloaded from https://github.com/nf-core/bacass.git

Launching `https://github.com/nf-core/bacass` [thirsty_bhabha] DSL2 - revision: ad892edcdb [dev]

 

 

------------------------------------------------------

                                        ,--./,-.

        ___     __   __   __   ___     /,-._.--~'

  |\ | |__  __ /  ` /  \ |__) |__         }  {

  | \| |       \__, \__/ |  \ |___     \`-._,-`-,

                                        `._,._,'

  nf-core/bacass 2.5.0dev

------------------------------------------------------

Input/output options

  input                 : samplesheet.tsv

  outdir                : outputs/2024-12-04_15-KSA-samples__local-downloads__dev

 

Contamination Screening

  kraken2db             : /ibex/ai/reference/KSL/kraken2/kraken2_dbs/scripts_download/k2_nt_20230502.tar.gz

  ncbi_assembly_metadata: ASSEMBLY-REPORTS/assembly_summary_refseq.txt

 

Assembly parameters

  canu_mode             : -nanopore

 

Annotation

  dfast_config          : /home/pampum/.nextflow/assets/nf-core/bacass/assets/test_config_dfast.py

 

Core Nextflow options

  revision              : dev

  runName               : thirsty_bhabha

  containerEngine       : singularity

  launchDir             : /ibex/user/pampum/2024-11-26_KSA-lib-ONT-assemblies

  workDir               : /ibex/user/pampum/2024-11-26_KSA-lib-ONT-assemblies/work

  projectDir            : /home/pampum/.nextflow/assets/nf-core/bacass

  userName              : pampum

  profile               : singularity

  configFiles           :

 

!! Only displaying parameters that differ from the pipeline defaults !!

------------------------------------------------------

* The pipeline

  https://doi.org/10.5281/zenodo.2669428

 

* The nf-core framework

    https://doi.org/10.1038/s41587-020-0439-x

 

* Software dependencies

    https://github.com/nf-core/bacass/blob/master/CITATIONS.md

 

WARN: The following invalid input values have been detected:

 

* --kmerfinder_db: DATABASES/bacteria.tar.gz

 

 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Kmerfinder database and NCBI assembly metadata not provided.

  Please specify the '--kmerfinderdb' and '--ncbi_assembly_metadata' parameters.

  Both are required to run Kmerfinder.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems we are not able to use custom parameters.
Can you please help us fix this error?
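
A side note on the warning above: the run passes --kmerfinder_db, while the pipeline's own message asks for '--kmerfinderdb' (no underscore), which may be why the value is flagged as invalid. Below is a minimal Python sketch of a pre-launch spelling check. It assumes a hand-written list of parameter names copied from the log and error text in this post; the list and the function name suggest_param are illustrative and are not part of bacass or nf-core.

```python
from difflib import get_close_matches

# Assumption: a small subset of parameter names, taken only from the log
# and the error message quoted in this issue, not from the pipeline schema.
KNOWN_PARAMS = [
    "input",
    "outdir",
    "kraken2db",
    "kmerfinderdb",
    "ncbi_assembly_metadata",
]


def suggest_param(name, known=KNOWN_PARAMS):
    """Return the closest known name for a misspelled flag.

    Returns None if the flag is already valid or nothing is similar enough.
    """
    name = name.lstrip("-")
    if name in known:
        return None
    matches = get_close_matches(name, known, n=1, cutoff=0.8)
    return matches[0] if matches else None


if __name__ == "__main__":
    for flag in ("--kmerfinder_db", "--ncbi_assembly_metadata"):
        hint = suggest_param(flag)
        if hint:
            print(f"{flag}: did you mean --{hint}?")
```

A check like this only catches near-miss spellings; the pipeline's parameter documentation remains the authority on which names exist.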

@m-jahn

m-jahn commented Dec 10, 2024

@IBEXCluster
Author

Dear @m-jahn,
Thanks for your recommendations.
I downloaded the KmerFinder database from https://zenodo.org/records/13447056.
However, the kmerfinder job step failed to locate bacteria.tax, which is not part of the distribution.

The failing step:

Command executed:

  kmerfinder.py \
      --infile SRR10093029_1.fastp.fastq.gz SRR10093029_2.fastp.fastq.gz \
      --output_folder . \
      --db_path 20190108_kmerfinder_stable_dirs/bacteria.ATG \
      -tax 20190108_kmerfinder_stable_dirs/bacteria.tax \
      -x
  
  mv results.txt SRR10093029_results.txt
  mv data.json SRR10093029_data.json
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_BACASS:BACASS:KMERFINDER_SUBWORKFLOW:KMERFINDER":
      kmerfinder: $(echo "3.0.2")
  END_VERSIONS

Command exit status:
  1

Command output:
  # Time used to run KMA for species identifation: 0.016 s
  Cant open file: [Errno 2] No such file or directory: '20190108_kmerfinder_stable_dirs/bacteria.tax'

For reference, this KmerFinder database (2019/01/08, 17 GB, stable dirs) does not appear to include bacteria.tax:

Version: 20190108_stable_dirs
Website: ftp://ftp.cbs.dtu.dk/public/CGE/databases/KmerFinder/version/

Content 20190108_stable_dirs.tar.gz:

bacteria
├── bacteria.ATG.comp.b
├── bacteria.ATG.length.bp
├── bacteria.ATG.name
├── bacteria.ATG.seq.b
└── bacteria.name

Update: Same KmerFinder version, but the previous database was corrupted and resulted in untar errors. This version should fix that.

Any further suggestions?
Thanks in advance.

@m-jahn

m-jahn commented Dec 10, 2024

In your work directory for this module, rename bacteria.name to bacteria.tax. Then it will work.
This is due to a change in the KmerFinder database structure. Strangely enough, the bacass pipeline should work with both, since this module looks for both the .name and the .tax ending, but it doesn't.
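
The rename workaround can be sketched in Python as follows. This is a minimal sketch, not part of bacass: the function name ensure_tax_file is hypothetical, and it copies the file instead of renaming it, so the original database layout stays intact. The directory in the usage comment is the one from the failing command in this thread.

```python
import shutil
from pathlib import Path


def ensure_tax_file(db_dir, tax_group="bacteria"):
    """Make sure <tax_group>.tax exists in db_dir.

    If only <tax_group>.name is present, it is copied to <tax_group>.tax.
    Returns the path to the .tax file, or None if neither file exists.
    """
    db_dir = Path(db_dir)
    tax = db_dir / f"{tax_group}.tax"
    name = db_dir / f"{tax_group}.name"
    if tax.exists():
        return tax
    if name.exists():
        # Copy rather than rename, so tools expecting .name keep working.
        shutil.copy2(name, tax)
        return tax
    return None


# Example (path taken from the failing command above):
# ensure_tax_file("20190108_kmerfinder_stable_dirs")
```

Whether a copied .name file is a fully valid taxonomy file for KmerFinder is not established in this thread; the copy only satisfies the file lookup.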

@m-jahn

m-jahn commented Dec 10, 2024

More specifically, this line in modules/local/kmerfinder/main.nf looks for both file endings:

def db_tax = file("${kmerfinderdb_path}/${tax_group}.name").exists() ? "${kmerfinderdb_path}/${tax_group}.name" : "${kmerfinderdb_path}/${tax_group}.tax"

I cannot explain why it doesn't accept the .name file.
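
For readers unfamiliar with Groovy, the quoted line can be rendered in Python roughly as below. This is a translation for illustration only, not the module's code. It shows one property of the expression: only the .name file is tested, and .tax is returned as an unconditional fallback, even when it does not exist. One possible reading, which is an assumption and not confirmed in this thread, is that the existence test ran against a location where the relative path did not resolve, so the expression fell through to .tax as seen in the failing command.

```python
from pathlib import Path


def resolve_db_tax(kmerfinderdb_path, tax_group):
    """Python rendering of the Groovy ternary quoted above (illustrative)."""
    name = Path(kmerfinderdb_path) / f"{tax_group}.name"
    tax = Path(kmerfinderdb_path) / f"{tax_group}.tax"
    # Only .name is checked; .tax is the fallback whether or not it exists.
    return str(name) if name.exists() else str(tax)


if __name__ == "__main__":
    # With a relative path, the result depends on the directory
    # in which the existence check is evaluated.
    print(resolve_db_tax("20190108_kmerfinder_stable_dirs", "bacteria"))
```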

@IBEXCluster
Author

IBEXCluster commented Dec 11, 2024

Many thanks, @m-jahn, for your recommendations.
I noticed that the taxonomy file is not available at ftp://ftp.cbs.dtu.dk/public/CGE/databases/KmerFinder/version/.
However, I installed the required database from KmerFinder with: bash ~/kmerfinder/src/kmerfinder_db/INSTALL.sh $PWD bacteria latest
This created the bacteria taxonomy file bacteria.tax. Thanks again for your suggestions!
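
Before relaunching the pipeline, it may help to confirm that the installed database directory contains what the failing step looked for. The following is a minimal Python sketch. The file suffixes are assumptions taken from the file listing and error message earlier in this thread, and check_kmerfinder_db is a hypothetical helper, not a KmerFinder or bacass function.

```python
from pathlib import Path

# Assumption: index file suffixes as they appear in the listing in this thread.
INDEX_SUFFIXES = (".ATG.comp.b", ".ATG.seq.b", ".ATG.name")


def check_kmerfinder_db(db_dir, tax_group="bacteria"):
    """Return a list of problems found; an empty list means nothing was flagged."""
    db_dir = Path(db_dir)
    problems = []
    for suffix in INDEX_SUFFIXES:
        candidate = db_dir / f"{tax_group}{suffix}"
        if not candidate.is_file():
            problems.append(f"missing index file: {candidate.name}")
    tax = db_dir / f"{tax_group}.tax"
    if not tax.is_file():
        if (db_dir / f"{tax_group}.name").is_file():
            problems.append(f"{tax.name} missing, but {tax_group}.name is present")
        else:
            problems.append(f"no taxonomy file found for {tax_group}")
    return problems


if __name__ == "__main__":
    for problem in check_kmerfinder_db("."):
        print(problem)
```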
