error during the dada2 step #68

Open
ankurnaqib opened this issue Jan 3, 2025 · 6 comments
@ankurnaqib

Hi, I am getting this error while running the dada2 step.

Jan-01 19:59:47.133 [TaskFinalizer-8] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
task: name=pb16S:dada2_denoise (1); work-dir=/home/anaqib/Projects/GMCF_2378_Sona/work/db/fcd5418a72e8691f04adb9937191b1
error [nextflow.exception.ProcessFailedException]: Process pb16S:dada2_denoise (1) terminated with an error exit status (1)
Jan-01 19:59:47.174 [TaskFinalizer-8] ERROR nextflow.processor.TaskProcessor - Error executing process > 'pb16S:dada2_denoise (1)'

Caused by:
Process pb16S:dada2_denoise (1) terminated with an error exit status (1)

Command executed:

# Use custom script that can skip primer trimming

mkdir -p dada2_custom_script
cp run_dada_2023.2_SingletonTRUE.R dada2_custom_script/run_dada.R

sed 's/minQ\ =\ 0/minQ=0/g' dada2_custom_script/run_dada_ccs_original.R > # dada2_custom_script/run_dada_ccs_SingletonTRUE.R

chmod +x dada2_custom_script/run_dada.R
export PATH="./dada2_custom_script:$PATH"
which run_dada.R
qiime dada2 denoise-ccs --i-demultiplexed-seqs samples.qza --o-table dada2-ccs_table.qza --o-representative-sequences dada2-ccs_rep.qza --o-denoising-stats dada2-ccs_stats.qza --p-min-len 1000 --p-max-len 1600 --p-max-ee 2 --p-front 'none' --p-adapter 'none' --p-n-threads 8 --p-pooling-method 'pseudo'

Command exit status:
1

Command output:
dada2_custom_script/run_dada.R

Command error:
dada2_custom_script/run_dada.R
Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Debug info has been saved to /tmp/qiime2-q2cli-err-i71rx1hc.log

Work dir:
/home/anaqib/Projects/GMCF_2378_Sona/work/db/fcd5418a72e8691f04adb9937191b1

I am attaching the nextflow log file as well. When I look into the particular process and open the dada2.log file, I see the following error in the 4th step -
4) Denoise samples
.......................................Error in derepFastq(filts[[j]]) : Not all provided files exist.

I am attaching the dada2.log file as well. Is it a memory issue? Please help. Thank you.
dada2.log
.nextflow 2.log

@proteinosome
Collaborator

@ankurnaqib did you by any chance modify the DADA2 script? I can see run_dada_2023.2_SingletonTRUE.R in the log, which is not the name of the script provided by default. If you did, can you let me know what you changed?

That step happens at the point where it's supposed to denoise each FASTQ, but I'm not sure why it's complaining that a file is missing. I don't think it's a memory issue (exit status 1 is not usually associated with OOM).
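As a rough illustration of the exit-status point, the sketch below contrasts an ordinary failure with a process killed by SIGKILL, the signal the Linux OOM killer sends. The variable names are made up for the example, and nothing here comes from the pipeline itself.

```shell
#!/usr/bin/env bash
# Illustration only: compare the status of an ordinary failure with a SIGKILL.

# An ordinary failure: the command itself returns 1.
ordinary_status=0
sh -c 'exit 1' || ordinary_status=$?

# A process killed with SIGKILL: the parent shell reports 128 + 9.
killed_status=0
sh -c 'kill -KILL $$' || killed_status=$?

echo "ordinary=$ordinary_status killed=$killed_status"
```

An exit status of 1, as in the log above, therefore points at an error raised by the program rather than at the process being killed.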

@ankurnaqib
Author

Thank you so much for responding. Yes, we did modify the original run_dada_2023.2.R and made it into run_dada_2023.2_SingletonTRUE.R (attached in a zip file). We wanted to include singleton detection in our analysis. I was able to do that with my standalone R script, so I modified the pipeline script to allow detection of singletons. Since modifying it, I have run projects with it in the past and they completed successfully. I don't know why I am getting this error this time around. Yes, allowing singletons and modifying the R script has made the process slower (I would love to hear your comments on that too), but we do see more data after denoising than we saw before.

Additionally, I also modified the run_dada_ccs.R script the same way into run_dada_ccs_SingletonTRUE.R, but I don't think that one is relevant here.

Thank you so much for your help.
run_dada_2023.2_SingletonTRUE.R.zip

@proteinosome
Collaborator

Hi @ankurnaqib, there is a similar issue on DADA2 here. It looks like some of your FASTQ files may be empty after the filtering step in the DADA2 pipeline. You can try running the DADA2 workflow yourself (following the run_dada.R script), or we can try adding a line to the DADA2 step in the Nextflow pipeline so that it outputs the intermediate files, letting you check which sample is causing the issue.
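Once the intermediate files are available, a check along these lines could narrow down the offending sample. This is only a sketch: `report_unusable_fastq` is a hypothetical helper, not part of the pipeline or of DADA2, and it assumes the filtered reads are gzipped FASTQ files whose expected paths you already know.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): given the filtered FASTQ paths
# that the denoise step expects, print the ones that are missing or hold no reads.
report_unusable_fastq() {
    local f lines
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            printf 'missing\t%s\n' "$f"
            continue
        fi
        # Count decompressed lines; a file with zero lines holds zero reads.
        lines=$(gzip -cd "$f" 2>/dev/null | wc -l | tr -d ' ')
        if [ "$lines" -eq 0 ]; then
            printf 'empty\t%s\n' "$f"
        fi
    done
}
```

Pass the paths explicitly (for example, one per sample in the manifest) rather than a glob, because a glob only expands to files that exist and would hide the missing ones.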

Add export TMPDIR=\$(pwd) to main.nf line 525. Note that I have not tested this and QIIME2 sometimes can have weird issues with tmpdir, but I think it should work.
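For context, here is a minimal sketch of what that line does in a plain shell: programs that follow the usual `TMPDIR` convention create their scratch files under the given directory instead of `/tmp`. The variable `scratch_file` is made up for the example, and whether every QIIME 2 step honours `TMPDIR` is not verified here. The backslash in `\$(pwd)` is presumably there so that Nextflow leaves the `$` for bash to evaluate; in a plain shell it is written without it.

```shell
#!/usr/bin/env bash
# Illustration only: redirect temporary files to the current directory.
export TMPDIR="$(pwd)"

# mktemp with no template creates its file under $TMPDIR.
scratch_file=$(mktemp)
echo "temporary file created at: $scratch_file"

# Clean up the example file.
rm -f "$scratch_file"
```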

@ankurnaqib
Author

ankurnaqib commented Jan 10, 2025

Hi. Yes, I saw that forum too. I wasn't sure which sample was throwing that problem. In the past, I saw that the negative control was throwing that problem so I removed it.

Can you please confirm where I should add the line export TMPDIR=$(pwd)?
Should I add it above line 525 or should line 525 now read -
export TMPDIR=$(pwd) which run_dada.R

Another issue is this: if the dataset has 20 samples and the error happens at the 19th sample, and I remove that sample from the manifest and the metadata, can I then use the -resume option? If the process has to start from the beginning, is there an easier solution? In the usual DADA2 (not ccs), it just ignores the sample and puts an "x" instead of a "." in the dada2 log file. Thanks a lot for your help and guidance.

@proteinosome
Collaborator

@ankurnaqib Add it above (or after, doesn't matter).

Re: resuming, I think it should resume all the steps prior to DADA2 just fine, but I can't be 100% sure as I have not tried it before. Nonetheless can you let me know if you manage to figure out what's going on with the samples that failed? If it's happening frequently it'll probably be worth it for me to add a step inside the pipeline to check for that.

@ankurnaqib
Author

Thank you. I will try that. I am starting the pipeline again with the export TMPDIR=$(pwd) added to the nextflow script. I will let you know as soon as I get an error. Thank you for your help again.

@proteinosome proteinosome self-assigned this Jan 15, 2025
@proteinosome proteinosome added the bug Something isn't working label Jan 15, 2025