Run CRISPR-Correct with R1 fastq only #3
Comments
Hi Thomas, I just updated the repository. Please see the updated README and install the latest version of CRISPR-Correct:
Here is a draft of what the function should look like with R1 only:
The updated version isn't as robustly tested, so let me know if you still have issues.
Dear Basheer, thank you for the quick reply. I installed version 0.0.177 but it still requires a second fastq file:
I then changed line 99 in main_mapping.py, and now I get this:
I have the feeling that the guides df might not be in the correct format. What should it look like? Currently it is:
Best regards,
Hi Thomas, I have updated the package: 1) added the "= None" default to the function so that it accepts a missing fastq_r2_fn (good catch), and 2) added a pre-processing step that strips all trailing whitespace from the sequences in the guide library dataframe. Regarding point 2, I think the error came from trailing whitespace in one or more of the protospacer sequences in your library, which caused a problem when my tool tried to encode the spaces. You don't have to change anything in your library, since the tool can now handle trailing whitespace; you just need to rerun the function with the updated version. Install the updated version:
Best,
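For anyone who wants to check their own guide library for the same problem before rerunning, here is a minimal sketch of the whitespace check and clean-up described above. It is not the package's actual pre-processing code; the `protospacer` key and both helper names are assumptions made for illustration, and the records are plain dictionaries rather than a dataframe.

```python
def find_padded_guides(guides, column="protospacer"):
    """Return the sequences that carry leading or trailing whitespace.

    `guides` is a list of dicts; `column` is a hypothetical key name.
    """
    return [g[column] for g in guides if g[column] != g[column].strip()]


def strip_guide_whitespace(guides, column="protospacer"):
    """Return a copy of the guide records with whitespace trimmed from each sequence."""
    return [{**g, column: g[column].strip()} for g in guides]


guides = [{"protospacer": "ACGTACGT "}, {"protospacer": "TTGACCAA"}]
print(find_padded_guides(guides))      # sequences that need cleaning
print(strip_guide_whitespace(guides))  # cleaned copy, input left untouched
```

If the library is held in a pandas dataframe, the equivalent clean-up for a string column is `df[column].str.strip()`.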
Dear Basheer, yes, you were right: somehow some whitespace slipped into my guide list. Thank you for handling it! Unfortunately, I ran into another error:
So I initialized those four variables in the get_whitelist_reporter_counts_with_umi function in crispr_guide_contring.py:
but ended up with:
I guess observed_guide_reporter_sequence_input comes in as a string, but the infer_whitelist_sequence function requires a tuple.
Best,
Hi Thomas, I set encoded_whitelist_surrogate_sequences_series=None, so hopefully this error is fixed. There may be other variables related to the surrogate and barcode sequences that I forgot to initialize for the case where no surrogate or barcode is provided. Let me know if such errors pop up. New version: 0.0.184
Best,
Dear Basheer, yes, I guess there are more uninitialized variables.
Best,
Hi Thomas, I believe I fixed all these issues (tested on one of my protospacer-only samples). Try running again with version 0.0.191. Then you can retrieve your counts using any of the three:
depending on how you want to treat sequences that map ambiguously to more than one protospacer: tally +1 to every mapped protospacer, ignore ambiguous mappings and tally no protospacer, or tally a fractional +1/#_mapped to every mapped protospacer.
Best,
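To make the three ways of handling ambiguous mappings concrete, here is a small illustrative sketch. It is not the package's implementation or API: the `tally` function and the strategy names `"all"`, `"ignore"`, and `"fractional"` are hypothetical, and each read is represented simply as a tuple of the distinct protospacers it mapped to.

```python
from collections import Counter
from fractions import Fraction


def tally(read_matches, strategy):
    """Count reads per protospacer under one of three ambiguity strategies.

    read_matches: iterable of tuples, one per read, listing the distinct
                  protospacers that read mapped to (empty tuple = unmapped).
    strategy:     "all"        -> +1 to every mapped protospacer
                  "ignore"     -> count only reads with exactly one match
                  "fractional" -> +1/#_mapped to every mapped protospacer
    """
    if strategy not in ("all", "ignore", "fractional"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    counts = Counter()
    for matches in read_matches:
        if not matches:
            continue  # unmapped read contributes nothing
        if strategy == "all":
            for guide in matches:
                counts[guide] += 1
        elif strategy == "ignore":
            if len(matches) == 1:
                counts[matches[0]] += 1
        else:  # fractional: split one count evenly across the matches
            share = Fraction(1, len(matches))
            for guide in matches:
                counts[guide] += share
    return counts


reads = [("g1",), ("g1", "g2"), ("g2",), ()]
for name in ("all", "ignore", "fractional"):
    print(name, dict(tally(reads, name)))
```

With the fractional strategy the per-read contributions always sum to one, so the total count equals the number of mapped reads, whereas the "all" strategy can inflate totals when many reads are ambiguous.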
Dear Basheer, thank you, that's great! I tested it, there were no more error messages, and I do receive counts. I have two more questions:
Best,
Hi,
I am trying to run CRISPR-Correct with just one fastq file (R1 only). According to the README, this should be possible.
How should I define the fastq_r2_fn parameter, since it seems to be required? None, NULL, and an empty string '' do not work.
My python lines:
and this is my fastq file:
Best regards,
Thomas