Bug in msa2vcf ? #232
FWIW, it seems that the switch happens after an indel, as observed in a second example (not shown).
Hi, when using a fasta as input, the REFERENCE among the fasta records must be specified using option then option
Furthermore, you should also have a look at https://github.com/sanger-pathogens/snp_sites
Hi, thanks for the quick response. You're right, I hadn't quite appreciated the point about the consensus. From the help it says
So as I understand it, that only changes what appears in the first column, and doesn't affect what the sequence of the reference is. Are you saying it also changes the consensus?
Anyway, I tried to work around this by adding the same ref sequence twice (slightly changing the name of one to maintain uniqueness), followed by the non-ref sequence, so that the consensus should always be the reference. That worked for the above example, but it seemed to do something unexpected when the reference has gaps. Specifically, the input sequences were 5002 bp including gaps and 5000 bp excluding them. I was expecting the consensus to be 5000 bp, but it comes out as 5002 bp. Is that expected behaviour? If that isn't clear, I can cook up a reduced example as above.
snp_sites is what I tried first, but it doesn't seem to have any option to report INDELs, only SNPs. Maybe there's something I missed?
Anyway, happy to help further here if that's useful for you, but I decided to write this code myself, since I only have 2 sequences for my use case. Let me know.
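For anyone else with just two sequences, here is a rough sketch of the kind of code I mean. It is not msa2vcf's algorithm, only a minimal illustration under my own assumptions: both sequences come from the same pairwise alignment, `-` is the gap character, positions are 1-based in ungapped reference coordinates, and pure indels are anchored on the preceding base as in VCF. The names `call_variants` and `GAP` are made up for this example, and an indel in the very first column is not handled.

```python
GAP = "-"

def call_variants(ref_aln, alt_aln):
    """Return (pos, ref, alt) tuples; pos is 1-based on the ungapped reference."""
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both sequences; they carry no information.
    cols = [(r.upper(), a.upper()) for r, a in zip(ref_aln, alt_aln)
            if not (r == GAP and a == GAP)]
    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    i = 0
    while i < len(cols):
        if cols[i][0] == cols[i][1]:
            ref_pos += 1
            i += 1
            continue
        # Extend over the maximal run of differing columns.
        j = i
        while j < len(cols) and cols[j][0] != cols[j][1]:
            j += 1
        ref_allele = "".join(r for r, _ in cols[i:j] if r != GAP)
        alt_allele = "".join(a for _, a in cols[i:j] if a != GAP)
        n_ref = len(ref_allele)  # reference bases covered by this run
        pos = ref_pos + 1
        if not ref_allele or not alt_allele:
            # Pure insertion or deletion: anchor on the preceding matching base.
            if i == 0:
                raise ValueError("indel at alignment start is not handled here")
            anchor = cols[i - 1][0]
            ref_allele = anchor + ref_allele
            alt_allele = anchor + alt_allele
            pos = ref_pos
        variants.append((pos, ref_allele, alt_allele))
        ref_pos += n_ref
        i = j
    return variants

if __name__ == "__main__":
    for pos, ref, alt in call_variants("ACGT-ACGTA", "ACTTGAC-TA"):
        print(pos, ref, alt)
```

Because gap columns in the reference never advance `ref_pos`, the coordinates stay on the ungapped reference (5000 bp in my case, not 5002 bp), and REF is always taken from the reference row rather than from a consensus.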
Subject of the issue
Output seems incorrect.
Your environment
On Linux, see output for version info
Steps to reproduce
With this input:
and running
The first two are right, but the last two aren't: they are T->C, not C->T, and T->A, not A->T. Did I get that right? Here's the blast output to help visualise
Otherwise this tool seems like exactly what we need, so any help is much appreciated.
Thanks, ben
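In case it helps to reproduce the check, below is a small sketch of how the direction of each record can be tested against the reference sequence. It is only an illustration under assumptions: `reference` is the ungapped reference as a plain string, positions are 1-based, and `classify_record` is a hypothetical helper name, not part of msa2vcf or any library.

```python
def classify_record(reference, pos, ref, alt):
    """Classify a (pos, ref, alt) record against an ungapped reference string.

    Returns "ok" if REF matches the reference at pos, "swapped" if ALT matches
    instead (REF and ALT appear reversed), and "mismatch" otherwise.
    """
    start = pos - 1  # convert 1-based position to 0-based index
    if reference[start:start + len(ref)].upper() == ref.upper():
        return "ok"
    if reference[start:start + len(alt)].upper() == alt.upper():
        return "swapped"
    return "mismatch"

if __name__ == "__main__":
    reference = "ACGTACGTA"  # toy sequence, not the data from this issue
    for record in [(3, "G", "T"), (3, "T", "G"), (3, "C", "A")]:
        print(record, classify_record(reference, *record))
```

A record reported as "swapped" is the situation described above, where the REF column holds the non-reference base.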