The query sequence in the A3M format should be gapless, as discussed in #96. HH-suite provides a script, reformat.pl, that can handle this. Take a look and try to add this to conkit.
I was thinking about this code, and it might be simpler than it looks. Here's how to do it:
Parse the sequences in the alignment, convert each one into a list, and store them in a tuple. Create a variable called gap_positions = [].
Find all the gap indexes in the first (query) sequence, store them in gap_positions, and then remove the gaps using the pop() method, working from the highest index down so that earlier removals do not shift the remaining positions.
For every other (non-query) sequence, loop over gap_positions and check whether the character at that position is a gap. If it is, remove it using pop(); otherwise, convert the letter to lowercase.
That's it. Of course, there might be more efficient ways to do it, but those are the basics.
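The steps above can be sketched in Python as follows. This is only a minimal illustration, not conkit's actual implementation: the function name `to_a3m` is hypothetical, the input is assumed to be a list of equal-length aligned strings with the query first and `-` as the gap character, and it builds new strings instead of calling pop(), which avoids the index-shifting problem altogether.

```python
GAP = "-"  # assumed gap character


def to_a3m(sequences):
    """Hypothetical helper: make the query gapless, A3M style.

    `sequences` is a list of equal-length aligned strings, query first.
    """
    if not sequences:
        return []
    query = sequences[0]
    if any(len(seq) != len(query) for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    # Columns where the query has a gap
    gap_positions = {i for i, char in enumerate(query) if char == GAP}

    result = [query.replace(GAP, "")]
    for seq in sequences[1:]:
        out = []
        for i, char in enumerate(seq):
            if i in gap_positions:
                # Gap in both: drop the column. Residue here: insertion,
                # so keep it in lowercase.
                if char != GAP:
                    out.append(char.lower())
            else:
                out.append(char)
        result.append("".join(out))
    return result


print(to_a3m(["AC-DE", "ACGDE", "A--DE"]))
```

Note that after the conversion the sequences no longer all have the same length, because the lowercase insertions have no matching column in the query.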
However, many programs ignore the lowercase letters (which indicate insertions relative to the query sequence) yet still, incorrectly, ask for alignments in the A3M format (which by definition must display the insertions). conkit_convert could therefore also provide an output without insertions, in which case you simply remove those letters instead of converting them to lowercase.
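A minimal sketch of that insertion-free variant, under the same assumptions (equal-length aligned strings, query first, `-` as the gap character). The function name `drop_query_gap_columns` is hypothetical and is not an existing conkit_convert option; it simply keeps only the columns where the query has a residue.

```python
def drop_query_gap_columns(sequences, gap="-"):
    """Hypothetical helper: remove every column where the query has a gap."""
    if not sequences:
        return []
    query = sequences[0]
    if any(len(seq) != len(query) for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    # Indexes of the columns to keep (query has a residue there)
    keep = [i for i, char in enumerate(query) if char != gap]
    return ["".join(seq[i] for i in keep) for seq in sequences]


print(drop_query_gap_columns(["AC-DE", "ACGDE", "A--DE"]))
```

With this variant, every output sequence has the same length as the gapless query.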