I would like to remove all insertion columns from a multiple sequence alignment in Stockholm format (https://en.wikipedia.org/wiki/Stockholm_format). Are there any tools / scripts out there that facilitate this task?
It depends on what exactly you want. Do you want to create a gapless FASTA file? The esl-reformat utility in the HMMer package can do that, and so can the Perl script reformat.pl in HH-suite.
As I said, I want to remove insertion columns. These are annotated as ~ or . in the secondary-structure annotation in Stockholm format.
This is not the same as removing all gaps: I want to preserve deletions.
Unfortunately I have not found a utility in HMMer or Infernal that lets me do this. But maybe I'm missing something. Thanks!
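To make the request concrete, here is a minimal Python sketch of the operation described above. It is not an existing tool: the function name `strip_insert_columns` and the example alignment are made up for illustration. It assumes a single-block (non-interleaved) Stockholm file with one `#=GC SS_cons` line, and that every column marked `~` or `.` on that line is an insertion. Those columns are dropped from all sequence, `#=GC` and `#=GR` lines; gap characters in the remaining columns (deletions) are left alone.

```python
def strip_insert_columns(stockholm_text, insert_chars="~."):
    """Drop alignment columns flagged as insertions on the '#=GC SS_cons' line.

    Assumes a single-block Stockholm alignment (hypothetical helper, not part
    of HMMer, Infernal or Easel).
    """
    lines = stockholm_text.splitlines()

    def split_aligned(line):
        """Return (prefix, aligned_text) for per-column lines, else None."""
        fields = line.split()
        is_seq = len(fields) == 2 and not line.startswith("#")
        is_gc = len(fields) == 3 and fields[0] == "#=GC"
        is_gr = len(fields) == 4 and fields[0] == "#=GR"
        if not (is_seq or is_gc or is_gr):
            return None
        body = line.rstrip()
        text = fields[-1]
        # Keep the name/tag and its padding exactly as written.
        return body[: len(body) - len(text)], text

    # Locate the consensus secondary-structure annotation.
    ss_cons = None
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == "SS_cons":
            ss_cons = fields[2]
            break
    if ss_cons is None:
        raise ValueError("no '#=GC SS_cons' line found")

    # Columns to keep: everything not marked as an insertion.
    keep = [i for i, ch in enumerate(ss_cons) if ch not in insert_chars]

    out = []
    for line in lines:
        parts = split_aligned(line)
        if parts is None:
            out.append(line)  # header, '#=GF', '#=GS', '//' and blank lines
        else:
            prefix, text = parts
            out.append(prefix + "".join(text[i] for i in keep))
    return "\n".join(out) + "\n"


example = (
    "# STOCKHOLM 1.0\n"
    "seq1          ACG-uuGCA\n"
    "seq2          AC--..GCA\n"
    "#=GC SS_cons  <<<_..>>>\n"
    "//\n"
)
print(strip_insert_columns(example), end="")
```

On the toy alignment, the two columns marked `.` in SS_cons are removed, while the `-` deletions in seq1 and seq2 stay in place. An interleaved (multi-block) file would have to be converted to a single block first.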
I don't think what you want can be done without a reference sequence. In most cases the first sequence in the alignment (usually the query that was used to collect all the other sequences) serves as the reference, and all columns that are insertions relative to that reference can be removed. I don't know of a program that can do so based on the SS annotation in Stockholm or any other format.
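As a rough illustration of the reference-based approach, the sketch below drops every column in which the reference sequence has a gap. The function name `drop_insertions_relative_to` and the toy sequences are hypothetical, and it assumes the aligned rows have already been parsed into equal-length strings with `-` or `.` as gap characters.

```python
def drop_insertions_relative_to(reference, sequences, gap_chars="-."):
    """Remove columns where `reference` has a gap (insertions relative to it).

    `reference` and every entry of `sequences` are aligned rows of equal
    length. Gaps in the other rows are kept, so deletions are preserved.
    """
    keep = [i for i, ch in enumerate(reference) if ch not in gap_chars]
    return ["".join(seq[i] for i in keep) for seq in sequences]


# Toy alignment; the first row plays the role of the query/reference.
rows = ["AC-GT.A", "ACTG-cA", "A--GTgA"]
print(drop_insertions_relative_to(rows[0], rows))
# → ['ACGTA', 'ACG-A', 'A-GTA']
```

The reference comes out gapless, and the other rows keep their gaps wherever they lack a residue that the reference has.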
After compiling HMMer, esl-reformat will be in the easel/miniapps subdirectory.