Removing insertions from Stockholm format multiple sequence alignment file
11 weeks ago
becko • 0

I would like to remove all insertion columns from a multiple sequence alignment in Stockholm format (https://en.wikipedia.org/wiki/Stockholm_format). Are there any tools or scripts that facilitate this task?

11 weeks ago
Mensur Dlakic ★ 21k

It depends on what exactly you want. If the goal is a gapless FASTA file, the esl-reformat utility in the HMMER package can do that, as can the Perl script reformat.pl in HH-suite.


As I said, I want to remove insertion columns. These are annotated as ~ or . in the secondary-structure annotation in Stockholm format. This is not the same as removing all gaps: I want to preserve deletions.
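
To make the distinction concrete, here is a minimal Python sketch of what I mean. It is only an illustration under some assumptions: the file is a single-block (non-interleaved) Stockholm alignment, the insert columns are marked with `.` or `~` on a `#=GC SS_cons` line, and the function name `strip_insert_columns` is made up for this example rather than taken from any existing tool.

```python
def strip_insert_columns(stockholm_text, gc_tag="SS_cons", insert_chars=".~"):
    """Drop alignment columns whose '#=GC <gc_tag>' character marks an insertion.

    Assumes a single-block (non-interleaved) Stockholm alignment.
    """
    lines = stockholm_text.splitlines()

    # Build a per-column keep/drop mask from the chosen #=GC annotation line.
    keep = None
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == gc_tag:
            keep = [ch not in insert_chars for ch in fields[2]]
            break
    if keep is None:
        raise ValueError(f"no '#=GC {gc_tag}' line found")

    out = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.startswith(("#=GC", "#=GR")):
            per_column = True  # per-column annotation lines
        else:
            # Sequence lines are the non-empty lines that are neither markup nor '//'.
            per_column = bool(stripped) and not stripped.startswith(("#", "//"))
        if not per_column:
            out.append(line)  # header, #=GF, #=GS, blank lines, terminator
            continue

        text = stripped.split()[-1]  # aligned text is the last field
        if len(text) != len(keep):
            raise ValueError(f"unexpected alignment width in line: {line!r}")
        prefix = stripped[: len(stripped) - len(text)]  # name/tag plus padding
        out.append(prefix + "".join(ch for ch, k in zip(text, keep) if k))
    return "\n".join(out) + "\n"


EXAMPLE = "\n".join([
    "# STOCKHOLM 1.0",
    "seq1         AC.GU",
    "seq2         ACaGU",
    "seq3         A-.GU",
    "#=GC SS_cons <<.>>",
    "//",
]) + "\n"

print(strip_insert_columns(EXAMPLE))
```

In this toy alignment the third column is an insertion, so it is dropped from every row, while the `-` in seq3 (a deletion) is kept.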

Unfortunately, I have not found a utility in HMMER or Infernal that lets me do this. But maybe I'm missing something. Thanks!


I don't think what you want can be done without a reference sequence. In most cases the first sequence in the alignment, usually the query that was used to collect all the sequences, serves as the reference, and all columns that are insertions relative to that reference can be removed. I don't know of a program that can do so based on SS annotation in Stockholm or any other format.
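
The reference-based approach could look roughly like the Python sketch below. It assumes the alignment has already been read into a dict mapping sequence names to equal-length aligned strings, that the first entry is the reference unless another name is given, and that `-` and `.` are the gap characters. The function name `drop_reference_gap_columns` is hypothetical.

```python
def drop_reference_gap_columns(aligned, ref_name=None, gap_chars="-."):
    """Keep only the columns in which the reference sequence has a residue.

    `aligned` maps sequence names to aligned strings of equal length.
    """
    if ref_name is None:
        ref_name = next(iter(aligned))  # dicts preserve insertion order
    ref = aligned[ref_name]

    if any(len(seq) != len(ref) for seq in aligned.values()):
        raise ValueError("aligned sequences must all have the same length")

    # Columns where the reference has a gap are insertions relative to it.
    keep = [i for i, ch in enumerate(ref) if ch not in gap_chars]
    return {name: "".join(seq[i] for i in keep) for name, seq in aligned.items()}


alignment = {"query": "AC-GU", "hit1": "ACAGU", "hit2": "A--GU"}
print(drop_reference_gap_columns(alignment))
```

Here the third column is removed because the query has a gap there, while the deletion in hit2 at the second column is preserved.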

After compiling HMMER, esl-reformat will be in the easel/miniapps subdirectory.
