How to generate a multiple sequence alignment that retains the annotations from the original sequences?
18 months ago
DNAlias ▴ 30

I want to generate a multiple sequence alignment that retains the annotations I made on a list of protein sequences (e.g., in Stockholm format). Is there a way to do this, perhaps with Biopython or Biostrings?

In the end, all I want is an MSA with the same annotations as the original sequences, so any way of adding annotations to an unannotated MSA would work as well.

sequence alignment • 918 views
18 months ago
Mensur Dlakic ★ 12k

MUSCLE can do this. If you save your alignment in aligned FASTA (.afa) format, which is the default, all sequence headers will be preserved. Assuming your starting file is protein.fas and you are using MUSCLE v3 (v5 replaced these options with -align and -output):

muscle -in protein.fas -out protein.afa

After that, the alignment can be converted to Stockholm format using the esl-reformat utility that ships with HMMER:

esl-reformat stockholm protein.afa > protein.sto
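
Note that the FASTA header only carries per-sequence text. If the annotations are residue ranges (domains, for example), their coordinates shift once gaps are inserted, so they have to be projected onto alignment columns before they can be written as per-residue Stockholm markup. Below is a minimal sketch of that projection in plain Python. The function names, the one-letter domain code, and the `DOM` feature tag are all made up for illustration, and whether a downstream tool accepts a custom `#=GR` tag is something you would need to check.

```python
GAP_CHARS = "-."

def residue_columns(aligned_seq):
    """Return the 0-based alignment column of each ungapped residue, in order."""
    return [col for col, ch in enumerate(aligned_seq) if ch not in GAP_CHARS]

def project_domain(aligned_seq, start, end):
    """Map a 1-based inclusive residue range onto 1-based alignment columns."""
    cols = residue_columns(aligned_seq)
    return cols[start - 1] + 1, cols[end - 1] + 1

def gr_line(seq_name, aligned_seq, domains, feature="DOM"):
    """Build a Stockholm-style #=GR line for one aligned sequence.

    `domains` is a list of (one_letter_code, start, end) tuples with 1-based
    inclusive coordinates on the ungapped sequence. `feature` is a
    hypothetical tag name, not one defined by the Stockholm format.
    """
    track = ["."] * len(aligned_seq)
    for code, start, end in domains:
        first, last = project_domain(aligned_seq, start, end)
        for col in range(first - 1, last):
            # Leave gap columns unmarked so the track lines up with residues.
            if aligned_seq[col] not in GAP_CHARS:
                track[col] = code
    return f"#=GR {seq_name} {feature} {''.join(track)}"
```

For example, `gr_line("protA", "MK--TAYI-AK", [("D", 2, 5)])` marks residues 2 to 5 of the ungapped sequence, which end up in alignment columns 2 to 7. The resulting line would be placed in the .sto file next to the aligned sequence it describes.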

My files are currently in GenBank format. Is there a way to convert them to FASTA while retaining the annotations, so that they are compatible with MUSCLE?


Also, these are annotations I made on domains within each sequence, as opposed to on the sequence as a whole.


My sequences are currently in Geneious, so I have an annotation table, but I can only export the annotated file as GenBank.


The older version of esl-reformat is sreformat, and it can convert GenBank to FASTA.

sreformat fasta genome.gbk > genome.fas

You will need to go back to an older HMMER version, I think v2.3, to find this program.

If this doesn't work, Google is your friend. There should be plenty of programs or scripts that convert GenBank to FASTA.
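
As one illustration of such a script, here is a rough standard-library sketch that reads GenBank text and writes FASTA, copying simple `start..end` features into the header so they survive the alignment step. It is not a full GenBank parser: it assumes the usual flat-file layout (LOCUS, FEATURES, ORIGIN, `//`), ignores qualifiers and compound locations, and the `key:start-end` header convention is invented here. Real Geneious exports may need adjustments, and a proper parser such as Biopython's `SeqIO` would be the more robust choice.

```python
import re

# A feature line has the key indented by five spaces, followed by a
# simple "start..end" location. Qualifier lines are indented further.
FEATURE_RE = re.compile(r"^ {5}(\S+)\s+(\d+)\.\.(\d+)\s*$")

def genbank_to_fasta(text):
    """Convert GenBank text to FASTA, listing simple features in each header."""
    records = []
    name, features, chunks, section = None, [], [], None
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            name = line.split()[1]
            features, chunks, section = [], [], "header"
        elif line.startswith("FEATURES"):
            section = "features"
        elif line.startswith("ORIGIN"):
            section = "origin"
        elif line.startswith("//"):
            if name is not None:
                header = " ".join([name] + features)
                records.append(f">{header}\n{''.join(chunks).upper()}\n")
            name, section = None, None
        elif section == "features":
            match = FEATURE_RE.match(line)
            # Skip the whole-sequence "source" feature; keep everything else.
            if match and match.group(1) != "source":
                key, start, end = match.groups()
                features.append(f"{key}:{start}-{end}")
        elif section == "origin":
            # Sequence lines mix position numbers, spaces, and residues.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(records)
```

The returned string can be written to a file such as protein.fas and passed to the aligner. Since headers are preserved in the aligned FASTA output, the coordinates travel with each sequence, although they still refer to ungapped positions.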
