Translating Nucleotide MSA to Amino Acid MSA
8.1 years ago
weslfield ▴ 90

Hi, I have a nucleotide multiple sequence alignment that I would like to translate into an amino acid MSA based on the reading frame of a reference sequence in that alignment. I'm looking for the best way to do this, preferably with Biopython. Thanks!

alignment msa sequence • 2.8k views

Why is it better to translate a nucleotide sequence to an amino acid sequence for MSA?


DNA codons are redundant (several codons can encode the same amino acid); amino acids are not.

7.9 years ago
Whetting ★ 1.6k

Not sure you are still interested, but this solution assumes that the nucleotide alignment is a codon alignment. If it is not, you will end up with a lot of "X" in place of amino acids.

from Bio import SeqIO

with open("translated.fas", "w") as out:
    for record in SeqIO.parse("alignment.phy", "phylip"):  # change this to whichever format
        sequence = []
        # change the start (0) to 1 or 2 depending on the frame of the reference
        for c in range(0, len(record.seq), 3):
            codon = record.seq[c:c+3]
            if str(codon) == "---":
                # a fully gapped codon stays a single gap in the protein alignment
                sequence.append("-")
            elif "-" in str(codon) or len(codon) < 3:
                # partially gapped or incomplete trailing codon: amino acid unknown
                sequence.append("X")
            else:
                sequence.append(str(codon.translate()))
        # write each translated record in FASTA format
        print(">" + record.id, file=out)
        print("".join(sequence), file=out)
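
For anyone who wants to check the codon-by-codon logic without Biopython or an input file, here is a minimal pure-Python sketch of the same idea. The names translate_gapped and reference_frame are hypothetical helpers, not Biopython functions, and the table is the standard genetic code only. reference_frame simply assumes that the reference starts in frame at its first non-gap column and that any later gaps in it come in multiples of three; if that does not hold for your alignment, set the frame by hand.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reference_frame(reference):
    """Guess the frame (0, 1 or 2) from the reference's first non-gap column.

    Assumption: the reference is in frame starting at that column.
    """
    leading_gaps = len(reference) - len(reference.lstrip("-"))
    return leading_gaps % 3


def translate_gapped(seq, frame=0):
    """Translate one aligned nucleotide sequence codon by codon."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(seq), 3):
        codon = seq[i:i + 3]
        if codon == "---":
            protein.append("-")  # whole-codon gap stays a gap
        else:
            # partial gaps, ambiguous bases and incomplete codons become X
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


if __name__ == "__main__":
    alignment = {"ref": "ATG---AAATAA", "seq2": "ATGCC-AAATAA"}
    frame = reference_frame(alignment["ref"])
    for name, seq in alignment.items():
        print(">" + name)
        print(translate_gapped(seq, frame))
```

Because every sequence is cut at the same column positions, the translated sequences stay aligned to each other, which is the point of using the reference frame for the whole alignment.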