Get consensus from MUSCLE alignment with IUPAC ambiguities in Python

8.7 years ago

rimjhim.roy.ch ▴ 80

Hello,

I aligned my sequences with MUSCLE and now want to get the consensus sequence. The code I am using is:

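from Bio import AlignIO
from Bio.Align import AlignInfo
from Bio.Align.Applications import MuscleCommandline
from Bio.Alphabet import IUPAC, Gapped  # Bio.Alphabet only exists in Biopython < 1.78

alphabet = Gapped(IUPAC.ambiguous_dna)

muscle_exe = "muscle"  # assumed: path to the MUSCLE executable (not shown in the original post)
input_sequences = "input.fas"
output_alignment = "output.fas"

def align_v1(fasta):
    # Run MUSCLE on the input FASTA, then read the alignment it wrote
    muscle_cline = MuscleCommandline(muscle_exe, input=fasta, out=output_alignment)
    stdout, stderr = muscle_cline()
    alignment = AlignIO.read(output_alignment, "fasta")
    summary_align = AlignInfo.SummaryInfo(alignment)
    # dumb_consensus writes the single `ambiguous` character wherever no residue reaches the threshold
    return summary_align.dumb_consensus(ambiguous="N", consensus_alpha=alphabet)

consensus = align_v1(input_sequences)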

It just adds an N wherever the sequences disagree at a position. However, I want the consensus to use the specific IUPAC ambiguity codes (R, Y, M, K, W, S, etc.) instead. Is there a workaround for this?

Thanks a lot
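
For illustration, one possible workaround is to build the consensus column by column and look up the IUPAC code for the set of bases seen in each column. The sketch below is not part of the original post and makes several assumptions: `iupac_consensus` is a hypothetical helper, it takes the aligned sequences as plain strings (for example `[str(rec.seq) for rec in alignment]`), it ignores gaps unless the whole column is gaps, and it treats any unrecognized symbol as N.

```python
# IUPAC nucleotide codes keyed by the set of bases each one stands for
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
# Reverse lookup so ambiguity codes already present in the input are expanded
EXPAND = {code: bases for bases, code in IUPAC_CODES.items()}


def iupac_consensus(sequences, gap="-"):
    """Return one IUPAC symbol per alignment column (hypothetical helper)."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*sequences):
        bases = set()
        for symbol in column:
            symbol = symbol.upper()
            if symbol == gap:
                continue  # assumption: gaps do not contribute to the consensus
            # assumption: anything that is not a known code counts as N
            bases |= EXPAND.get(symbol, frozenset("ACGT"))
        # a column made only of gaps stays a gap
        consensus.append(IUPAC_CODES[frozenset(bases)] if bases else gap)
    return "".join(consensus)


print(iupac_consensus(["ACGT-A", "ACGC-A", "ATGY-G"]))  # prints AYGY-R
```

Note that this sketch reports every base observed in a column, so a single divergent sequence is enough to produce an ambiguity code. If rare variants should be ignored, a frequency threshold per base would need to be added before the lookup.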

muscle nucleotide consensus IUPAC alignment • 3.9k views

updated 18 months ago by Ram 43k • written 8.7 years ago by rimjhim.roy.ch ▴ 80