Get consensus from an MSA FASTA file with IUPAC ambiguities in Python
2.2 years ago
statamn • 0

Hello,

My question is very similar to the topic "Get consensus from Muscle alignment with IUPAC ambiguities in python".

I have a FASTA file of aligned sequences and I want to generate a consensus from it.

So far I have written:

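from Bio import AlignIO
from Bio.Align import AlignInfo

# Read the multiple sequence alignment from the FASTA file
alignment = AlignIO.read("input.fasta", "fasta")
summary_align = AlignInfo.SummaryInfo(alignment)
# Build the consensus with a threshold of 0.3 and print it
consensus = summary_align.dumb_consensus(0.3)
print(consensus)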

Here, "0.3" is meant to be the allele-frequency threshold for the consensus: for each column, any base present in at least 30% of the aligned sequences should be taken into account, and if more than one base meets this criterion, the corresponding IUPAC ambiguity code should be used.

But "dumb_consensus" only reports the most represented base, so in practice it never uses the IUPAC codes.

So is there a way to build such a consensus using Biopython (or Python in general)?

Thanks
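
For illustration, the rule described above (keep every base that reaches the threshold in a column, then translate the kept set into an IUPAC code) can be sketched in plain Python. This is only a minimal sketch, not a Biopython feature: the function name iupac_consensus and the table IUPAC_CODES are hypothetical. It also rests on two assumptions that the question does not settle: frequencies are computed over all sequences in the column, gaps included, and a column where no base reaches the threshold is written as "N".

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases each one represents.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_consensus(sequences, threshold=0.3):
    """Hypothetical helper: consensus string with IUPAC ambiguity codes.

    sequences is a list of aligned strings of equal length.
    """
    if not sequences:
        return ""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*sequences):
        counts = Counter(base.upper() for base in column)
        # Assumption: the denominator is the number of sequences, gaps included.
        total = len(column)
        kept = frozenset(
            base for base, n in counts.items()
            if base in "ACGT" and n / total >= threshold
        )
        # Assumption: if no base reaches the threshold, emit "N".
        consensus.append(IUPAC_CODES.get(kept, "N"))
    return "".join(consensus)


if __name__ == "__main__":
    example = ["ACGT", "ACGA", "ATGA", "GTGA"]
    print(iupac_consensus(example, threshold=0.3))
```

With an alignment read by AlignIO.read as in the snippet above, the input list could be built with [str(record.seq) for record in alignment]. Whether gaps should count towards the frequency, and what to emit for gap-dominated columns, is a design choice that depends on the data.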

python alignment IUPAC biopython consensus • 462 views