Question: Biopython: Getting Positions In A Multiple Sequence Alignment From A Reference Sequence In The Alignment
gravatar for weslfield
4.7 years ago by
European Union
weslfield90 wrote:

Ok, so I am writing an application that examines mutations at user defined positions within an MSA. Using BioPython to do this. It is fairly simple to examine columns in the alignment but their indexes are relative to the alignment and not a particular sequence (i.e. includes gap). I would like to be able to reference a sequence in the MSA by its ID and translate a sequence-specific position to a MSA-specific position. Bioperl allows you to do this like so —

$pos = $alignment -> column_from_residue_number('gi|388479282|ref|YP_491474.1|', 87);

This would assign some integer to $pos representing the column number in the gapped alignment of the ungapped sequence denoted by gi|388479282|ref|YP_491474.1|

Does anyone know a way to do this in BioPython given and alignment object? Thanks!

ADD COMMENTlink modified 4.7 years ago by a.zielezinski8.5k • written 4.7 years ago by weslfield90

I didn't find a straightforward way to do it. What I do is to count the gaps in the reference sequence until the desired position and add this number to the position number.

ADD REPLYlink written 4.7 years ago by Asaf5.0k
gravatar for a.zielezinski
4.7 years ago by
a.zielezinski8.5k wrote:

Maybe something like this:

from Bio.Alphabet import generic_dna
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.Align import MultipleSeqAlignment

def column_from_residue_number(aln, id, res_no):
    rec = next((r for r in aln if == id), None)
    j = 0
    for i, res in enumerate(rec.seq):
        if res!='-':
            if j==res_no:
                return i

alignment = MultipleSeqAlignment([
   SeqRecord(Seq("ACTGCTAGCTAG", generic_dna), id="Alpha"),
   SeqRecord(Seq("ACT-CTAGCTAG", generic_dna), id="Beta"),
   SeqRecord(Seq("ACTGCTAGDTAG", generic_dna), id="Gamma"),

print column_from_residue_number(alignment, "Beta", 3)

You will get:

ADD COMMENTlink modified 4.7 years ago • written 4.7 years ago by a.zielezinski8.5k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2059 users visited in the last hour