Pairwise Sequence Alignment Using Pairwise2 Module With Biopython
9.9 years ago
Mkl ▴ 100

Hello,

I am trying to compute pairwise sequence alignments with Biopython. I tried the following code as an example, but it only handles two sequences. My data set contains 300 sequences, and I need a pairwise alignment for each sequence: between the first and second sequence, the first and third sequence, and so on up to the last sequence. How can I rearrange this code? Is this possible with the pairwise2 module, or with any other module available in Biopython?

from Bio import pairwise2
from Bio.SubsMat import MatrixInfo as matlist
matrix = matlist.blosum62
gap_open = -10
gap_extend = -0.5
p53_human = "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP"
p53_mouse = "MEESQSDISLELPLSQETFSGLWKLLPPEDILPSPHCMDDLLLPQDVEEFFEGPSEALRV"
alns = pairwise2.align.globalds(p53_human, p53_mouse, matrix, gap_open, gap_extend)
top_aln = alns[0]
aln_human, aln_mouse, score, begin, end = top_aln
print(aln_human + '\n' + aln_mouse)


output:

MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP----
MEESQSDISLELPLSQETFSGLWKLLPPEDIL-PSP-HCMDDLLL-PQDVEEFF-EGPSEALRV

biopython python pairwise alignment • 8.1k views

Have you tried 2 nested for loops?


No, I haven't tried nested for loops.
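For illustration, here is a minimal sketch of the nested-loop idea using itertools.combinations from the standard library, which visits each unordered pair exactly once. The entries list and the unique_pairs helper are hypothetical stand-ins, not part of the original question; in practice the list would hold the (id, sequence) tuples read from the FASTA file.

```python
from itertools import combinations


def unique_pairs(entries):
    """Return each unordered pair of distinct entries exactly once."""
    return list(combinations(entries, 2))


# Hypothetical stand-in for (id, sequence) tuples read from a FASTA file
entries = [("seq1", "MEEPQ"), ("seq2", "MEESQ"), ("seq3", "MDDLM")]

for (idA, seqA), (idB, seqB) in unique_pairs(entries):
    # The pairwise alignment of seqA and seqB would go here
    print(idA, idB)
```

With 300 sequences this gives 300 * 299 / 2 = 44850 pairs, so each alignment call runs that many times.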


How do I import this package: from Bio import SeqIO?


Please do not revive 7-year-old threads, especially if the question is not directly related to the original topic. Please search the Web and StackExchange for Python-related questions and answers.

9.9 years ago

You can do it with something quick and dirty like this:

import sys
from Bio import SeqIO

# Read every record from the FASTA file given on the command line
entries = []
with open(sys.argv[1], 'r') as inFile:
    for entry in SeqIO.parse(inFile, 'fasta'):
        entries.append((entry.id, entry.seq))

# Pairs that have already been processed, in either order
done = {}

for entryA in entries:
    for entryB in entries:
        idA = entryA[0]
        idB = entryB[0]

        if (idA, idB) not in done:
            seqA = entryA[1]
            seqB = entryB[1]

            # DO YOUR PAIRWISE ALIGNMENT STUFF HERE WITH seqA and seqB

            done[(idA, idB)] = True
            done[(idB, idA)] = True


But why do you need to do these pairwise alignments? Are you trying to make a tree? There might be easier ways to accomplish what you want if you tell us your goal.


@DK Thank you very much for your code. I am trying to do hierarchical clustering. First I have to create a distance matrix based on the formula "distance = 100 - sequence identity". After that I have to do the clustering and also create a dendrogram.
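As a rough sketch of the formula quoted above, the snippet below computes percent identity from two already-aligned strings (such as aln_human and aln_mouse in the question) and converts it to a distance. The function names are hypothetical, and it assumes identity is measured over the full alignment length with gap columns counted in the denominator; other definitions of percent identity exist and would give different numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns where both sequences share a residue.

    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # A column counts as a match only if the residues agree and are not gaps
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def identity_distance(aligned_a, aligned_b):
    """distance = 100 - sequence identity, as stated in the thread."""
    return 100.0 - percent_identity(aligned_a, aligned_b)


# Toy aligned pair: 4 matching columns out of 6
print(identity_distance("MEEP-Q", "MEESAQ"))
```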


@DK Thank you very much for your code. I didn't get any output when I ran it. I am trying to do hierarchical clustering. First I have to create a distance matrix based on the formula "distance = 100 - sequence identity". After that I have to create a dendrogram based on the hierarchical clustering.
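One likely reason for the missing output is that the skeleton in the answer only holds a placeholder comment where the alignment goes and never prints anything. The sketch below shows one way the results could be collected into a symmetric distance matrix and printed. The distance_matrix helper and the pair_distance hook are hypothetical, and toy_distance is only a placeholder that compares equal-length strings position by position; a real run would align each pair first and apply "distance = 100 - sequence identity" to the aligned strings.

```python
def distance_matrix(entries, pair_distance):
    """Build a symmetric matrix of pairwise distances.

    entries: list of (id, sequence) tuples.
    pair_distance: hypothetical hook, a callable taking two sequences and
    returning a float; alignment plus the distance formula would go here.
    """
    n = len(entries)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = pair_distance(entries[i][1], entries[j][1])
            matrix[i][j] = d
            matrix[j][i] = d  # distances are symmetric
    return matrix


def toy_distance(seq_a, seq_b):
    # Placeholder only: percent of mismatched positions, no alignment done
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Hypothetical stand-in for (id, sequence) tuples read from a FASTA file
entries = [("s1", "MEEP"), ("s2", "MEES"), ("s3", "MDDL")]
matrix = distance_matrix(entries, toy_distance)
for (entry_id, _), row in zip(entries, matrix):
    print(entry_id, row)
```

If SciPy is installed, a square matrix like this can be converted to condensed form with scipy.spatial.distance.squareform and passed to scipy.cluster.hierarchy.linkage, whose result scipy.cluster.hierarchy.dendrogram can plot.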