Pairwise Sequence Alignment Using Pairwise2 Module With Biopython
12.9 years ago
Mkl ▴ 100

Hello,

I am trying to compute pairwise sequence alignments with Biopython. I tried the following code as an example, but it only handles two sequences. My data set contains 300 sequences, and I need a pairwise alignment for each sequence: the first against the second, the first against the third, and so on up to the last sequence. How can I rearrange this code? Is this possible with the pairwise2 module, or with any other module available in Biopython?

from Bio import pairwise2
from Bio.SubsMat import MatrixInfo as matlist
matrix = matlist.blosum62
gap_open = -10
gap_extend = -0.5
p53_human = "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP"
p53_mouse = "MEESQSDISLELPLSQETFSGLWKLLPPEDILPSPHCMDDLLLPQDVEEFFEGPSEALRV" 
alns = pairwise2.align.globalds(p53_human, p53_mouse, matrix, gap_open, gap_extend)  
top_aln = alns[0]
aln_human, aln_mouse, score, begin, end = top_aln
print(aln_human + '\n' + aln_mouse)

output:

MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP----
MEESQSDISLELPLSQETFSGLWKLLPPEDIL-PSP-HCMDDLLL-PQDVEEFF-EGPSEALRV
biopython python pairwise alignment • 10k views

Have you tried 2 nested for loops?


No, I haven't tried nested for loops.


How do I import this package: from Bio import SeqIO?


Please do not revive 7-year-old threads, especially if the question is not directly related to the original topic. Please browse the Web and StackExchange for Python-related questions and answers.

12.9 years ago

You can do it with something quick and dirty like this:

import sys
from Bio import SeqIO

# Read every FASTA record as an (id, sequence) tuple.
entries = []
with open(sys.argv[1], 'r') as inFile:
    for entry in SeqIO.parse(inFile, 'fasta'):
        entries.append((entry.id, entry.seq))

# Track which pairs have already been processed, in either order.
done = {}
for entryA in entries:
    for entryB in entries:
        idA = entryA[0]
        idB = entryB[0]

        if (idA, idB) not in done:
            seqA = entryA[1]
            seqB = entryB[1]

            #DO YOUR PAIRWISE ALIGNMENT STUFF HERE WITH seqA and seqB

            done[(idA, idB)] = True
            done[(idB, idA)] = True

But why do you need to do these pairwise alignments? Are you trying to make a tree? There might be easier ways to accomplish what you want if you tell us your goal.
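
As an alternative to tracking finished pairs in a dictionary, the same enumeration can be sketched with itertools.combinations from the standard library, which yields each unordered pair of distinct entries exactly once. The helper name all_pairs and the toy records are hypothetical and only stand in for the parsed FASTA entries; the alignment call itself is left as a comment.

```python
from itertools import combinations


def all_pairs(entries):
    """Return every unordered pair of distinct (id, sequence) entries."""
    return list(combinations(entries, 2))


# Hypothetical toy records standing in for the parsed FASTA entries.
entries = [("seq1", "MEEPQSD"), ("seq2", "MEESQSD"), ("seq3", "MEEPQSE")]

pairs = all_pairs(entries)
for (id_a, seq_a), (id_b, seq_b) in pairs:
    # The alignment call (for example pairwise2.align.globalds) would go here.
    print(id_a, id_b, len(seq_a), len(seq_b))
```

For n sequences this visits n*(n-1)/2 pairs, so 300 sequences give 44850 alignments and no self-comparisons.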


@DK Thank you very much for your code. I am trying to do hierarchical clustering. First I have to create a distance matrix based on the formula "distance = 100 - sequence identity". After that I have to do the clustering and also create a dendrogram.


@DK Thank you very much for your code. I didn't get any output using your code. I am trying to do hierarchical clustering. First I have to create a distance matrix based on the formula "distance = 100 - sequence identity". After that I have to create a dendrogram based on hierarchical clustering.
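
The formula "distance = 100 - sequence identity" could be sketched roughly as follows. This is a minimal illustration, not a definitive method: it assumes that percent identity means identical non-gap columns divided by the full alignment length (other definitions exist), and the function names, the aligned dictionary, and the toy aligned strings are all hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent of alignment columns holding the same residue in both rows.

    Assumption: gap columns count toward the length but never as matches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if not aln_a:
        return 0.0
    matches = sum(1 for a, b in zip(aln_a, aln_b) if a == b and a != "-")
    return 100.0 * matches / len(aln_a)


def distance_matrix(ids, aligned):
    """Build a symmetric matrix of 100 - percent identity.

    aligned maps (id_a, id_b) to the two gapped strings of that alignment.
    """
    index = {name: i for i, name in enumerate(ids)}
    matrix = [[0.0] * len(ids) for _ in ids]
    for (id_a, id_b), (aln_a, aln_b) in aligned.items():
        distance = 100.0 - percent_identity(aln_a, aln_b)
        i, j = index[id_a], index[id_b]
        matrix[i][j] = matrix[j][i] = distance
    return matrix


# Hypothetical toy alignments, one per unordered pair.
ids = ["a", "b", "c"]
aligned = {
    ("a", "b"): ("MEEP", "MEES"),
    ("a", "c"): ("MEEP", "ME-P"),
    ("b", "c"): ("MEES", "ME-P"),
}
dist = distance_matrix(ids, aligned)
for row in dist:
    print(row)
```

A square matrix like this could then be passed to a clustering routine; for example, SciPy's scipy.spatial.distance.squareform converts it to the condensed form that scipy.cluster.hierarchy.linkage accepts, and scipy.cluster.hierarchy.dendrogram can draw the resulting tree.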
