Local alignment in biopython
8.2 years ago
tinysnippets ▴ 40

Hi there, folks! I'm confused about Biopython's local alignment function (localxx from here on). I have a rather short DNA sequence, about 300 bp; let's call it S. I also have a large set of short (about 75 bp) sequences that are candidates to be prefixes of S; let's call these R. What I'd like to do is align the members of R to S with a local alignment algorithm. Say S is "PREFIXPART_LOOONGSUFFIXPART" and a member of R is "PREFIX"; this yields the following:

>>> from Bio import pairwise2
>>> for res in pairwise2.align.localxx("PREFIX", "PREFIXPART_LOOONGSUFFIXPART"): print(res)
('PREFI-----------------X----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PRE----------------F-IX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PREF-----------------IX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PRE-----------------FIX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PREFIX---------------------', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 6)

As you can see, it works as expected for these two strings, but with my real data it fails to see that the prefix aligns to the template one-to-one (or with very minor modifications), and instead of "compact" alignments I get very scattered ones. How should I handle this? Should I use some kind of scoring matrix or something?

sequence alignment • 2.3k views
8.2 years ago
Eric T. ★ 2.8k

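The function localxx applies no gap penalties, so a scattered alignment scores just as well as a compact one, but you do want to penalize gaps. You could try localxs instead, which takes a gap-open and a gap-extend penalty (both -1 here):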

>>> for res in pairwise2.align.localxs("PREFIX", "PREFIXPART_LOOONGSUFFIXPART", -1, -1): print(res)
('PREFIX---------------------', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 6)
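
To see why the penalties matter, here is a rough sketch in plain Python (not Biopython code) that rescores two of the alignments from the question. The helper `score_alignment` is hypothetical, and the gap convention is an assumption on my part: match = 1, mismatch = 0, the first position of a gap costs the open penalty and each further position costs the extend penalty.

```python
def score_alignment(a, b, begin, end, gap_open=0.0, gap_extend=0.0):
    """Score columns begin..end of a gapped alignment (match=1, mismatch=0).

    Assumed gap convention: gap_open for the first column of a gap run,
    gap_extend for every following column of the same run.
    """
    score = 0.0
    in_gap = False
    for x, y in zip(a[begin:end], b[begin:end]):
        if x == "-" or y == "-":
            score += gap_extend if in_gap else gap_open
            in_gap = True
        else:
            in_gap = False
            if x == y:
                score += 1.0
    return score


target = "PREFIXPART_LOOONGSUFFIXPART"
# Two of the localxx results from the question: (seqA, seqB, begin, end)
scattered = ("PREFI" + "-" * 17 + "X" + "-" * 4, target, 0, 23)
compact = ("PREFIX" + "-" * 21, target, 0, 6)

for name, (a, b, begin, end) in [("scattered", scattered), ("compact", compact)]:
    no_penalty = score_alignment(a, b, begin, end)
    penalized = score_alignment(a, b, begin, end, gap_open=-1, gap_extend=-1)
    print(name, no_penalty, penalized)
```

Without penalties both alignments tie at 6.0, so the aligner has no reason to prefer the compact one. With -1/-1 the 17-column gap drags the scattered alignment well below zero, while the compact one keeps its score of 6.0.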