Question: Biopython's PairwiseAligner, ".aligned" attribute doesn't work.
Mick10 wrote:

Hey guys,

I'm trying to use the Biopython PairwiseAligner like this:

from Bio import Align
aligner = Align.PairwiseAligner()

my_target = "ACTTGATCTTTCGT"
my_query = "CTTGATCT"

aligner.gap_score = -1
aligner.match = 1
aligner.mismatch = -1
aligner.query_end_gap_score = 0

alignments = aligner.align(my_target, my_query)
alignment = alignments[0]
print(alignment.aligned)

According to the tutorial, this should give me the start and end indices of the regions of each sequence that were aligned to the other:

Use the aligned property to find the start and end indices of subsequences in the target and query sequence that were aligned to each other. Generally, if the alignment between target (t) and query (q) consists of N chunks, you get two tuples of length N:

(((t_start1, t_end1), (t_start2, t_end2), ..., (t_startN, t_endN)),
 ((q_start1, q_end1), (q_start2, q_end2), ..., (q_startN, q_endN)))

In the current example, 'alignment.aligned' returns two tuples of length 2:

>>> alignment.aligned
(((0, 2), (4, 5)), ((0, 2), (2, 3)))

However I'm getting this error:

AttributeError: 'PairwiseAlignment' object has no attribute 'aligned'

Can someone please explain where I'm going wrong?

modified 12 days ago by Joe14k • written 12 days ago by Mick10

As far as I can tell, this may just be deprecated functionality that was never removed from the docs.

Could you post which Python and Biopython versions you're using? I'll try to bring this thread to the attention of the Biopython devs.

written 12 days ago by Joe14k

Not deprecated(?):

From the cookbook (16 July 2019), applied to your code:

from Bio import Align
aligner = Align.PairwiseAligner()

my_target = "ACTTGATCTTTCGT"
my_query = "CTTGATCT"

aligner.gap_score = -1
aligner.match = 1
aligner.mismatch = -1
aligner.query_end_gap_score = 0

alignments = aligner.align(my_target, my_query)
for alignment in alignments:
    print(alignment)

https://biopython.org/DIST/docs/tutorial/Tutorial.html

modified 12 days ago • written 12 days ago by gb1.0k

That doesn't appear to address the issue of the .aligned attribute being missing?

I can reproduce the problem on Biopython 1.73, and the online cookbook does indeed seem to suggest this functionality still exists, in section 6.5.2.7 Alignment object:

https://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc88

The aligner.align method returns PairwiseAlignment objects, each representing one alignment between the two sequences.

>>> from Bio import Align
>>> aligner = Align.PairwiseAligner()
>>> seq1 = "GAACT"
>>> seq2 = "GAT"
>>> alignments = aligner.align(seq1, seq2)
>>> alignment = alignments[0]
>>> alignment
<Bio.Align.PairwiseAlignment object at 0x10204d250>

Each alignment stores the alignment score:

>>> alignment.score
3.0

as well as pointers to the sequences that were aligned:

>>> alignment.target
'GAACT'
>>> alignment.query
'GAT'

Print the PairwiseAlignment object to show the alignment explicitly:

>>> print(alignment)
GAACT
||--|
GA--T
<BLANKLINE>

You can also represent the alignment as a string in PSL (Pattern Space Layout, as generated by BLAT [26]) format:

>>> format(alignment, 'psl')
'3\t0\t0\t0\t0\t0\t1\t2\t+\tquery\t3\t0\t3\ttarget\t5\t0\t5\t2\t2,1,\t0,2,\t0,4,\n'

Use the aligned property to find the start and end indices of subsequences in the target and query sequence that were aligned to each other. Generally, if the alignment between target (t) and query (q) consists of N chunks, you get two tuples of length N:

(((t_start1, t_end1), (t_start2, t_end2), ..., (t_startN, t_endN)),
 ((q_start1, q_end1), (q_start2, q_end2), ..., (q_startN, q_endN)))

In the current example, 'alignment.aligned' returns two tuples of length 2:

>>> alignment.aligned
(((0, 2), (4, 5)), ((0, 2), (2, 3)))

while for the alternative alignment, two tuples of length 3 are returned:

>>> alignment = alignments[1]
>>> print(alignment)
GAACT
|-|-|
G-A-T
<BLANKLINE>
>>> alignment.aligned
(((0, 1), (2, 3), (4, 5)), ((0, 1), (1, 2), (2, 3)))
  
modified 12 days ago • written 12 days ago by Joe14k

Update:

The Biopython devs on Twitter asked for this to be filed as a GitHub issue. You can follow its progress here: https://github.com/biopython/biopython/issues/2294

EDIT: resolved and closed.

modified 12 days ago • written 12 days ago by Joe14k
Mick10 wrote:

Hey Joe,

thank you so much for the help. It was indeed an issue with an older version of biopython on my system. Apparently I had biopython on my system twice and I updated the wrong installation.
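
For anyone else with more than one installation: one way to see which copy the interpreter actually picks up is to print the module's version and file path next to the interpreter path. This is only a minimal sketch; `describe_module` is a hypothetical helper name, not something Biopython provides.

```python
import importlib
import sys


def describe_module(name):
    """Return (version, path) for the module that `import name` resolves to."""
    module = importlib.import_module(name)
    # Not every module defines these attributes, so fall back to placeholders.
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "built-in")
    return version, path


print("interpreter:", sys.executable)
try:
    print("Bio:", describe_module("Bio"))
except ImportError:
    print("Biopython is not importable from this interpreter")
```

If the printed path points at a different site-packages directory than the one that was upgraded, running the upgrade as `python -m pip install --upgrade biopython` with that same interpreter should target the copy that is actually imported.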

Here are the versions and system I used:

>>> import sys; print(sys.version)
3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0]
>>> import platform; print(platform.python_implementation()); print(platform.platform())
CPython
Linux-4.15.0-48-generic-x86_64-with-Ubuntu-18.04-bionic
>>> import Bio; print(Bio.__version__)
1.73

I tried the update again and now I have the latest Biopython version installed.

>>> import Bio; print(Bio.__version__)
1.74

Now the output of my code snippet above is as expected:

(((1, 9),), ((0, 8),))
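
Reading that output: each inner pair is a half-open (start, end) range, so target[1:9] was aligned to query[0:8]. Below is a minimal sketch that slices the aligned chunks out of both sequences. `aligned_segments` is a hypothetical helper name, and the tuple is pasted in by hand from the output above so that the snippet runs without Biopython.

```python
def aligned_segments(target, query, aligned):
    """Pair up the target and query substrings for each aligned chunk."""
    target_blocks, query_blocks = aligned
    return [
        (target[t_start:t_end], query[q_start:q_end])
        for (t_start, t_end), (q_start, q_end) in zip(target_blocks, query_blocks)
    ]


my_target = "ACTTGATCTTTCGT"
my_query = "CTTGATCT"
aligned = (((1, 9),), ((0, 8),))  # copied by hand from the output above

segments = aligned_segments(my_target, my_query, aligned)
print(segments)  # [('CTTGATCT', 'CTTGATCT')]
```

With a real alignment object, `alignment.aligned` could be passed in place of the hand-written tuple.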

Thank you guys very much for the help! :)

P.S. Just out of curiosity: how did people work with the PairwiseAligner before this functionality was added? Knowing where the query string aligns seems like an integral part of many applications of the algorithm.

modified 12 days ago • written 12 days ago by Mick10

It would be fairly trivial to work that out from the sequence representations of the alignments. Something like the zip approach I used in my answer to "How to display mismatched sequences from alignment when using Biopython" comes to mind.
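
A rough sketch of that idea (not the code from the linked answer): walk the two gapped strings column by column with zip and record each run of columns where neither sequence has a gap. It assumes both strings have the same length and use "-" for gaps; `aligned_blocks` is a hypothetical name.

```python
def aligned_blocks(gapped_target, gapped_query, gap="-"):
    """Return (target_blocks, query_blocks) for runs of gap-free columns."""
    t_pos = q_pos = 0          # positions in the ungapped sequences
    t_start = q_start = 0      # start of the block currently being read
    t_blocks, q_blocks = [], []
    in_block = False
    for t_char, q_char in zip(gapped_target, gapped_query):
        both_present = t_char != gap and q_char != gap
        if both_present and not in_block:
            # A new aligned block starts at the current positions.
            t_start, q_start = t_pos, q_pos
            in_block = True
        elif not both_present and in_block:
            # A gap column closes the current block (end is exclusive).
            t_blocks.append((t_start, t_pos))
            q_blocks.append((q_start, q_pos))
            in_block = False
        if t_char != gap:
            t_pos += 1
        if q_char != gap:
            q_pos += 1
    if in_block:
        # Close a block that runs to the end of the alignment.
        t_blocks.append((t_start, t_pos))
        q_blocks.append((q_start, q_pos))
    return tuple(t_blocks), tuple(q_blocks)


print(aligned_blocks("GAACT", "GA--T"))  # (((0, 2), (4, 5)), ((0, 2), (2, 3)))
```

Mismatched columns stay inside a block here, since only gaps split blocks; the result for the tutorial example matches the `.aligned` output quoted earlier in the thread.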

To be honest, I've never actually used the PairwiseAligner() class. I've always used pairwise2 and parsed the outputs. If I'm doing any serious alignment task, I probably wouldn't reach for Python anyway, since the speed would not scale well with large datasets, which are not uncommon if you're doing things like all-vs-all pairwise alignments. I'd just find a command-line tool for it.

modified 12 days ago • written 12 days ago by Joe14k