Question

Biopython's PairwiseAligner: the ".aligned" attribute doesn't work

4.5 years ago

Mick ▴ 30

Hey guys,

I'm trying to use the Biopython PairwiseAligner as follows:

from Bio import Align
aligner = Align.PairwiseAligner()

my_target = "ACTTGATCTTTCGT"
my_query = "CTTGATCT"

aligner.gap_score = -1
aligner.match = 1
aligner.mismatch = -1
aligner.query_end_gap_score = 0

alignments = aligner.align(my_target, my_query)
alignment = alignments[0]
print(alignment.aligned)

According to the documentation, this should give me a representation of where the aligned regions occurred in each sequence:

Use the aligned property to find the start and end indices of subsequences in the target and query sequence that were aligned to each other. Generally, if the alignment between target (t) and query (q) consists of N chunks, you get two tuples of length N:

(((t_start1, t_end1), (t_start2, t_end2), ..., (t_startN, t_endN)),
 ((q_start1, q_end1), (q_start2, q_end2), ..., (q_startN, q_endN)))

In the current example, alignment.aligned returns two tuples of length 2:

>>> alignment.aligned
(((0, 2), (4, 5)), ((0, 2), (2, 3)))

However, I'm getting this error:

AttributeError: 'PairwiseAlignment' object has no attribute 'aligned'

Can someone please explain where I'm going wrong?

Written 4.5 years ago by Mick • updated 4.5 years ago by Joe


As far as I can tell, this may just be deprecated functionality that was never removed from the docs.

Could you post which Python and Biopython versions you're using? I'll try to bring this thread to the attention of the Biopython developers.

Reply by Joe (21k), 4.5 years ago


Not deprecated(?):

From the cookbook (16 July 2019), applied to your code:

from Bio import Align
aligner = Align.PairwiseAligner()

my_target = "ACTTGATCTTTCGT"
my_query = "CTTGATCT"

aligner.gap_score = -1
aligner.match = 1
aligner.mismatch = -1
aligner.query_end_gap_score = 0

alignments = aligner.align(my_target, my_query)
for alignment in alignments:
    print(alignment)

https://biopython.org/DIST/docs/tutorial/Tutorial.html

Reply by gb (2.2k), 4.5 years ago


That doesn't appear to address the issue of the .aligned attribute being missing?

I can reproduce the problem on Biopython 1.73, and the online cookbook does indeed suggest that this functionality exists, in section 6.5.2.7, "Alignment object":

https://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc88

The aligner.align method returns PairwiseAlignment objects, each representing one alignment between the two sequences.

>>> from Bio import Align
>>> aligner = Align.PairwiseAligner()
>>> seq1 = "GAACT"
>>> seq2 = "GAT"
>>> alignments = aligner.align(seq1, seq2)
>>> alignment = alignments[0]
>>> alignment
<Bio.Align.PairwiseAlignment object at 0x10204d250>

Each alignment stores the alignment score:

>>> alignment.score
3.0

as well as pointers to the sequences that were aligned:

>>> alignment.target
'GAACT'
>>> alignment.query
'GAT'

Print the PairwiseAlignment object to show the alignment explicitly:

>>> print(alignment)
GAACT
||--|
GA--T
<BLANKLINE>

You can also represent the alignment as a string in PSL (Pattern Space Layout, as generated by BLAT [26]) format:

>>> format(alignment, 'psl')
'3\t0\t0\t0\t0\t0\t1\t2\t+\tquery\t3\t0\t3\ttarget\t5\t0\t5\t2\t2,1,\t0,2,\t0,4,\n'

Use the aligned property to find the start and end indices of subsequences in the target and query sequence that were aligned to each other. Generally, if the alignment between target (t) and query (q) consists of N chunks, you get two tuples of length N:

(((t_start1, t_end1), (t_start2, t_end2), ..., (t_startN, t_endN)),
 ((q_start1, q_end1), (q_start2, q_end2), ..., (q_startN, q_endN)))

In the current example, alignment.aligned returns two tuples of length 2:

>>> alignment.aligned
(((0, 2), (4, 5)), ((0, 2), (2, 3)))

while for the alternative alignment, two tuples of length 3 are returned:

>>> alignment = alignments[1]
>>> print(alignment)
GAACT
|-|-|
G-A-T
<BLANKLINE>
>>> alignment.aligned
(((0, 1), (2, 3), (4, 5)), ((0, 1), (1, 2), (2, 3)))
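To make the format above concrete, here is a minimal sketch that turns the two tuples into the actual aligned substrings. It uses only plain Python slicing, so it does not need Biopython; the helper name aligned_segments is made up for illustration and is not part of the Biopython API.

```python
def aligned_segments(target, query, aligned):
    """Pair up the target and query substrings described by an
    ``aligned``-style tuple of (start, end) index pairs.

    ``aligned`` is assumed to have the layout quoted from the tutorial:
    (target_blocks, query_blocks), each a tuple of half-open (start, end)
    ranges. This helper is hypothetical, not a Biopython function.
    """
    target_blocks, query_blocks = aligned
    segments = []
    for (t_start, t_end), (q_start, q_end) in zip(target_blocks, query_blocks):
        # Python slices are half-open, matching the (start, end) convention.
        segments.append((target[t_start:t_end], query[q_start:q_end]))
    return segments


# Values copied from the tutorial example above.
first = (((0, 2), (4, 5)), ((0, 2), (2, 3)))
second = (((0, 1), (2, 3), (4, 5)), ((0, 1), (1, 2), (2, 3)))
print(aligned_segments("GAACT", "GAT", first))
print(aligned_segments("GAACT", "GAT", second))
```

For the first alignment this pairs "GA" with "GA" and "T" with "T"; the two target bases in between are the ones aligned to gaps.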

Reply by Joe (21k), 4.5 years ago


Update:

The Biopython developers on Twitter asked for this to be filed as a GitHub issue. You can follow its progress here: https://github.com/biopython/biopython/issues/2294

EDIT: resolved and closed.

Reply by Joe (21k), 4.5 years ago

score 0 · Answer 1 · 2019-10-09

Hey Joe,

Thank you so much for the help. It was indeed an issue with an older version of Biopython on my system. Apparently I had Biopython installed twice and I updated the wrong installation.
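For anyone hitting the same duplicate-installation problem, a rough way to see which copy the interpreter actually picks up is sketched below. It uses only the standard library (importlib.metadata needs Python 3.8 or newer, so it would not have run on my 3.6 setup); the helper name describe_install is made up, and "biopython" is assumed to be the distribution name as installed with pip.

```python
import importlib.util
import sys
from importlib import metadata


def describe_install(module_name, dist_name):
    """Report where a module would be imported from and which version of
    its distribution is visible to this interpreter.

    Returns None if the module cannot be found at all. Hypothetical helper
    for illustration only.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Module is importable but no matching distribution metadata exists.
        version = None
    return {"location": spec.origin, "version": version}


# The interpreter that is running, and the Biopython copy it would import.
print(sys.executable)
print(describe_install("Bio", "biopython"))
```

If the printed location is not inside the environment you just upgraded, the upgrade went to a different installation.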

Here are the versions and system I used:

>>> import sys; print(sys.version)
3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0]
>>> import platform; print(platform.python_implementation()); print(platform.platform())
CPython
Linux-4.15.0-48-generic-x86_64-with-Ubuntu-18.04-bionic
>>> import Bio; print(Bio.__version__)
1.73

I tried the update again and now I have the latest Biopython version installed.

>>> import Bio; print(Bio.__version__)
1.74

Now the output of my code snippet above is as expected:

(((1, 9),), ((0, 8),))

Thank you guys very much for the help! :)

P.S. Just out of curiosity: how did people work with the PairwiseAligner before this functionality was added? Knowing where the query string aligns seems integral to many applications of the algorithm.
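One workaround, as far as I can tell, is to recover the coordinates from the gapped alignment strings themselves (for example the gapped sequences that Bio.pairwise2 returns). Below is a minimal sketch of that idea in plain Python; the function name aligned_blocks is made up, and it assumes "-" is the gap character and that both gapped strings have the same length. It is not how Biopython computes .aligned internally.

```python
def aligned_blocks(gapped_target, gapped_query, gap="-"):
    """Derive ``aligned``-style (start, end) blocks from two gapped strings.

    A block is a maximal run of columns where neither sequence has a gap.
    Coordinates are positions in the ungapped sequences. Hypothetical helper
    for illustration only.
    """
    if len(gapped_target) != len(gapped_query):
        raise ValueError("gapped sequences must have the same length")

    t_pos = q_pos = 0          # positions in the ungapped sequences
    t_start = q_start = None   # start of the block currently being built
    t_blocks, q_blocks = [], []

    for t_char, q_char in zip(gapped_target, gapped_query):
        in_block = t_char != gap and q_char != gap
        if in_block and t_start is None:
            # A new ungapped block begins at this column.
            t_start, q_start = t_pos, q_pos
        elif not in_block and t_start is not None:
            # A gap closes the current block.
            t_blocks.append((t_start, t_pos))
            q_blocks.append((q_start, q_pos))
            t_start = q_start = None
        if t_char != gap:
            t_pos += 1
        if q_char != gap:
            q_pos += 1

    if t_start is not None:
        # Close a block that runs to the end of the alignment.
        t_blocks.append((t_start, t_pos))
        q_blocks.append((q_start, q_pos))
    return tuple(t_blocks), tuple(q_blocks)


# Gapped form of the alignment from my snippet above.
print(aligned_blocks("ACTTGATCTTTCGT", "-CTTGATCT-----"))
```

For the gapped strings of my example this gives the same (((1, 9),), ((0, 8),)) that .aligned reports in 1.74.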