Simple visual output of Pairwise Sequence alignment
3
1
Entering edit mode
5.9 years ago

I am writing a program in Python that uses BWA aligner to map reads to a reference plasmid genome which then outputs a consensus file and aligned against the reference genome using Clustal Omega. I would like to output an easy to visualise graphic of the alignment that is similar to the bottom alignment map in Benchling (figure 2 & 3 in the link attached as an example).

http://2017.igem.org/Team:Edinburgh_OG/Results/T7

I have checked on Benchling's GitHub and their source code to see if they are using an existing open source program to create these visuals or they have created their own program to do so but can't seem to find where this graphic is coming from. I haven't had much experience working with creating graphics so if any one has a few ideas of how to go about creating a graphic that is similar to this I would appreciate some guidance. I have looked into using Tkinter and PIL but not quite sure how to go about using them as the sequences are between 8000-12000 bp's long and I am not sure that colouring each individual pixel depending on if it matches the reference is the best course of action as I think that would make the image quite lengthy.

alignment pairwise python sequence visualisation • 2.8k views
ADD COMMENT
0
Entering edit mode

I think you forgot the link?

In addition, what is your reason for using Clustal Omega for the alignment as it produces a multiple sequence alignment - wouldnt, say, EMBOSS needle (or some dedicated genome aligner) make more sense?

Edit: question about usage of Clustal Omega

ADD REPLY
0
Entering edit mode

Link is now visible.

ADD REPLY
0
Entering edit mode

Since the plasmid genomes were so short I used Clustal Omega but EMBOSS sounds like it would be better. I will try that, thanks!

ADD REPLY
2
Entering edit mode
5.8 years ago
vimalkvn ▴ 320

I had implemented something similar in RiboPlot to mark the positions of start and stop codons in three reading frames. This was done using Matplotlib. Code is on github and the plot is generated in the plot_profile function.

enter image description here

ADD COMMENT
0
Entering edit mode

Thank you, I'll definitely take a look at your program!

ADD REPLY
0
Entering edit mode
5.8 years ago

I have come up with a simple solution using PIL. This uses the aligned sequences in one FastA file.

from PIL import Image, ImageDraw
from Bio import SeqIO
from operator import itemgetter
from itertools import groupby

def diff(a, b):
    index_list = []
    for index, (char1, char2) in enumerate(zip(a, b)):
        if char1 != char2:
            index_list.append(index)
    return index_list

fasta_sequences = SeqIO.parse(open("aligned_seqs.fasta"),'fasta')

seq1 = ''
seq2 = ''
count = 0
for fasta in fasta_sequences:
    name, sequence = fasta.id, str(fasta.seq)
    if count == 0:
        seq1 = sequence
    elif count == 1:
        seq2 = sequence
    count +=1

d = diff(seq1, seq2)

ranges = []
for k,g in groupby(enumerate(d),lambda x:x[0]-x[1]):
    group = (map(itemgetter(1),g))
    group = list(map(int,group))
    ranges.append((group[0],group[-1]))
print(ranges)

if len(seq1) >= len(seq2):
    image_len = len(seq1)
else:
    image_len = len(seq2)
# size of image
canvas = (image_len, 600)

# scale ration
scale = 6
thumb = canvas[0]/scale, canvas[1]/5

# init canvas
im = Image.new('RGBA', canvas, (255, 255, 255, 255))
draw = ImageDraw.Draw(im)

draw.rectangle([0, 0, (len(seq1)), 150], fill="gray")
draw.rectangle([0, 250, (len(seq2)), 400], fill="gray")

# draw rectangles that don't match the reference sequence
for start, end in ranges:
    draw.rectangle([start, 250, end, 400], fill="red")

# make thumbnail
im.thumbnail(thumb)

# save image
im.save('./seq_align_PIL.png')
ADD COMMENT
0
Entering edit mode

Could you post an image, would be interesting to see how it looks like.

ADD REPLY
1
Entering edit mode
ADD REPLY
1
Entering edit mode

Aligned Sequence Image

ADD REPLY
0
Entering edit mode
5.7 years ago
gb ★ 2.2k

When I read plasmid and visualization I think off circos.

http://circos.ca/

ADD COMMENT

Login before adding your answer.

Traffic: 2652 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6