Sequence Alignment Visualization
9.3 years ago
BDK_compbio ▴ 140

I have five FASTA files and I need to visualize how particular segments are shared among the sequences. The problem is similar to multiple sequence alignment. I have a table that contains each anchor's ID, its length, and its starting position in all or some of the sequences. The table looks like the following:

Anchor   Length   S1    S2    S3    S4    S5
1        49       -     100   102   -     105
2        63       201   -     205   200   -
3        75       324   325   -     -     326

Here the first anchor is shared by Sequence 2, Sequence 3 and Sequence 5.

What tool should I use to visualize the above information?

alignment sequence • 3.1k views
9.3 years ago

If you are comparing aligned sequences within an anchor class, perhaps rotate your table 90 degrees and copy results from a per-anchor multiple alignment with NEEDLE (or a similar tool, depending on your parameters) and paste each line into the anchor-by-sequence cell. Then present that table. So long as your sequences are of equal length and you use a monospaced font, the aligned bases will line up.

If you need a graphical Circos-like figure, consider a five-spindle hive plot. Each of the five spindles is a range representing the minimum to maximum of the sequence start and end positions -- from 100 to 401 -- rescaled to a normalized range of 0 to 1.

Draw three colors of hive ribbons for each of the three classes ("anchors") that connect from one spindle/sequence to the other. The width of the ribbon is the (normalized) length parameter in your table, and a ribbon's start position along a spindle is set by the sequence start position.

For example, for Anchor 1, we use spindles 2, 3 and 5 to represent sequences S2, S3 and S5. The length of every spindle is 401 - 100, or 301 units. Ribbons for Anchor 1 are therefore of width (49 / 301), or about 16% of the length of a spindle. We draw a ribbon from S2 to S3. Given a "unit" spindle of normalized length 1, spindle 2's ribbon starts at the 0 position and ends at the 0.16 position. It connects with spindle 3, where it starts at a normalized position of (2 / 301), or about 0.01, and ends at 0.01 + 0.16 = 0.17. Another "Anchor 1" ribbon connects S3 to S5, from [0.01, 0.17] on spindle 3 to [0.02, 0.18] on spindle 5.

This process is repeated for the other two anchors. Anchor 3 ribbons would mostly hug the outer range of the spindles, since they start at ((324 - 100) / 301) = 0.74 and work their way out very slightly.
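The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the normalization, not part of hiveR or any plotting tool; the names `ANCHORS`, `spindle_range` and `normalized_ribbons` are made up for this example, and the table values are copied from the question.

```python
# Anchor table from the question: anchor -> (length, {sequence: start position}).
# Sequences marked "-" in the table are simply left out.
ANCHORS = {
    1: (49, {"S2": 100, "S3": 102, "S5": 105}),
    2: (63, {"S1": 201, "S3": 205, "S4": 200}),
    3: (75, {"S1": 324, "S2": 325, "S5": 326}),
}


def spindle_range(anchors):
    """Return (smallest start, largest end) over every anchor occurrence."""
    lo = min(s for _, starts in anchors.values() for s in starts.values())
    hi = max(s + length
             for length, starts in anchors.values()
             for s in starts.values())
    return lo, hi


def normalized_ribbons(anchors):
    """Map anchor -> {sequence: (start, end)} rescaled to the 0..1 spindle."""
    lo, hi = spindle_range(anchors)
    span = hi - lo  # 401 - 100 = 301 for the table above
    ribbons = {}
    for anchor, (length, starts) in anchors.items():
        width = length / span
        ribbons[anchor] = {
            seq: ((s - lo) / span, (s - lo) / span + width)
            for seq, s in starts.items()
        }
    return ribbons


if __name__ == "__main__":
    for anchor, per_seq in normalized_ribbons(ANCHORS).items():
        for seq, (start, end) in sorted(per_seq.items()):
            print(f"anchor {anchor} {seq}: {start:.2f} to {end:.2f}")
```

Each (start, end) pair is where a ribbon for that anchor would attach along the corresponding spindle; ribbons are then drawn between every pair of sequences that share the anchor.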

You could use three colors to denote ribbons for each anchor-class. Further, because ribbons would overlap, you could apply a fractional opacity to the three ribbon colors. This allows the viewer to see connections that span intermediate spindles, which might overlap ribbons from other anchor-classes.

One tool I have used to make publication-quality hive plots is hiveR. There is also a d3-based version for web visualizations. This variant is probably a close rendition of your scenario, but with different widths, start and end positions for ribbons, and two more spindles:

< image not found >


Hi Alex,

Thanks! I am actually looking for a tool similar to http://circos.ca/ or a tool whose output is five horizontal lines with edges connecting the anchors.


See edited answer.


Thanks again. Figures like the following would be a little better.


Yeah, you can definitely make that with Circos!


I was trying that, but I am still struggling with the data table format: http://circos.ca/tutorials/lessons/configuration/data_files/

Could you please suggest what modifications need to be made? I was also looking at the online Circos table viewer, but could not understand what the table represents: http://mkweb.bcgsc.ca/tableviewer/
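One possible way to get from the anchor table to a Circos links file is sketched below. It relies on the Circos link line format (`chr1 start1 end1 chr2 start2 end2 options`), and several things here are assumptions for illustration: the chromosome names `s1` to `s5` must match whatever IDs are defined in your karyotype file, the end coordinate is taken as start + length, and the colors and the helper name `circos_links` are arbitrary choices.

```python
from itertools import combinations

# Anchor table from the question: anchor -> (length, {sequence: start position}).
ANCHORS = {
    1: (49, {"S2": 100, "S3": 102, "S5": 105}),
    2: (63, {"S1": 201, "S3": 205, "S4": 200}),
    3: (75, {"S1": 324, "S2": 325, "S5": 326}),
}

# One color per anchor (assumed; any color defined in your Circos setup works).
COLORS = {1: "red", 2: "green", 3: "blue"}


def circos_links(anchors, colors):
    """Build one link line per pair of sequences that share an anchor."""
    lines = []
    for anchor, (length, starts) in sorted(anchors.items()):
        for (a, sa), (b, sb) in combinations(sorted(starts.items()), 2):
            lines.append(
                f"{a.lower()} {sa} {sa + length} "
                f"{b.lower()} {sb} {sb + length} "
                f"color={colors[anchor]}"
            )
    return lines


if __name__ == "__main__":
    # Redirect this output to a file and reference it from a <link> block.
    print("\n".join(circos_links(ANCHORS, COLORS)))
```

The karyotype file still has to define each sequence with its real length, which is not in the table and would have to come from the FASTA files.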
