Pangenome visualization: flowerplot?
9.6 years ago
Christian ▴ 30

I have partitioned the gene complement of 21 strains of a particular species of Strep into core, dispensable, and unique sets, but I'm at a loss as to how best to represent these data. I originally thought of a Venn diagram illustrating the total size of each partition, but wasn't satisfied that I could accurately portray my data that way. I did some searching and found a very interesting method that gives a more appropriate representation (Fig. A, B and C), something the authors called a "flowerplot" [1] (a new one on me). I've been trying to re-create this sort of visualization manually using matplotlib in Python 3.3, as there doesn't seem to be any package that produces comparable output more automatically, but I haven't had much success.

What I've tried:

  • Adding Ellipse patches to Cartesian axes. Not satisfactory, because the Ellipse patch's xy argument places the center of the ellipse at (x, y), whereas I need some way to rotate each ellipse about the origin to achieve the desired effect.
  • Adding Ellipse patches to polar axes. This was a complete mess; I can get one good ellipse but can't place any others reliably (most likely due to my limited understanding of polar coordinates!).

Additionally, using Ellipse patches might not end up being the best move, since I'll need to annotate each ellipse with, at the very least, strain ID and count information.

Is anyone familiar with a way to either effectively visualize these data, or perhaps duplicate the plots I linked?
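One way around the "rotate about the origin" problem described above is to stay on Cartesian axes and compute each ellipse's center and angle from the petal's direction. The following is only a minimal sketch under stated assumptions: the function names `petal_geometry` and `flowerplot`, the strain IDs, and all counts are hypothetical placeholders, not data or code from the cited paper. It relies on matplotlib's `Ellipse(xy, width, height, angle=...)`, where `angle` is the counter-clockwise rotation in degrees about the ellipse's own center.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Circle, Ellipse


def petal_geometry(index, n_petals, petal_length=4.0):
    """Return (center_x, center_y, angle_deg) for one petal.

    The major axis points radially outward and one tip touches the
    origin, so the center sits half a petal length from the origin.
    """
    theta = 2.0 * math.pi * index / n_petals
    half = petal_length / 2.0
    return half * math.cos(theta), half * math.sin(theta), math.degrees(theta)


def flowerplot(unique_counts, core_count, petal_length=4.0, petal_width=1.0):
    """Draw one petal per strain plus a central disc for the core genes."""
    fig, ax = plt.subplots(figsize=(7, 7))
    n_petals = len(unique_counts)
    for i, (strain, count) in enumerate(unique_counts.items()):
        x, y, angle = petal_geometry(i, n_petals, petal_length)
        ax.add_patch(Ellipse((x, y), width=petal_length, height=petal_width,
                             angle=angle, facecolor="skyblue",
                             edgecolor="black", alpha=0.6))
        # Unit vector along the petal, used to place the two labels.
        ux = math.cos(math.radians(angle))
        uy = math.sin(math.radians(angle))
        # Count inside the petal near its outer tip, strain ID just beyond it.
        ax.text(0.8 * petal_length * ux, 0.8 * petal_length * uy, str(count),
                ha="center", va="center")
        ax.text(1.15 * petal_length * ux, 1.15 * petal_length * uy, strain,
                ha="center", va="center")
    # Central disc drawn on top of the overlapping petal bases.
    ax.add_patch(Circle((0, 0), radius=0.25 * petal_length,
                        facecolor="white", edgecolor="black", zorder=3))
    ax.text(0, 0, "Core\n{}".format(core_count),
            ha="center", va="center", zorder=4)
    limit = 1.3 * petal_length
    ax.set_xlim(-limit, limit)
    ax.set_ylim(-limit, limit)
    ax.set_aspect("equal")
    ax.axis("off")
    return fig, ax


# Hypothetical example data: strain ID -> number of unique genes.
example_unique = {"strain_A": 12, "strain_B": 40, "strain_C": 7,
                  "strain_D": 23, "strain_E": 15}

if __name__ == "__main__":
    fig, ax = flowerplot(example_unique, core_count=1500)
    fig.savefig("flowerplot.png", dpi=150)
```

Because each label is positioned from the same direction vector as its petal, the strain ID and count stay attached to the right ellipse however many strains there are. With 21 strains, `petal_width` may need to be reduced so that neighbouring petals do not overlap near the tips.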

[1] Sugawara et al. (2013). Comparative genomics of the core and accessory genomes of 48 Sinorhizobium strains comprising five genospecies. Genome Biology 14:R17. doi:10.1186/gb-2013-14-2-r17 or http://genomebiology.com/2013/14/2/R17.

genome visualization • 3.0k views

The flowerplots you link to would be better represented as a simple table with a "sum" or "total" column for the genus/species. Then you could sort them in a meaningful order.
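A rough sketch of such a table, assuming per-strain counts are already available as (core, dispensable, unique) tuples. The function names, strain IDs, and numbers here are all made up for illustration.

```python
def summary_table(counts, sort_key="total"):
    """Build one row per strain with a 'total' column, sorted descending."""
    rows = []
    for strain, (core, dispensable, unique) in counts.items():
        rows.append({
            "strain": strain,
            "core": core,
            "dispensable": dispensable,
            "unique": unique,
            "total": core + dispensable + unique,
        })
    rows.sort(key=lambda row: row[sort_key], reverse=True)
    return rows


def format_table(rows):
    """Render the rows as tab-separated text with a header line."""
    header = ["strain", "core", "dispensable", "unique", "total"]
    lines = ["\t".join(header)]
    for row in rows:
        lines.append("\t".join(str(row[name]) for name in header))
    return "\n".join(lines)


# Hypothetical counts: strain ID -> (core, dispensable, unique).
example_counts = {
    "strain_A": (1500, 200, 10),
    "strain_B": (1500, 350, 40),
    "strain_C": (1500, 120, 25),
}

if __name__ == "__main__":
    print(format_table(summary_table(example_counts)))
    print()
    print(format_table(summary_table(example_counts, sort_key="unique")))
```

Changing `sort_key` (for example to "unique") is the "meaningful order" part: the same rows can be ranked by whichever partition is of interest.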

9.6 years ago
Ryan Dale 5.0k

One option might be "binary heatmaps". I find them useful for visualization of combinatorial ChIP-seq binding, for example, Figure 6E here.

Using the row/column format of that figure, in your case you would have a row for each strain and a column for each gene. Instead of just a black/white 1/0 as in that example, you could encode the gene type as 1/2/3 for core/dispensable/unique. So you'd have at least 3 colors in the heatmap. The trick would then be to play around with different sort orders or clustering to get some meaningful interpretation.
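A minimal sketch of that encoding, assuming you start from a strains x genes presence/absence matrix of 0s and 1s. The function names, the extra code 0 for "absent", the colors, and the column sort by number of carrier strains are illustrative choices rather than anything taken from the linked figure.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def classify_genes(presence):
    """Code a strains x genes 0/1 matrix as 0 absent, 1 core,
    2 dispensable, 3 unique."""
    presence = np.asarray(presence, dtype=int)
    n_strains = presence.shape[0]
    carriers = presence.sum(axis=0)  # number of strains carrying each gene
    gene_class = np.where(carriers == n_strains, 1,
                          np.where(carriers == 1, 3, 2))
    # Broadcasting keeps absent cells at 0 and stamps the class elsewhere.
    return presence * gene_class


def sort_columns(coded):
    """Order genes from most widely shared to least widely shared."""
    carriers = (coded > 0).sum(axis=0)
    order = np.argsort(-carriers, kind="stable")
    return coded[:, order]


def plot_heatmap(coded, strain_ids):
    """Draw the coded matrix with one color per class."""
    cmap = ListedColormap(["white", "black", "steelblue", "firebrick"])
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.imshow(coded, cmap=cmap, vmin=0, vmax=3,
              aspect="auto", interpolation="nearest")
    ax.set_yticks(range(len(strain_ids)))
    ax.set_yticklabels(strain_ids)
    ax.set_xlabel("genes")
    return fig, ax


# Hypothetical toy input: 3 strains x 4 genes.
example_presence = [[1, 1, 0, 1],
                    [1, 0, 0, 1],
                    [1, 0, 1, 0]]

if __name__ == "__main__":
    coded = sort_columns(classify_genes(example_presence))
    fig, ax = plot_heatmap(coded, ["strain_A", "strain_B", "strain_C"])
    fig.savefig("pangenome_heatmap.png", dpi=150)
```

Sorting columns by carrier count puts the core block on the left and the unique genes on the right; swapping `sort_columns` for a hierarchical clustering of rows and columns is the other obvious ordering to try.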


I was thinking about something along these lines, but was so enamored of the figure I referenced I couldn't move on without consulting the world at large ;)
