cell sizes in R plots
10.0 years ago
Assa Yeroslaviz ★ 1.8k

Hi,

This is a follow-up to this question.

I have created a plot using the ggplot2 package. But as the matrix is very large (almost 146000 rows), the individual cells of the image are quite small.

I would like to know how to make the individual cells bigger, so that the differences between cells are easier to see in the image.

I would also like to know whether it is possible to get a bigger (longer) legend with more than just five elements. I would like at least 20 distinguishable colour groups.

(A logic question: why is it plotted as a triangle?)

This is how I create the plot:

library(ggplot2)
pl1 <- ggplot(data, aes(y = partner1, x = partner2)) +
  geom_tile(aes(fill = Substract)) +
  scale_fill_continuous(low = "blue", high = "green") +
  scale_size(range = c(1, 20000))

Thanks
Assa

plotting ggplot2 R • 3.4k views

By "legend", do you instead mean tick labels? The actual legend is the color bar on the right.

BTW, the answer to your logic question is that your data is in a triangular shape (well, partner1~partner2 is). This is likely due to how you generated those values.

Edit: You can check the shape I mentioned above with with(data, table(partner1 > partner2)).
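To illustrate that check, here is a minimal sketch on made-up data. The bin values and the assumption that each pair is stored only once (with partner1 <= partner2) are hypothetical, not taken from the real data set:

```r
# Hypothetical data: each pair of bins is stored once, with partner1 <= partner2
bins <- seq(0, 9000, by = 1000)
data <- subset(expand.grid(partner1 = bins, partner2 = bins),
               partner1 <= partner2)

# If every pair falls on one side of the diagonal, the table has a single
# FALSE entry, and geom_tile() can only fill one triangle of the plot
shape <- with(data, table(partner1 > partner2))
print(shape)
```

If the table shows both TRUE and FALSE counts, the triangle has some other cause.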


Actually I do mean the bar at the side of the image :-)


If you just want more colors then try scale_fill_gradientn().
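A rough sketch of what that could look like, assuming the column names from the question. The data here are randomly generated stand-ins, and the palette, number of breaks and legend height are arbitrary choices:

```r
library(ggplot2)

# Hypothetical stand-in for the real data (same column names as in the question)
set.seed(1)
bins <- seq(0, 49000, by = 1000)
data <- subset(expand.grid(partner1 = bins, partner2 = bins),
               partner1 <= partner2)
data$Substract <- rexp(nrow(data), rate = 0.1)

# 20 colours interpolated between blue and green
pal <- colorRampPalette(c("blue", "green"))(20)

pl2 <- ggplot(data, aes(x = partner2, y = partner1)) +
  geom_tile(aes(fill = Substract)) +
  scale_fill_gradientn(
    colours = pal,
    breaks = pretty(data$Substract, n = 20)  # ask for roughly 20 legend labels
  ) +
  theme(legend.key.height = unit(2, "cm"))   # stretch the colour bar

print(pl2)
```

The fill scale is still continuous here; for 20 truly discrete groups you would bin the values first (for example with cut()) and use a discrete fill scale.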


I wonder whether a heatmap is the best way to visualize your results in the first place. In the heatmap, genomic positions are converted to non-numeric values, which makes it hard to see relative distances along the genome. Furthermore, I think there is too much data per pixel, and the distribution of the subtraction values is heavily skewed towards low numbers, which makes it hard to tell the colours apart (maybe you should log-transform the values).
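A minimal sketch of such a log-transform, on made-up values. The pseudocount of 1 is an assumption to keep zeros finite, and it only makes sense if the values are non-negative:

```r
# Hypothetical, right-skewed values standing in for the Substract column
set.seed(1)
Substract <- c(0, rexp(999, rate = 0.1))

# log10 with a pseudocount of 1 so that zeros stay finite
logSub <- log10(Substract + 1)

# Compare the spread before and after the transformation
print(summary(Substract))
print(summary(logSub))
```

The transformed column can then be mapped to the fill colour instead of the raw values.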

But what do you really want to show to the viewer? The things I can think of are:

  1. How many interactions are observed at which genomic position
  2. Are there genomic hotspots with lots of interactions
  3. Which genomic locations correlate (read: if genomic position A is interacting, genomic position B is often interacting too)

In any case, these questions are very hard to answer with the tile plot you are trying to make.

In case of 1 and 2, make a karyogram overview of the distribution (histogram) of interactions along the chromosomes, using geom_histogram() + facet_wrap(~chr,...)
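Something along these lines, as a sketch only. The chr column, the bin positions and the binwidth are made up for illustration:

```r
library(ggplot2)

# Hypothetical interactions: one row per interaction, with its chromosome
# and the start position of the bin it falls into
set.seed(1)
data <- data.frame(
  chr = "chr1",
  partner1 = sample(seq(0, 145000, by = 1000), 5000, replace = TRUE)
)

# Histogram of interaction counts along the chromosome, one panel per chromosome
pl_hist <- ggplot(data, aes(x = partner1)) +
  geom_histogram(binwidth = 5000) +
  facet_wrap(~chr, ncol = 1) +
  labs(x = "genomic position (bin start)", y = "number of interactions")

print(pl_hist)
```

With a single chromosome the facet_wrap() call just produces one panel, so it can be dropped.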

In case of 3, compute the pairwise correlation coefficients between all positions and make a heatmap of them (see: How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?).
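For the correlation idea, a small base-R sketch. The data are random, the bin-by-bin matrix layout is an assumption about how the interactions are stored, and the colour palette is arbitrary:

```r
# Hypothetical interaction table: one value per pair of bins, stored once
set.seed(1)
bins <- seq(0, 19000, by = 1000)
ints <- subset(expand.grid(partner1 = bins, partner2 = bins),
               partner1 <= partner2)
ints$Substract <- rexp(nrow(ints), rate = 0.1)

# Fill a bin-by-bin matrix and mirror it so each bin has a full profile
n <- length(bins)
m <- matrix(0, n, n, dimnames = list(bins, bins))
m[cbind(match(ints$partner1, bins), match(ints$partner2, bins))] <- ints$Substract
sym <- m + t(m)
diag(sym) <- diag(m)

# Pairwise correlation between the interaction profiles of all bins
cm <- cor(sym)
heatmap(cm, symm = TRUE,
        col = colorRampPalette(c("blue", "white", "red"))(20))
```

Bins with a constant profile would give NA correlations and need to be removed before clustering.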


These are not really continuous genomic positions, but bins of 1000 (or 5000) positions summarized into one value, and I have only one chromosome. I thought a heatmap would be better because of the clearer (coloured) overview. (Is there a way to draw a histogram with different colours for specific value ranges?) I will try the histogram. BTW, by karyogram, do you mean a GRanges object?

10.0 years ago

When making high-density scatterplots, consider hexbin() or sunflowerplot() as alternatives. You could also use levelplot() from the lattice package.
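A sketch of the levelplot() route, which takes a third variable for the colour. The data are randomly generated, the column names are borrowed from the question, and the number of cuts and the palette are arbitrary:

```r
library(lattice)

# Hypothetical stand-in for the real data (same column names as in the question)
set.seed(1)
bins <- seq(0, 49000, by = 1000)
data <- subset(expand.grid(partner1 = bins, partner2 = bins),
               partner1 <= partner2)
data$Substract <- rexp(nrow(data), rate = 0.1)

# z ~ x * y: the third column sets the colour of each cell
pl3 <- levelplot(
  Substract ~ partner2 * partner1, data = data,
  cuts = 19,  # gives roughly 20 colour levels
  col.regions = colorRampPalette(c("blue", "green"))(20)
)

print(pl3)
```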


Thanks for the tips, but I don't think sunflowerplot() and hexbin() are the right plot options for me. That said, I find the sunflowerplot very nice; it would be great if it could be adapted to my needs.

I have three columns to plot: one for the x-axis, one for the y-axis, and a third holding the values that are shown in the plot itself. The sunflowerplot only compares two columns with each other.
