Question: How to plot a heatmap of gene expression for a very large data set?
3
4.3 years ago by
jack720
Germany
jack720 wrote:

Hi,

I have a gene expression matrix from NGS data. It's around 30000 genes and 1000 samples.

I want to create a heatmap of this gene expression matrix. I used the heatmap() function in R, but it does not work for data this large. Could someone recommend a package for creating a heatmap of a big data matrix?

bioinformatics rna-seq next-gen R • 6.1k views
modified 2.2 years ago by Guangchuang Yu2.0k • written 4.3 years ago by jack720
2

Well you're going to want to subset that anyway, since a 30000x1000 heatmap won't be very interpretable.

written 4.3 years ago by Devon Ryan85k
2

Like Devon Ryan said, it won't make sense to create a heatmap that large. Why don't you find the differentially expressed genes and create a heatmap of those genes instead?

written 4.3 years ago by komal.rathi3.2k

I want to have a global view of the expression landscape of my genes. That's why I want to look at it this way.

written 4.3 years ago by jack720
7

I still don't think it's a good idea, but if you really want to, you can use the R package pheatmap to create and plot clusters of similarly expressed genes. Instead of plotting 30000 genes, you plot some number of clusters (25, 50, 100 or more) of similarly expressed genes by passing a value to the kmeans_k parameter of the pheatmap function. To cluster rows, use cluster_rows=T; to cluster columns, use cluster_cols=T (you may want to do both because of the large dataset).

You can cluster both the rows and the columns using either a precomputed distance matrix or a named distance measure such as "euclidean" or "correlation" (the clustering_distance_rows and clustering_distance_cols parameters).
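A minimal sketch of the approach described above, assuming the pheatmap package is installed. The matrix here is simulated and deliberately smaller than 30000 x 1000 so the example runs quickly; the object names (expr, res, cluster_of_gene) and the choice of 25 clusters are illustrative only.

```r
library(pheatmap)  # assumption: pheatmap is installed from CRAN

set.seed(42)
# Simulated stand-in for a real expression matrix:
# genes in rows, samples in columns.
expr <- matrix(rnorm(2000 * 100), nrow = 2000, ncol = 100)
rownames(expr) <- paste0("gene", seq_len(nrow(expr)))
colnames(expr) <- paste0("sample", seq_len(ncol(expr)))

# kmeans_k collapses the rows into k clusters before drawing,
# so the heatmap shows 25 cluster centers instead of 2000 genes.
res <- pheatmap(
  expr,
  kmeans_k = 25,
  cluster_rows = TRUE,
  cluster_cols = TRUE,
  clustering_distance_rows = "euclidean",
  clustering_distance_cols = "correlation",
  show_colnames = FALSE
)

# The k-means assignment of each gene is kept in the returned object.
cluster_of_gene <- res$kmeans$cluster
table(cluster_of_gene)
```

With real data you would replace expr with your own matrix and probably try several values of kmeans_k to see how stable the picture is.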

modified 4.3 years ago • written 4.3 years ago by komal.rathi3.2k

The pheatmap package is new to me, thanks for pointing it out!

written 4.3 years ago by Devon Ryan85k

I used the package once. It's a very powerful package, and you can do a lot of things with it, provided you read the manual thoroughly. It makes "pretty heatmaps" of your ugly data, hence the name.

written 4.3 years ago by komal.rathi3.2k
1

Your global view won't be changed by subsetting a bit.

written 4.3 years ago by Devon Ryan85k

Then what is a reasonable subset size?

written 4.3 years ago by jack720
3

Try 1000 or so genes and 100 samples, then increase that a bit to see if there are any large changes. If there aren't, then you're catching the gist of the global structure in your subset.
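One way to try this, sketched in base R. The random sampling and the helper name subset_matrix are assumptions for illustration (the suggestion above does not say how to pick the subset), and the matrix is simulated and smaller than the real one.

```r
set.seed(1)
# Simulated stand-in for the full expression matrix
# (genes in rows, samples in columns).
expr <- matrix(rnorm(5000 * 300), nrow = 5000, ncol = 300)

# Hypothetical helper: draw a random subset of genes and samples.
subset_matrix <- function(mat, n_genes, n_samples) {
  rows <- sample(nrow(mat), min(n_genes, nrow(mat)))
  cols <- sample(ncol(mat), min(n_samples, ncol(mat)))
  mat[rows, cols, drop = FALSE]
}

small  <- subset_matrix(expr, 1000, 100)
larger <- subset_matrix(expr, 2000, 200)

# Draw both and compare the overall structure by eye.
# Empty labels avoid overplotting thousands of row names.
heatmap(small,  labRow = rep("", nrow(small)),  labCol = rep("", ncol(small)))
heatmap(larger, labRow = rep("", nrow(larger)), labCol = rep("", ncol(larger)))
```

If the two plots show the same broad block structure, the smaller subset is probably enough for a global view.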

written 4.3 years ago by Devon Ryan85k
1
4.1 years ago by
University of Liverpool, UK
Chrispin Chaguza230 wrote:

pheatmap and the heatmap.2 function from the gplots package could be useful for this task. If these don't work, an alternative would be to write your own script to draw the heatmap using the reportlab graphics module in Python (or any other graphics module available).

written 4.1 years ago by Chrispin Chaguza230
1
2.2 years ago by
Guangchuang Yu2.0k
China/Hong Kong/The University of Hong Kong
Guangchuang Yu2.0k wrote:
> ?image
image                 package:graphics                 R Documentation

Display a Color Image

Description:

     Creates a grid of colored or gray-scale rectangles with colors
     corresponding to the values in ‘z’.  This can be used to display
     three-dimensional or spatial data aka _images_.  This is a generic
     function.

     The functions ‘heat.colors’, ‘terrain.colors’ and ‘topo.colors’
     create heat-spectrum (red to white) and topographical color
     schemes suitable for displaying ordered data, with ‘n’ giving the
     number of colors desired.

Usage:

     image(x, ...)

     ## Default S3 method:
     image(x, y, z, zlim, xlim, ylim, col = heat.colors(12),
           add = FALSE, xaxs = "i", yaxs = "i", xlab, ylab,
           breaks, oldstyle = FALSE, useRaster, ...)

image() is probably the fastest base R command for displaying a heatmap, since it only draws the matrix and does no clustering.
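A minimal sketch of calling image() on an expression matrix, using only base R. The matrix is simulated and the variable names are illustrative; useRaster = TRUE is one of the arguments in the usage shown above and tends to help with large regular grids.

```r
set.seed(1)
# Simulated stand-in: 3000 genes (rows) x 200 samples (columns).
expr <- matrix(rnorm(3000 * 200), nrow = 3000, ncol = 200)

# image() expects z with one row per x value and one column per y value,
# so transpose to put samples on the x axis and genes on the y axis.
z <- t(expr)
x <- seq_len(nrow(z))  # sample index
y <- seq_len(ncol(z))  # gene index

image(x, y, z,
      col = heat.colors(12),
      useRaster = TRUE,  # raster drawing; requires a regular grid
      xlab = "samples", ylab = "genes")
```

Note that image() draws rows and columns in the order given, so any ordering (for example by hierarchical clustering) has to be applied to the matrix beforehand.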

written 2.2 years ago by Guangchuang Yu2.0k