Question: How to plot a heatmap of gene expression for a very large data set?
3
4.1 years ago by
jack710
Germany
jack710 wrote:

Hi,

I have a gene expression matrix from NGS data, with around 30000 genes and 1000 samples.

I want to create a heatmap of this gene expression matrix. I used the heatmap() function in R, but it does not work for very large data. Could someone recommend a package for creating a heatmap of a big data matrix?

bioinformatics rna-seq next-gen R • 5.7k views
modified 2.0 years ago by Guangchuang Yu1.9k • written 4.1 years ago by jack710
2

Well you're going to want to subset that anyway, since a 30000x1000 heatmap won't be very interpretable.

written 4.1 years ago by Devon Ryan82k
2

Like Devon Ryan said, it won't make sense to create a heatmap that large. Why don't you find the differentially expressed genes and create a heatmap of those genes instead?

written 4.1 years ago by komal.rathi3.2k

I want a global view of the expression landscape of my genes. That's why I want to look at it this way.

written 4.1 years ago by jack710
7

I still don't think it's a good idea, but if you really want to, you can use the R package pheatmap to cluster and plot similarly expressed genes. Instead of plotting 30000 genes, you plot some number of clusters of similarly expressed genes (25, 50, 100 or more) by passing that number to the kmeans_k parameter of the pheatmap function. To cluster rows, use cluster_rows=TRUE; to cluster columns, use cluster_cols=TRUE (you may want both because of the large dataset).

You can cluster both the rows and the columns using either a precomputed distance matrix or a distance measure such as "euclidean" or "correlation" (via clustering_distance_rows and clustering_distance_cols).

modified 4.1 years ago • written 4.1 years ago by komal.rathi3.2k
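
A minimal sketch of the pheatmap approach described above, assuming the pheatmap package is installed. The matrix here is simulated and deliberately smaller than 30000 x 1000 so that it runs quickly; `expr` and `n_clusters` are illustrative names, and 50 clusters is an arbitrary starting value rather than a recommendation.

```r
# Simulated stand-in for a genes x samples expression matrix;
# replace with the real data.
set.seed(1)
expr <- matrix(
  rnorm(2000 * 100), nrow = 2000, ncol = 100,
  dimnames = list(paste0("gene", 1:2000), paste0("sample", 1:100))
)

n_clusters <- 50  # hypothetical choice; try 25, 50, 100, ...

# Only plot if pheatmap is available (assumption: installed from CRAN).
if (requireNamespace("pheatmap", quietly = TRUE)) {
  res <- pheatmap::pheatmap(
    expr,
    kmeans_k = n_clusters,                     # collapse rows into k-means clusters
    cluster_rows = TRUE,                       # cluster the cluster centers
    cluster_cols = TRUE,                       # cluster the samples
    clustering_distance_rows = "euclidean",
    clustering_distance_cols = "correlation",
    show_colnames = FALSE                      # too many samples to label
  )
}
```

Each row of the resulting heatmap is a k-means cluster center rather than a single gene, so the plot stays readable however many genes go in.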

The pheatmap package is new to me, thanks for pointing it out!

written 4.1 years ago by Devon Ryan82k

I used the package once. It's a very powerful package, and you can do a lot of things with it, provided you read the manual thoroughly. It makes "pretty heatmaps" of your ugly data, hence the name.

written 4.1 years ago by komal.rathi3.2k
1

Your global view won't be changed by subsetting a bit.

written 4.1 years ago by Devon Ryan82k

Then what is a reasonable subset size?

written 4.1 years ago by jack710
3

Try 1000 or so genes and 100 samples, then increase that a bit to see if there are any large changes. If there aren't, then you're catching the gist of the global structure in your subset.

written 4.1 years ago by Devon Ryan82k
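
One way to try the subsetting suggestion above in base R is sketched below. Picking genes and samples at random, and comparing sample-to-sample correlations between a smaller and a larger gene subset, are assumptions made for illustration; the thread does not say how to pick the subset or how to measure "large changes". The data are simulated, and `sample_cor` is a hypothetical helper.

```r
set.seed(42)
# Simulated stand-in for the full genes x samples matrix.
expr <- matrix(rnorm(5000 * 300), nrow = 5000)

# Hypothetical helper: sample-by-sample correlation from chosen genes.
sample_cor <- function(mat, genes, samples) {
  cor(mat[genes, samples, drop = FALSE])
}

samples     <- sample(ncol(expr), 100)    # 100 random samples
genes_small <- sample(nrow(expr), 1000)   # 1000 random genes
# Grow the gene subset by another 1000 genes not already chosen.
remaining   <- setdiff(seq_len(nrow(expr)), genes_small)
genes_large <- union(genes_small, sample(remaining, 1000))

cor_small <- sample_cor(expr, genes_small, samples)
cor_large <- sample_cor(expr, genes_large, samples)

# How similar is the sample structure seen by the two subsets?
ut <- upper.tri(cor_small)
agreement <- cor(cor_small[ut], cor_large[ut])
print(agreement)

# Heatmap of the smaller subset; labels suppressed for readability.
heatmap(expr[genes_small, samples], labRow = NA, labCol = NA)
```

If `agreement` stays high as the subset grows, the smaller heatmap is probably already showing the main structure.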
1
3.9 years ago by
University of Liverpool, UK
Chrispin Chaguza220 wrote:

The pheatmap function (pheatmap package) and the heatmap.2 function (gplots package) in R could be useful for this task. If these don't work, an alternative would be to write your own script to draw the heatmap using the reportlab graphics module in Python (or any other available graphics module).

written 3.9 years ago by Chrispin Chaguza220
1
2.0 years ago by
Guangchuang Yu1.9k
China/Hong Kong/The University of Hong Kong
Guangchuang Yu1.9k wrote:
> ?image
image                 package:graphics                 R Documentation

Display a Color Image

Description:

     Creates a grid of colored or gray-scale rectangles with colors
     corresponding to the values in ‘z’.  This can be used to display
     three-dimensional or spatial data aka _images_.  This is a generic
     function.

     The functions ‘heat.colors’, ‘terrain.colors’ and ‘topo.colors’
     create heat-spectrum (red to white) and topographical color
     schemes suitable for displaying ordered data, with ‘n’ giving the
     number of colors desired.

Usage:

     image(x, ...)

     ## Default S3 method:
     image(x, y, z, zlim, xlim, ylim, col = heat.colors(12),
           add = FALSE, xaxs = "i", yaxs = "i", xlab, ylab,
           breaks, oldstyle = FALSE, useRaster, ...)

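image() is one of the fastest ways to display a heatmap in base R: it draws the matrix directly, without clustering or dendrograms. For a very large matrix, setting useRaster = TRUE can speed up rendering further.

written 2.0 years ago by Guangchuang Yu1.9k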
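
A short sketch of the image() approach on a simulated matrix. Ordering the genes by mean expression is an assumption added here as a cheap alternative to clustering, so the plot shows some structure; it is not part of the original answer.

```r
set.seed(7)
# Simulated stand-in for a genes x samples expression matrix.
expr <- matrix(rnorm(3000 * 200), nrow = 3000)

# Cheap ordering of genes (assumption: by mean expression, no clustering).
row_order <- order(rowMeans(expr))
z <- expr[row_order, ]

# image() expects z with one row per x value, so transpose:
# x = samples, y = genes.
image(
  x = seq_len(ncol(z)), y = seq_len(nrow(z)), z = t(z),
  col = heat.colors(12),
  useRaster = TRUE,   # draw as a bitmap; needs a regular grid
  xlab = "Samples", ylab = "Genes (ordered by mean expression)"
)
```

Because no distance matrix or dendrogram is computed, the cost grows only with the number of cells in the matrix.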