Visualization Of Data: Multiple 2D Matrices
13.1 years ago
jvijai ★ 1.2k


Imagine that my data is a set of genes with their expression values for a disease. However, my disease can be further subcategorized into 3-4 subgroups based on its severity/progression.

[or if you are more comfortable, you can imagine the data is from several different GWAS and you are looking for the top 1000 SNPs and looking for the recurrence and magnitude of the effects]

I want to plot this data in some fashion where the idea of the graph (plot) is to show that there is (or isn't) an overlap of genes between the different subgroups. I also want to show the magnitude of the difference in gene expression when an overlap occurs.

Please advise what kind of plot(s) would be most suitable. I looked at and have not found so far, something that can deal with multiple arrays in a distinct fashion. If the data needs to be re-distilled, what should be done?

visualization matrix r • 3.1k views
13.1 years ago

How about a standard hierarchically clustered heatmap?

  • genes on one axis
  • arrays on the other
  • expression values as colors
  • branch length for diseases/subtypes

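edit: R code would look something like this:

# placeholder: substitute your own numeric matrix (genes in rows, arrays in columns)
my.topexprs <- get_expression_set_to_plot_here
# assumption: label rows with the matrix row names (gene IDs)
my.labels <- rownames(my.topexprs)
my.dist <- function(x) dist(x, method="euclidean")
# note: the "ward" method was renamed "ward.D" in R >= 3.1.0
my.hclust <- function(d) hclust(d, method="ward.D")
heatmap(my.topexprs, distfun=my.dist, hclustfun=my.hclust, labRow=my.labels,
    main="Heatmap of 62 samples vs 43 class-discriminant genes")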

You could also use the function Philippe suggested.
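
On the "re-distilled" part of the question, here is a minimal sketch of one way to get from several per-subgroup top-gene lists to a single matrix that a heatmap can take. Everything in it is an assumption for illustration: the data is simulated, and the names `top.lists`, `effect.mat`, `n.hits` and `shared` are made up. Each subgroup contributes a named vector of effects (say, log fold change versus control) for its top genes; merging on the union of genes gives one column per subgroup, with NA where a gene did not make that subgroup's list.

```r
set.seed(1)
all.genes <- paste0("gene", 1:40)
subgroups <- c("early", "middle", "late")

# hypothetical input: for each subgroup, a named vector of effects for its top 15 genes
top.lists <- lapply(subgroups, function(s) {
  picked <- sample(all.genes, 15)
  setNames(rnorm(15, sd = 2), picked)
})
names(top.lists) <- subgroups

# union of genes across subgroups, then one column per subgroup (NA = not in that list)
union.genes <- sort(unique(unlist(lapply(top.lists, names))))
effect.mat <- sapply(top.lists, function(v) v[union.genes])
rownames(effect.mat) <- union.genes

# number of subgroups in which each gene appears
n.hits <- rowSums(!is.na(effect.mat))
print(table(n.hits))

# keep genes seen in at least two subgroups; plotted without clustering,
# because dist() on rows with many NAs can itself return NA
shared <- effect.mat[n.hits >= 2, , drop = FALSE]
heatmap(shared, Rowv = NA, Colv = NA, scale = "none", na.rm = TRUE,
        main = "Effects for genes shared by >= 2 subgroups")
```

The `table(n.hits)` output answers the overlap question directly (how many genes recur in 1, 2 or 3 subgroups), and the colors in the heatmap carry the magnitude. If you do want the dendrograms, you would first need to decide how to treat the missing values.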


Thank you, that may work. Is there a specific package that will accomplish this?


What Michael suggested is generally called a heatmap or image plot. There are some default functions in R (image or heatmap), but people generally use the heatmap.2 function from the gplots package. Heatmaps are a classic and efficient way to plot three-dimensional data (two axes plus a color-coded value). The clustering, which can be done directly through heatmap.2 with the Rowv and Colv arguments, can add information by grouping together subtypes with similar patterns of gene expression.
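
For reference, a minimal sketch of what such a call could look like. The data is simulated and the names `expr`, `subtype`, `shift` and `hm` are made up for the example; it assumes gplots is installed and falls back to base `heatmap()` if it is not.

```r
set.seed(42)
# simulated data: 30 genes (rows) x 12 arrays (columns), 3 arrays per subgroup
subtype <- rep(c("control", "early", "middle", "late"), each = 3)
expr <- matrix(rnorm(30 * 12), nrow = 30,
               dimnames = list(paste0("gene", 1:30),
                               paste(subtype, 1:3, sep = ".")))
# shift the first 10 genes upward with severity so there is a pattern to cluster
shift <- c(control = 0, early = 1, middle = 2, late = 3)
expr[1:10, ] <- sweep(expr[1:10, ], 2, shift[subtype], "+")

if (requireNamespace("gplots", quietly = TRUE)) {
  # Rowv/Colv = TRUE reorder rows/columns by hierarchical clustering
  hm <- gplots::heatmap.2(expr, Rowv = TRUE, Colv = TRUE, dendrogram = "both",
                          scale = "row", trace = "none")
} else {
  # base R fallback; heatmap() clusters both axes by default
  hm <- heatmap(expr, scale = "row")
}
```

Setting `Rowv = FALSE` or `Colv = FALSE` keeps that axis in its original order, which can be handy if you would rather keep the arrays grouped by known subtype than let the clustering decide (in that case set `dendrogram` to match, e.g. `"row"`).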


You're right, I didn't put the actual plot name in there. Fixed.

13.1 years ago
David W 4.9k

Have you looked at (the very spiffy) ggplot2 library, in particular facet_grid()?

Here's an example of something you might consider as a starting point:

library(ggplot2)

#fake data for 25 genes from 20 patients in 4 subgroups (i.e. 500 datapoints)
patient.ID <- rep(1:20, each=25)
patient.type <- rep(c("control", "early", "middle", "late"), each=125)
gene <- as.factor(rep(letters[1:25], 20))
expression <- rnorm(500)
dat <- data.frame(patient.ID=patient.ID, patient.type=patient.type,
                  gene=gene, expression=expression)

p <- ggplot(dat, aes(gene, expression))
p + geom_point() + facet_grid(patient.type ~ .)


I don't know exactly what you want to do (maybe a line chart or a bar chart would be a better fit?), but hopefully that's a helpful start.
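
For the line-chart idea, a rough sketch under the same kind of assumptions: the data is simulated, and the names `stages`, `dat` and `means` are made up for the example. It averages each gene within each subgroup and draws one line per gene across the subgroups, ordered by severity.

```r
library(ggplot2)

set.seed(7)
stages <- c("control", "early", "middle", "late")
# simulated long-format data: 25 genes x 20 patients, 5 patients per subgroup
dat <- data.frame(
  patient.type = factor(rep(stages, each = 125), levels = stages),
  gene = factor(rep(letters[1:25], 20)),
  expression = rnorm(500)
)

# mean expression of each gene within each subgroup
means <- aggregate(expression ~ gene + patient.type, data = dat, FUN = mean)

# one line per gene, subgroups ordered by severity along the x axis
p <- ggplot(means, aes(patient.type, expression, group = gene)) +
  geom_line(alpha = 0.5) +
  geom_point()
print(p)
```

Making `patient.type` a factor with explicit levels is what keeps the x axis in severity order; otherwise ggplot2 would sort the subgroups alphabetically.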

