Question

Best values to use for a heatmap comparing conditions comprised of biological replicates?

0


9.8 years ago

kelgalla • 0

I have some RNA-Seq data, and for the samples I have counts, normalized counts (RPKM), and variance-stabilized counts from DESeq. I have used DESeq on the various conditions to identify differentially expressed genes, so I also have the baseMean for each condition. It seems you can create a heatmap of the samples using counts, normalized counts, or variance-stabilized counts, though the variance-stabilized counts are likely the best.

However, if I wanted to make a heatmap where I summarize the samples into their conditions (treat all biological replicates as one condition), then what are the best values to use? Do I use the baseMean, the mean of the variance stabilized counts, or the mean of the RPKM values?
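To make the "mean of the variance stabilized counts" option concrete, here is a minimal sketch of collapsing replicate columns into one value per condition. The sample names, gene names, and numbers are hypothetical, and the values are assumed to be already transformed (for example variance-stabilized) before averaging.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-transformed expression values:
# one row per gene, one column per sample (same order as `samples`).
samples = ["ctrl_1", "ctrl_2", "treatA_1", "treatA_2"]
condition_of = {
    "ctrl_1": "ctrl", "ctrl_2": "ctrl",
    "treatA_1": "treatA", "treatA_2": "treatA",
}
expr = {
    "geneX": [5.0, 5.4, 8.0, 8.2],
    "geneY": [9.0, 9.2, 6.0, 6.4],
}

def condition_means(expr, samples, condition_of):
    """Average replicate columns so each gene has one value per condition."""
    out = {}
    for gene, values in expr.items():
        groups = defaultdict(list)
        for sample, value in zip(samples, values):
            groups[condition_of[sample]].append(value)
        out[gene] = {cond: mean(vals) for cond, vals in groups.items()}
    return out

summary = condition_means(expr, samples, condition_of)
for gene, per_condition in summary.items():
    print(gene, per_condition)
```

The resulting gene-by-condition table is what would be passed to a heatmap function in place of the gene-by-sample table.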

Because one of the conditions is a control, and all other conditions are being compared to this same control, could I instead use the fold change, the log2 fold change, or the modified log2 fold changes described by DESeq?

Clustering Heatmap RNASeq DESeq • 7.5k views

updated 2.3 years ago by Ram 43k • written 9.8 years ago by kelgalla • 0

Ram · Answer 1 · 2014-06-19

0


9.8 years ago

Michael 54k

You could do a multidimensional scaling (MDS) plot on all samples and compare all the different methods, e.g. using the plotMDS function (limma, edgeR). Which MDS plot conveys the most biological relevance in its sample pairings? I bet on log2 fold change or log2 library-size-normalized abundances (CPM). In my experience, taking the log is essential.
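As a rough illustration of what "log2 library-size normalized abundances (CPM)" means, the sketch below scales each sample's counts to counts per million and takes log2. The counts are made up, and the pseudocount of 1 is an assumption added here to avoid log2(0); it is not necessarily the formula edgeR or limma use internally.

```python
import math

def log2_cpm(counts_by_sample, pseudocount=1.0):
    """Return log2(CPM + pseudocount) per sample.

    counts_by_sample maps a sample name to its raw counts (same gene order
    in every sample). The pseudocount is an assumption of this sketch.
    """
    result = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts)
        result[sample] = [
            math.log2(c / library_size * 1e6 + pseudocount) for c in counts
        ]
    return result

# Hypothetical raw counts for three genes in two samples.
raw = {
    "ctrl_1": [100, 900, 0],
    "treat_1": [400, 1600, 0],
}
logged = log2_cpm(raw)
for sample, values in logged.items():
    print(sample, [round(v, 3) for v in values])
```

Because each sample is divided by its own library size, samples sequenced to different depths become comparable before any clustering or MDS step.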

updated 2.5 years ago by Ram 43k • written 9.8 years ago by Michael 54k

score 0 · Answer 2 · 2014-06-19

You can use any value (counts normalized to library size, variance-stabilized counts, RPKM, log2 fold change, etc.) to make a heatmap; a heatmap just uses color to represent numbers. For control versus treated samples, you could select the genes that are significantly changed (up or down, based on adjusted p-value), compute a Pearson correlation distance matrix, and draw the heatmap with heatmap.2.
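The selection and distance steps described above can be sketched as follows. The gene table, the 0.05 cutoff, and the use of 1 - r as the distance are assumptions made for illustration; in R the resulting matrix would then be handed to heatmap.2 for clustering and drawing.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance_matrix(rows):
    """Pairwise distance 1 - r between all rows; 0 = same profile, 2 = opposite."""
    names = list(rows)
    return {a: {b: 1.0 - pearson(rows[a], rows[b]) for b in names} for a in names}

# Hypothetical results table: adjusted p-value plus expression across 4 samples.
genes = {
    "g1": {"padj": 0.01, "values": [1.0, 2.0, 3.0, 4.0]},
    "g2": {"padj": 0.03, "values": [2.0, 4.0, 6.0, 8.0]},
    "g3": {"padj": 0.50, "values": [4.0, 3.0, 2.0, 1.0]},
    "g4": {"padj": 0.001, "values": [4.0, 3.0, 2.0, 1.0]},
}

# Keep only significantly changed genes (cutoff is an assumption).
significant = {g: d["values"] for g, d in genes.items() if d["padj"] < 0.05}
dist = correlation_distance_matrix(significant)
for gene, row in dist.items():
    print(gene, {other: round(d, 3) for other, d in row.items()})
```

Correlation distance groups genes by the shape of their profile across samples rather than by absolute level, which is often what a heatmap of differentially expressed genes is meant to show.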

Ram · Answer 3 · 2014-08-07

To show the results in Excel, you can easily plot your RT-qPCR against your RNA-Seq results using log2FoldChange. I think this value is better because the foldChange that DESeq reports is always greater than 0: values between 0 and 1 mean underexpression, and you would have to convert them to negative values yourself, so foldChange cannot show reduced transcript abundance as negative values.
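A small worked example of the point above, using made-up mean expression values: the plain fold change squeezes all down-regulation into the interval between 0 and 1, while the log2 fold change is symmetric around 0.

```python
import math

def fold_change(treated_mean, control_mean):
    """Plain ratio: always positive, down-regulation falls between 0 and 1."""
    return treated_mean / control_mean

def log2_fold_change(treated_mean, control_mean):
    """Log2 ratio: up-regulation is positive, down-regulation is negative."""
    return math.log2(treated_mean / control_mean)

# Hypothetical (treated, control) mean expression pairs.
doubled = (200.0, 100.0)
halved = (50.0, 100.0)

for label, (treated, control) in [("doubled", doubled), ("halved", halved)]:
    print(label, fold_change(treated, control), log2_fold_change(treated, control))
```

A doubling and a halving are equally large changes on the log2 scale, which is why log2 fold changes plot cleanly on either side of zero.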

For a heatmap you can use MapMan/PageMan with whichever organism you are working on. Just BLAST your sequences against a closely related organism to obtain the corresponding gene IDs (e.g. for plants, use Arabidopsis or Oryza sativa).

For heatmaps there is also heatmap3, which was published recently: http://www.hindawi.com/journals/bmri/2014/986048/

cran.r-project.org/package=heatmap3

Best regards