Question: same experiment, different values with my heat map. Any help?
Mozart wrote (24 days ago):

Hi there, I am struggling to wrap my head around something I may have forgotten. Essentially, the rlog values (rld) that I use for my heat map change depending on how many samples I include, and I am wondering why. Since I am not sure I have explained myself clearly, let me rephrase:

I want to generate two heat maps: the first from the main comparison I am interested in (6 samples), the second from all samples in my dataset (the same 6 samples plus 2 more).

In both cases I run:

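library(DESeq2)       # also attaches SummarizedExperiment, which provides assay()
library(matrixStats)  # rowVars()
dds <- DESeqDataSetFromTximport(txi.kallisto.tsv, colData = table, design = ~condition)
dds <- DESeq(dds)
rld <- rlog(dds, blind = FALSE)
# row indices of the 100 genes with the highest variance across samples
top_genes <- head(order(rowVars(assay(rld)), decreasing = TRUE), 100)
mat <- assay(rld)[top_genes, ]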

I obtain different values for the same genes in the two cases. Is this because the regularised logarithm transformation depends on the number of samples in the dataset?

thanks

Tags: heatmap, rna-seq

ATpoint (Germany) wrote (24 days ago):

This is normal and expected, since the normalization factors and the model fit change when you add or remove samples. If you want values that do not depend on which samples are included, you could use something like log2(FPKM+1), which for visualization alone is probably accurate enough. What do you want to show with the heatmaps?
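
To make the point about normalization factors concrete, here is a minimal base-R sketch on simulated counts. It assumes a median-of-ratios size factor (the scheme DESeq2 uses by default), written out by hand rather than taken from DESeq2's code. The helpers `size_factors` and `log_cpm`, the sample depths, and the count matrix are all made up for illustration, and CPM stands in for FPKM here (FPKM additionally divides by gene length, which is a per-gene constant).

```r
# Median-of-ratios size factors in base R (a sketch of the idea, not DESeq2's code)
size_factors <- function(counts) {
  log_counts <- log(counts)
  log_geo_mean <- rowMeans(log_counts)   # per-gene log geometric mean across samples
  keep <- is.finite(log_geo_mean)        # genes with a zero in any sample are skipped
  apply(log_counts[keep, , drop = FALSE], 2,
        function(x) exp(median(x - log_geo_mean[keep])))
}

# Simulated data: 200 genes, 8 samples with different (hypothetical) sequencing depths
set.seed(1)
depth <- c(1, 1.2, 0.8, 1.1, 0.9, 1, 4, 0.5)
base_mean <- rep(c(5, 50, 500, 5000), each = 50)
counts8 <- sapply(depth, function(d) rpois(length(base_mean), base_mean * d))
colnames(counts8) <- paste0("s", 1:8)
counts6 <- counts8[, 1:6]                # the "main comparison" subset

# Size factors for the same six samples, estimated with and without the two extra samples
sf6 <- size_factors(counts6)
sf8 <- size_factors(counts8)
print(rbind(six_samples = sf6, eight_samples = sf8[1:6]))

# Normalized log values for the shared samples therefore differ as well
norm6 <- log2(sweep(counts6, 2, sf6, "/") + 1)
norm8 <- log2(sweep(counts8, 2, sf8, "/") + 1)[, 1:6]
max_shift <- max(abs(norm6 - norm8))

# A purely per-sample transformation does not depend on which other samples are present
log_cpm <- function(counts) log2(sweep(counts, 2, colSums(counts), "/") * 1e6 + 1)
same_cpm <- all.equal(log_cpm(counts6), log_cpm(counts8)[, 1:6])
```

The size factors are defined relative to a per-gene reference built from all samples, so changing the sample set moves that reference. rlog additionally fits a model across samples, which is a second source of dependence that this sketch does not cover.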


Thanks for the quick reply. I've always had this feeling! I just want to show the top variable genes in my dataset...that's it.

Another question: should I stick with the same transformation (either vst or rlog) for all of the plots in my experiment, or can I change the method from plot to plot (e.g. rlog for the PCA and vst for the heat map)? Thanks!

written 24 days ago by Mozart

I would not switch around, as there should be consistency. Use whichever you prefer (or vst if you have many samples and rlog is too slow), but do not mix them at will, as they behave quite differently, especially for variable genes with low counts.
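
Low-count genes are where the choice of transformation matters most. As a rough illustration, the sketch below uses plain log2(count + 1) on simulated Poisson counts (not rlog or vst themselves). The gene means and sample number are hypothetical, and no gene has any true difference between samples.

```r
set.seed(42)
n_samples <- 8
gene_means <- rep(c(3, 30, 300, 3000), each = 250)   # hypothetical expression levels

# Every sample is drawn from the same Poisson mean per gene, so all variation is noise
counts <- matrix(rpois(length(gene_means) * n_samples, gene_means),
                 nrow = length(gene_means))

log_counts <- log2(counts + 1)
row_var <- apply(log_counts, 1, var)

# Which expression levels do the 100 "most variable" genes come from?
top <- head(order(row_var, decreasing = TRUE), 100)
print(table(gene_means[top]))
```

Under a simple log the low-count genes dominate a top-variance ranking purely through sampling noise. rlog and vst are both designed to dampen this, but they do so in different ways, which is one reason not to mix them between plots.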

Alternatively, what I personally find more meaningful is to show only the genes that are significantly different, as high variability often comes from the mean-variance dependency for low-count genes. You could show the z-scored log2FCs for those with padj < 0.05. Still, if you prefer counts, do not mix methods and be consistent.
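
One possible reading of this suggestion is to keep only the genes with padj < 0.05 and z-score each gene's transformed values across samples before plotting. The sketch below does that in base R. The objects `mat` and `padj` are simulated stand-ins (in a real analysis they would presumably come from something like `assay(rld)` and the `padj` column of `results(dds)`), so the numbers mean nothing.

```r
set.seed(7)
# Hypothetical stand-ins for a transformed expression matrix and adjusted p-values
mat <- matrix(rnorm(20 * 6, mean = 8), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))
padj <- runif(20)
padj[c(1, 5, 9)] <- c(0.001, 0.01, 0.04)   # make sure a few genes pass the cutoff
padj[3] <- NA                              # DESeq2 reports NA padj for filtered genes

# which() silently drops NA, so filtered genes are excluded rather than causing errors
sig <- which(padj < 0.05)
mat_sig <- mat[sig, , drop = FALSE]

# Row-wise z-score: scale() works on columns, so transpose, scale, transpose back
z <- t(scale(t(mat_sig)))
print(round(z, 2))
```

The resulting matrix `z` can be passed to whichever heat map function you already use. Each row then has mean 0 and standard deviation 1, so the colours show the pattern across samples rather than the absolute expression level.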

written 24 days ago by ATpoint

Thank you so much, this was a great help. I was facing the same issue.

written 24 days ago by stephannie.baker