Question: Help fixing my DGE heatmap?
Tania (130) wrote, 2.9 years ago:

Hi Everyone

I need your help fixing my heatmap. My top 10 differentially expressed genes are not the ones that show up in the heatmap. The genes that do show up are the first 10 rows of the sample count matrix, although they are not significant at all.
Below is a snapshot of my code, and my heatmap is here: https://ibb.co/gPKepm

library(edgeR)   # DGEList, cpm
library(gplots)  # colorpanel, heatmap.2

labels = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "C10", "T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9", "T10")
dge = DGEList(counts=cts, genes=rownames(cts), group=group)
# log2 counts per million
countsPerMillion <- cpm(dge, prior.count=2, log=TRUE)
CPM = countsPerMillion
colnames(CPM) = labels
# order by p-value and keep the first 10 rows
o <- order(etp$table$PValue)
CPM <- CPM[o[1:10],]
# scale each gene (row) to Z scores
CPM <- t(scale(t(CPM)))
col.pan <- colorpanel(50, "blue", "white", "red")
heatmap.2(CPM, col=col.pan, Rowv=TRUE, scale="none", trace="none", dendrogram="none", cexRow=0.2, cexCol=0.9, density.info="none", margin=c(10,9), lhei=c(2,10), lwid=c(2,6))

Thanks

heatmap rna-seq • 1.2k views
modified 2.9 years ago by Kevin Blighe (66k) • written 2.9 years ago by Tania (130)

Which numbers are the first ten elements of "o"? Did you try to map them to gene names? Which gene names do they correspond to? Are those the same genes that you see in the heatmap? I am not an expert, so it is hard to help more from the code alone.

written 2.9 years ago by Fabio Marroni (2.6k)

o contains 1 to 10, and those are the genes that show up in the heatmap. They are the first 10 genes in etp before ordering, and they are not significant. This is what I can't fix, so I guess the order in o is wrong.

modified 2.9 years ago • written 2.9 years ago by Tania (130)

When I changed o to:

o = order(etp$table$PValue, na.last=TRUE,decreasing= TRUE, method="radix")

I got another list of genes that are significant, but they are not the top ones either.

written 2.9 years ago by Tania (130)

Try without decreasing=TRUE (you want the smallest p-value first, not last).
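
As a quick check of that point, here is a minimal base R sketch with made-up p-values (the vector `p` and its gene names are hypothetical, not taken from the thread). `order()` sorts ascending by default, so its first indices already point at the smallest p-values:

```r
# Hypothetical p-values for five genes (illustrative only)
p <- c(geneA = 0.20, geneB = 0.001, geneC = 0.75, geneD = 0.03, geneE = 0.0004)

# Default is ascending, so the most significant genes come first
asc <- order(p)
top2 <- names(p)[asc[1:2]]

# decreasing = TRUE puts the least significant genes first instead
desc <- order(p, decreasing = TRUE)
bottom2 <- names(p)[desc[1:2]]

print(top2)
print(bottom2)
```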

written 2.9 years ago by Fabio Marroni (2.6k)

I tried it and it still picks the wrong order. It doesn't even pick from the most or least significant genes, which is why I thought of reversing; it picks some genes from the middle.

written 2.9 years ago by Tania (130)

Kevin Blighe (66k) wrote, 2.9 years ago:

This doesn't work because the gene/transcript indices in the etp object won't match those in your CPM object. You should instead subset CPM based on gene/transcript names.

Wherever you ran the topTags function, the genes should automatically be ordered by PValues there (lowest to highest). You can then just get the rownames of the resulting object and use those to subset CPM.

For example:

topGenes <- rownames(topTags(et, n=10))

CPM[topGenes,]
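
To see why positional indexing fails here, the following is a small base R sketch on toy data. `cpm_mat` and `res_table` are hypothetical stand-ins for the CPM matrix and for a results table that has already been sorted by p-value (as `topTags` output is); they are not the poster's actual objects, only an illustration under that assumption.

```r
# Toy log-CPM matrix: rows are genes in their original order (hypothetical data)
cpm_mat <- matrix(
  c(1, 2, 3, 4,
    5, 6, 7, 8,
    9, 10, 11, 12,
    13, 14, 15, 16),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3", "g4"), c("C1", "C2", "T1", "T2"))
)

# Toy results table, already sorted by p-value, so its row order
# differs from the row order of cpm_mat
res_table <- data.frame(
  PValue = c(0.001, 0.01, 0.2, 0.9),
  row.names = c("g3", "g1", "g4", "g2")
)

# Positional indexing: order() on an already sorted table returns 1, 2, ...
o <- order(res_table$PValue)
wrong <- rownames(cpm_mat[o[1:2], ])

# Name-based indexing: take the gene names from the results table instead
top_genes <- rownames(res_table)[o[1:2]]
stopifnot(all(top_genes %in% rownames(cpm_mat)))  # guard against missing names
right <- rownames(cpm_mat[top_genes, ])

print(wrong)  # the first rows of cpm_mat, regardless of significance
print(right)  # the two genes with the smallest p-values
```

The positional version simply returns the first rows of the matrix, which matches the symptom described in the question; the name-based version follows the genes themselves.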

Take a look at the tutorial here: RNA Sequence Analysis in R: edgeR

modified 2.9 years ago • written 2.9 years ago by Kevin Blighe (66k)

Thanks Kevin and Fabio. I think it works now, although it is kinda sparse (not sure if this is okay). [https://ibb.co/b0kbn6]

modified 2.9 years ago • written 2.9 years ago by Tania (130)

It looks like better scaling of the data would help. You may want to take some ideas from my various postings on heatmaps:

You already scale your data to Z scores but perhaps specifying breaks (like -1, +1) in addition would help.
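
One way to read the suggestion about breaks, sketched with base R only: clip the Z scores to a fixed range and build evenly spaced break points across it, so that a few extreme values do not wash out the rest of the color scale. The matrix `z`, the limit of 2, and the palette are assumptions for illustration. Note that the break vector needs one more element than the number of colors.

```r
set.seed(1)
# Hypothetical Z-score matrix: 10 genes x 6 samples
z <- matrix(rnorm(60), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("S", 1:6)))
z[1, 1] <- 8  # one extreme value that would dominate an automatic color scale

lim <- 2                               # assumed clipping limit
z_clipped <- pmin(pmax(z, -lim), lim)  # clip values to [-lim, lim]

n_col <- 50
pal <- colorRampPalette(c("blue", "white", "red"))(n_col)
brks <- seq(-lim, lim, length.out = n_col + 1)  # n_col + 1 break points

# If gplots is installed, a call along these lines would use them:
# gplots::heatmap.2(z_clipped, col = pal, breaks = brks,
#                   scale = "none", trace = "none")
```

The narrower the limit, the more contrast among moderate Z scores, at the cost of flattening the extremes into the end colors.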

written 2.9 years ago by Kevin Blighe (66k)

Great, thanks so much Kevin, very helpful :)

written 2.9 years ago by Tania (130)

Hi Tania. You are welcome

written 2.9 years ago by Kevin Blighe (66k)
