Heatmap from EdgeR results
3.2 years ago
shaden ▴ 20

Hi, I am doing a differential expression (DE) analysis of miRNAs with edgeR, and I need to make a heatmap of the top 50 DE miRNAs, or of the most variable ones.

The edgeR user guide suggests logcounts <- cpm(y, log=TRUE), where y is the DGEList object.

The problem is with labelling. I want the miRNA names to show on the heatmap, but the y object carries no names, only the count matrix, and I'm not sure how to annotate it with the miRNA names. Any help?

edgeR heatmaps DGEList CPM • 4.8k views

I cannot follow. By y, I guess you mean the DGEList object? Please explain the problem more clearly; ideally, show your code.


The output from your command should have row names containing the gene names, assuming you provided that information when you made the DGEList. What are you using to make the heatmap? Pretty much any heatmap package will have a parameter to show the row names.

library(edgeR)   # DGEList(), cpm()
library(dplyr)   # %>% and select()

Count_data <- read.csv("Count_data.csv", check.names=FALSE)


Counts <- Count_data
rownames(Counts) <- Count_data$names
counts_IDs <- Count_data
Counts_only <- Count_data %>% select(-names)  # create the table with only counts here

group <- c(rep("SH",4), rep("U2",4), rep("U7",4),rep("RU",4))

y <- DGEList(counts=Counts_only, genes = counts_IDs$names, group=group)

# model.matrix() has no 'genes' argument, so only the formula and sample data are passed
design_edgeR <- model.matrix(~0+group, data=y$samples)
colnames(design_edgeR) <- levels(y$samples$group)


#HEATMAPS:

logcounts <- cpm(y,log=TRUE)

var_genes <- apply(logcounts, 1, var)
head(var_genes)


# Get the gene names for the top 30 most variable genes
select_var <- names(sort(var_genes, decreasing=TRUE))[1:30]
head(select_var)

highly_variable_lcpm <- logcounts[select_var, ]
dim(highly_variable_lcpm)


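## Get some nicer colours
library(gplots)        # heatmap.2()
library(RColorBrewer)  # brewer.pal()
mypalette <- brewer.pal(11,"RdYlBu")
morecols <- colorRampPalette(mypalette)
# Set up colour vector for the group variable: one colour per group (four groups here).
# Indexing by factor(group) picks colours by level; indexing by the character vector would give NA.
col.cell <- c("purple","orange","darkgreen","steelblue")[factor(group)]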

# Plot the heatmap
heatmap.2(highly_variable_lcpm,col=rev(morecols(50)),trace="none", main="Top variable genes across conditions-Macrophages",ColSideColors=col.cell,scale="row",margins=c(9,9))

The resulting heatmap shows some random numbers as gene names instead of my gene names:

[Image: the resulting heatmap]

3.2 years ago
dganiewich ▴ 130

Hi,

Have you tried adding rownames to your matrix as row.names(y)<-gene_names_array?

Best,

Daiana
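
A minimal sketch of this suggestion, applied to the log-CPM matrix that gets plotted. The matrix values and the miRNA names below are made-up placeholders, and it assumes the name vector is in the same order as the rows of the matrix:

```r
# Toy log-CPM matrix with no row names (a stand-in for the cpm() output).
set.seed(1)
logcounts <- matrix(rnorm(12), nrow = 3,
                    dimnames = list(NULL, paste0("S", 1:4)))

# Hypothetical miRNA names, one per row and in the same order as the rows.
gene_names <- c("miR-21", "miR-155", "let-7a")

# Check that the lengths match before labelling, then attach the names.
stopifnot(length(gene_names) == nrow(logcounts))
rownames(logcounts) <- gene_names

print(rownames(logcounts))
```

Any subsetting done afterwards (for example picking the most variable rows) keeps these labels, so the heatmap function can pick them up.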


Oh that actually worked, thank you very much Daiana!!

3.2 years ago
Gordon Smyth ★ 7.0k

Just add whatever gene names you want the heatmap to show as row.names of logcounts. See the section "Heatmap clustering" of the edgeR QL workflow for a complete worked example. The workflow uses coolmap, but the same advice applies to any heatmap function. By default, the row.names of logcounts will be the gene IDs.
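
As a rough illustration of that advice, the sketch below selects the most variable rows of a named matrix and plots them. It uses base R's heatmap() rather than coolmap or heatmap.2 only to avoid extra packages; the data, the miRNA names, and the choice of five rows are all invented for the example:

```r
set.seed(42)
# Toy log-CPM matrix: 20 hypothetical miRNAs x 8 samples, named at creation.
logcounts <- matrix(rnorm(160), nrow = 20,
                    dimnames = list(paste0("miR-", 1:20), paste0("S", 1:8)))

# Row-wise variance; apply() keeps the row names on the result.
row_var <- apply(logcounts, 1, var)

# Names of the top_n most variable rows, then subset by name.
top_n <- 5
top_ids <- names(sort(row_var, decreasing = TRUE))[seq_len(top_n)]
top_lcpm <- logcounts[top_ids, , drop = FALSE]

# Row labels come from the row names; labRow is passed explicitly for clarity.
heatmap(top_lcpm, scale = "row", labRow = rownames(top_lcpm), margins = c(5, 8))
```

Because the subset is taken by name, the labels on the plot are whatever row names the matrix had before filtering.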


That's the problem: by default the row names of logcounts are not gene IDs, just the numbers 1, 2, 3, etc. So after a few processing steps I no longer know which genes are there; I just have a matrix of values.


The row.names are gene IDs by default. But if you don't supply any row names when the DGEList is created then the IDs will be set to 1, 2, 3 etc.
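
The difference can be seen in a small sketch like the one below. It assumes the Bioconductor package edgeR is installed; the counts, miRNA names, and group labels are invented for illustration:

```r
library(edgeR)  # assumes edgeR is installed from Bioconductor

# Toy counts: 4 hypothetical miRNAs x 4 samples, with no row names.
set.seed(1)
counts <- matrix(rpois(16, lambda = 50), nrow = 4,
                 dimnames = list(NULL, paste0("S", 1:4)))
mirna_ids <- c("miR-21", "miR-155", "let-7a", "miR-122")
group <- c("A", "A", "B", "B")

# No row names supplied: the IDs fall back to 1, 2, 3, ...
y_unnamed <- DGEList(counts = counts, group = group)
print(rownames(cpm(y_unnamed, log = TRUE)))

# Row names set on the count matrix before creating the DGEList:
# they carry through to the log-CPM matrix.
named_counts <- counts
rownames(named_counts) <- mirna_ids
y_named <- DGEList(counts = named_counts, group = group)
print(rownames(cpm(y_named, log = TRUE)))
```

In the code posted earlier in this thread, the row names were assigned to a copy of the table (Counts) rather than to the table passed to DGEList (Counts_only), which would explain the numeric labels.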
