Question

How to use heatmap.2 to plot the top 20 most abundant OTUs?

6.6 years ago

xiongyi25 • 0

Hi everyone, I am a total newbie at analyzing microbial communities. I am trying to plot a heatmap with heatmap.2. My data contains over 2000 OTUs for 40 samples, and I only want to show the top 20 most abundant OTUs in the heatmap (something similar to the figure from this).

Does anyone have an idea how to do that in R?

Thanks a lot.

sequence • 4.9k views

ADD COMMENT • link updated 6.6 years ago by Forever • 0 • written 6.6 years ago by xiongyi25 • 0

When Using Heatmap.2 From R To Make A Heatmap Of Microarray Data, How Are The Genes Clustered?
https://www.r-bloggers.com/from-otu-table-to-heatmap/
http://www.molecularecologist.com/2013/08/making-heatmaps-with-r-for-microbiome-analysis/

ADD REPLY • link 6.6 years ago by GenoMax 141k

Thanks a lot! Thanks a lot! (Finally over 20 characters..)

ADD REPLY • link 6.6 years ago by xiongyi25 • 0

When you say your data has 2000 OTUs, are they ranked by significance, fold change, or some other metric? If they are, the first 20 rows of your data frame should already be your top 20. If not, you need to rank them first, by significance or whatever metric you choose, so that your data frame (or file.txt) is ordered from the highest rank down. Then select the rows as @Forever suggested and plot the heatmap of just the top 20 rows.

ADD REPLY • link 6.6 years ago by ivivek_ngs ★ 5.2k

Thanks so much for your reply. I think the 2000 OTUs are ranked. But the thing is, I have over 40 samples which are from different experimental groups. Say, if I want to separate the 40 samples into 3 groups, the rank of the OTUs will definitely be different. If I separate the original OTU table into 3 tables, how can I rank them separately based on the significance or any metric?

ADD REPLY • link 6.6 years ago by xiongyi25 • 0

So these 2k OTUs are intensity (or, let's say, expression) values from particular experiments, and not the result of any enrichment or differential testing between conditions, right? If so, your entire universe is 2k unranked OTUs. In that case, plot a PCA and see how many clusters you have.
These clusters should give an indication of the groups you might have. You can then map your 40 samples onto the groups that cluster in the PCA.
Run differential testing across those groups, either separately or contrast-based (see limma for that).
This will give you the most differentially abundant OTUs, the ones that define your group differences, with significance values and fold changes. Rank them, take the top 20, and plot.
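
The PCA and differential-testing steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the OTU table is simulated, the group labels are hypothetical, and feeding log-transformed counts to limma is a simplification (count-specific methods may suit OTU data better). The limma part runs only if the package is installed.

```r
# Simulated stand-in for the OTU table: 200 OTUs (rows) x 12 samples (columns).
set.seed(1)
otu <- matrix(rpois(200 * 12, lambda = 20), nrow = 200,
              dimnames = list(paste0("OTU", 1:200), paste0("S", 1:12)))

# Log-transform the counts so a few dominant OTUs do not drive the PCA.
log_otu <- log2(otu + 1)

# prcomp() expects samples in rows, so transpose the OTU table.
pca <- prcomp(t(log_otu), center = TRUE, scale. = FALSE)
plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3)

# Hypothetical group labels, e.g. read off the PCA plot or the study design.
group <- factor(rep(c("A", "B", "C"), each = 4))

# Differential testing with limma (Bioconductor), only if it is available.
if (requireNamespace("limma", quietly = TRUE)) {
  design <- model.matrix(~ 0 + group)
  colnames(design) <- levels(group)
  fit <- limma::lmFit(log_otu, design)
  contrast <- limma::makeContrasts(B - A, levels = design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))
  top20 <- limma::topTable(fit2, number = 20)  # ranked table for B vs A
  top20_otus <- rownames(top20)                # OTU IDs to plot
}
```

With real data, `otu` would be your own table and `group` would come from your experimental design rather than being made up.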

Ignore the above if differential testing has already been done and the 2k OTUs are its result. In that case, simply assign your 40 samples to the 3 groups in a data frame (I'm sure you know which samples belong to which group) and use those assignments as labels when plotting the top 20 OTUs, as I mentioned in my previous comment. Use annotation labels so the heatmap also shows which of the 3 groups each of the 40 samples belongs to.
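
One way to show the group labels in heatmap.2 is its ColSideColors argument, which draws a colored bar above the sample columns. A minimal sketch, assuming a simulated 20 x 40 matrix and a made-up assignment of samples to groups G1, G2 and G3 (both are placeholders, not the poster's data):

```r
# Simulated stand-in for the top 20 OTUs across 40 samples.
set.seed(2)
top20_mat <- matrix(rpois(20 * 40, lambda = 30), nrow = 20,
                    dimnames = list(paste0("OTU", 1:20), paste0("S", 1:40)))

# Hypothetical assignment of the 40 samples to 3 groups.
group <- factor(rep(c("G1", "G2", "G3"), times = c(14, 13, 13)))

# One color per group, expanded to one color per sample (column).
group_colors <- c(G1 = "tomato", G2 = "steelblue", G3 = "darkgreen")
col_side <- unname(group_colors[as.character(group)])

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(log2(top20_mat + 1),
                    ColSideColors = col_side,  # group bar above the columns
                    Colv = FALSE,              # keep samples in group order
                    dendrogram = "row",        # cluster OTUs only
                    trace = "none",
                    margins = c(6, 8))
}
```

Setting Colv = FALSE keeps the samples in their original order so each group stays contiguous; drop it if you would rather let the samples cluster freely.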

ADD REPLY • link 6.6 years ago by ivivek_ngs ★ 5.2k

Thanks very much! Finally figured this out.

ADD REPLY • link 6.6 years ago by xiongyi25 • 0

score 0 · Answer 1 · 2017-09-06

6.6 years ago

Forever • 0

You can use R's read.csv or read.table to import your data, e.g. mydata <- read.table("filename.txt"), then take the first 20 rows with top20_data <- mydata[1:20,], and use top20_data to plot the heatmap.
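
A slightly fuller sketch of those steps, for illustration only. It assumes a tab-separated file whose first row holds sample names and whose first column holds OTU IDs; the file here is simulated and written to a temporary path so the example runs on its own. Note that heatmap.2 (from the gplots package) needs a numeric matrix rather than a data frame.

```r
# Write a small simulated OTU table to a temporary file so the example runs;
# in practice this would be your own file.
set.seed(3)
sim <- matrix(rpois(50 * 6, lambda = 25), nrow = 50,
              dimnames = list(paste0("OTU", 1:50), paste0("S", 1:6)))
otu_file <- tempfile(fileext = ".txt")
write.table(sim, otu_file, sep = "\t", quote = FALSE, col.names = NA)

# header = TRUE and row.names = 1 assume sample names in the first row
# and OTU IDs in the first column.
mydata <- read.table(otu_file, header = TRUE, sep = "\t", row.names = 1)
top20_data <- mydata[1:20, ]

# Convert the data frame to a numeric matrix before plotting.
top20_mat <- as.matrix(top20_data)

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(top20_mat, trace = "none", margins = c(6, 8))
}
```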

ADD COMMENT • link 6.6 years ago by Forever • 0

Thank you! If those OTUs are not ranked, I mean, if the first 20 rows are not the top 20 abundant OTUs, what can I do to rank them and plot the heatmap?

ADD REPLY • link 6.6 years ago by xiongyi25 • 0

You can rank them first then take the first 20. Continuing from the code above:

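# Total abundance of each OTU (row), summed across all samples
mydataAbundance <- rowSums(mydata)
# Sort OTUs from most to least abundant
mydataAbundance <- sort(mydataAbundance, decreasing = TRUE)
top20 <- mydataAbundance[1:20]
# Index by name so the rows come out in rank order
mydataTop20 <- mydata[names(top20), ]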
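
If the samples differ in sequencing depth, it may make more sense to rank on relative abundance than on raw counts. That is an assumption on my part, not something stated in the thread. A minimal sketch with a simulated table, ending in the heatmap.2 call:

```r
# Simulated OTU table: 100 OTUs x 8 samples, higher-numbered OTUs more abundant.
set.seed(4)
mydata <- as.data.frame(
  matrix(rpois(100 * 8, lambda = rep(1:100, 8)), nrow = 100,
         dimnames = list(paste0("OTU", 1:100), paste0("S", 1:8))))

# Convert counts to relative abundances so each sample (column) sums to 1.
rel <- sweep(as.matrix(mydata), 2, colSums(mydata), "/")

# Rank OTUs by mean relative abundance and keep the top 20, in rank order.
mean_rel <- sort(rowMeans(rel), decreasing = TRUE)
top20_rel <- rel[names(mean_rel)[1:20], ]

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(top20_rel, trace = "none", margins = c(6, 8))
}
```

To rank separately for each experimental group, the same steps can be applied to the columns of one group at a time, e.g. rel[, group == "G1"] for a hypothetical group factor.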