Question: How to cluster differentially expressed genes from RNA-seq analysis?
Asked 2.7 years ago by Kurban (170), Xinjiang Academy of Animal Sciences, Urumqi, China

Hello guys, I have transcriptome data from samples treated at low temperature for different lengths of time, and I got a different number of DEGs for each time point of the stress challenge. Now I want to cluster all of these differentially expressed genes. Some papers do this analysis with a heatmap based on the genes' fold change, and others do it on RPKM values. How can I do the clustering? Are there any good tools and papers? Should I cluster on log2(fold change) or on RPKM values?

rna-seq • 2.6k views

You can do both, or even more. I usually get the best hierarchical clustering results using the z-scores of log2 RPKM (or log2 CPM) values.

I use the heatmap.2 function from the R package gplots. You can try different clustering methods (ward.D, for example, is pretty good) or different distance measures if necessary.
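As a rough illustration of the approach described above, here is a minimal sketch in base R. The toy FPKM matrix, the pseudocount of 1, and the choice of Euclidean distance are assumptions made for the example, not something stated in the thread.

```r
# Toy FPKM matrix (hypothetical data): 6 genes x 4 time points
set.seed(1)
fpkm <- matrix(rexp(24, rate = 0.1), nrow = 6,
               dimnames = list(paste0("gene", 1:6),
                               c("0h", "2h", "6h", "24h")))

# log2-transform; the pseudocount of 1 is an assumption to avoid log2(0)
log_fpkm <- log2(fpkm + 1)

# Row-wise z-scores: scale() standardises columns, so transpose twice.
# Genes with zero variance would give NaN here and should be dropped first.
z <- t(scale(t(log_fpkm)))

# Hierarchical clustering of genes: Euclidean distance, Ward linkage
hc <- hclust(dist(z, method = "euclidean"), method = "ward.D")

# If gplots is installed, the same tree can drive the heatmap, e.g.:
# gplots::heatmap.2(z, Rowv = as.dendrogram(hc), Colv = FALSE,
#                   dendrogram = "row", scale = "none", trace = "none")
```

Because the matrix is already z-scored, `scale = "none"` keeps heatmap.2 from scaling it a second time. `Colv = FALSE` keeps the time points in their original order, which is a choice for time-course data rather than a requirement.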

Modified 2.7 years ago • written 2.7 years ago by Benn (7.9k)

Hi @ b.nota, I checked heatmap.2 in the gplots package and got a heatmap of my data based on FPKM values. But I have more than a thousand input genes, and I want to extract the clustering result from the heatmap. How can I do that?

Written 2.7 years ago by Kurban (170)

Do you mean you want the clusters that are formed after clustering?

Check the previous post about this; you'll need the cutree command for that:

How To Get The Subclusters From The Object Of Hclust() Using Cutree() According To The Order On The Map Produced By Heatmap.2?
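A minimal sketch of what extracting clusters with cutree might look like. The toy matrix and the number of clusters `k = 3` are hypothetical; in practice `z` would be your own z-scored expression matrix and `k` would be chosen by inspecting the dendrogram.

```r
# Toy z-score-like matrix (hypothetical data): 10 genes x 4 time points
set.seed(1)
z <- matrix(rnorm(40), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("t", 1:4)))

# Same clustering that would be used for the heatmap rows
hc <- hclust(dist(z), method = "ward.D")

# Cut the tree into k groups (k = 3 is an arbitrary choice for this example)
k <- 3
clusters <- cutree(hc, k = k)   # named integer vector: gene -> cluster id

# Gene names grouped by cluster
genes_by_cluster <- split(names(clusters), clusters)

# Cluster membership in dendrogram order. heatmap.2 usually draws this order
# from bottom to top, so rev() may be needed to match the plot; check this
# against your own figure.
ordered_clusters <- clusters[hc$order]

# To save the result (file name is hypothetical):
# write.csv(data.frame(gene = names(ordered_clusters),
#                      cluster = ordered_clusters),
#           "clusters.csv", row.names = FALSE)
```

Using the same `hc` object for both the heatmap (via `Rowv = as.dendrogram(hc)`) and `cutree` helps ensure the extracted clusters correspond to the tree shown on the map.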

Written 2.7 years ago by Benn (7.9k)

There are some examples of clustering (in R) in the DESeq2 tutorial (https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf), page 26f.

Written 2.7 years ago by e.rempel (780)

Thank you @ e.rempel. DESeq2 does cluster DEGs based on the count data, but it only accepts integer values.

Written 2.7 years ago by Kurban (170)

That's because count data are integers. Why isn't your data in integers? If you have used Salmon/Sailfish, you should have a look at the tximport package for getting your data into DESeq2.

Written 2.7 years ago by WouterDeCoster (42k)

Hi @ WouterDeCoster, I know that count data are integers, but I want to use RPKM/FPKM values for the heatmap.

Modified 2.7 years ago • written 2.7 years ago by Kurban (170)

Check out the Mfuzz R package.

Written 2.7 years ago by theobroma221.1k