Question: How to cluster differentially expressed genes from RNA-seq analysis?
3.5 years ago, Kurban190 (Xinjiang Academy of Animal Sciences, Urumqi, China) wrote:

Hello guys, I have transcriptome data from samples treated with low temperature for different lengths of time, and I got a different number of DEGs for each time point of the stress challenge. Now I want to cluster all of these differentially expressed genes. Some papers do this analysis with a heatmap based on the genes' fold change, and others use RPKM values. How can I do the clustering? Are there any good tools and papers? Should I cluster on log2(fold change) or on RPKM values?

rna-seq • 3.2k views
written 3.5 years ago by Kurban190

You can do both, or even more. I usually get the best hierarchical clustering results using the z-scores of log2 RPKM (or log2 CPM) values.

I use the heatmap.2 function from the R package gplots. You can try different clustering methods (ward.D, for example, is pretty good) or different distance measures if necessary.
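A minimal sketch of that approach, in case it helps. The toy matrix, the pseudocount of 1, and the Euclidean distance are assumptions for illustration only; swap in your own RPKM/CPM matrix with genes in rows and samples in columns.

```r
# Toy expression matrix standing in for real data: 20 genes x 6 samples of RPKM values.
set.seed(1)
rpkm <- matrix(rexp(120, rate = 0.1), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))

# log2-transform; the pseudocount of 1 (an assumption) avoids log2(0).
log_rpkm <- log2(rpkm + 1)

# Row-wise z-scores: scale() works on columns, so transpose before and after.
z <- t(scale(t(log_rpkm)))

# Hierarchical clustering of genes: Euclidean distance, Ward linkage.
gene_hc <- hclust(dist(z, method = "euclidean"), method = "ward.D")

# Draw the heatmap only if gplots is installed.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(z,
                    Rowv  = as.dendrogram(gene_hc),  # reuse the clustering above
                    scale = "none",                  # rows are already z-scored
                    trace = "none",
                    col   = gplots::bluered(75))
}
```

Genes with zero variance across samples give NaN z-scores, so it is probably worth filtering those out before clustering.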

modified 3.5 years ago • written 3.5 years ago by Benn8.0k

Hi @b.nota, I checked heatmap.2 in the gplots package and got a heatmap of my data based on FPKM values. But I have more than a thousand input genes, and I want to extract the clustering result from the heatmap. How can I do that?

written 3.5 years ago by Kurban190

Do you mean you want the clusters that are formed after clustering?

Check the previous post about this; you'll need the cutree command for that:

How To Get The Subclusters From The Object Of Hclust() Using Cutree() According To The Order On The Map Produced By Heatmap.2?
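As a rough illustration of the cutree idea (not the code from the linked post): cluster the genes yourself with hclust, pass the same tree to heatmap.2 via Rowv, and cut it into groups. The toy data and k = 4 are arbitrary assumptions.

```r
# Toy matrix standing in for real expression values: 30 genes x 6 samples.
set.seed(1)
expr <- matrix(rnorm(180), nrow = 30,
               dimnames = list(paste0("gene", 1:30), paste0("sample", 1:6)))
z <- t(scale(t(expr)))  # row-wise z-scores

# The same tree can be handed to heatmap.2 with Rowv = as.dendrogram(gene_hc).
gene_hc <- hclust(dist(z), method = "ward.D")

# Cut the tree into k groups; k = 4 is an arbitrary choice here.
k <- 4
clusters <- cutree(gene_hc, k = k)  # named integer vector: gene -> cluster id

# Genes listed in the order of the dendrogram leaves, with their cluster id.
ordered <- data.frame(gene    = gene_hc$labels[gene_hc$order],
                      cluster = clusters[gene_hc$order],
                      row.names = NULL)

# Gene names grouped by cluster, e.g. for enrichment analysis.
genes_by_cluster <- split(names(clusters), clusters)

print(table(clusters))
```

If you want the result in a file, write.csv(ordered, "gene_clusters.csv") would do it (the file name is just a placeholder).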

written 3.5 years ago by Benn8.0k

There are some examples of clustering (in R) in the DESeq2 tutorial (https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf), page 26f.

written 3.5 years ago by e.rempel890

Thank you @e.rempel. DESeq2 does cluster DEGs based on the count data, but it only accepts integer values.

written 3.5 years ago by Kurban190

That's because count data are integers. Why isn't your data in integers? If you have used salmon/sailfish, you should have a look at the tximport package for getting your data into DESeq2.

written 3.5 years ago by WouterDeCoster44k

Hi @WouterDeCoster, I know that count data are integers, but I want to use RPKM/FPKM values for the heatmap.
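For what it's worth, the two views are compatible: if integer counts and gene lengths are available, log2 CPM or log2 RPKM for a heatmap can be derived from them. A base-R sketch follows; the counts, the gene lengths, and the use of column sums as library sizes are all made-up assumptions for illustration.

```r
# Toy integer counts: 5 genes x 3 samples (filled column by column).
counts <- matrix(c(10, 200, 0, 50, 1000,
                   20, 180, 5, 40,  900,
                   15, 220, 2, 60, 1100),
                 nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("sample", 1:3)))

# Hypothetical gene lengths in base pairs.
gene_length_bp <- c(gene1 = 1000, gene2 = 2000, gene3 = 500,
                    gene4 = 1500, gene5 = 3000)

# Library size per sample; here simply the column sums of the toy counts.
lib_size <- colSums(counts)

# CPM: counts per million reads in each sample.
cpm <- sweep(counts, 2, lib_size / 1e6, "/")

# RPKM: CPM divided by gene length in kilobases.
rpkm <- sweep(cpm, 1, gene_length_bp / 1e3, "/")

# log2 values for clustering; the pseudocount of 1 is an assumption.
log_rpkm <- log2(rpkm + 1)
```

The differential expression test itself would still run on the raw integer counts; the normalized values are only for visualization and clustering.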

modified 3.5 years ago • written 3.5 years ago by Kurban190

Check out the Mfuzz R package.

written 3.5 years ago by theobroma221.1k