Question: Draw the phylogeny from RNA-Seq data
3.0 years ago, wangdp123 (240) wrote:

Hi there,

I am trying to use RNA-Seq data from a large number of samples to draw a phylogenetic tree based on FPKM values, in order to show the relationships among all samples. (The samples include various species, tissues and treatments.)

Are there any elegant tools implemented for this purpose?

Many thanks,



modified 2.9 years ago by Biostar ♦♦ (20) • written 3.0 years ago by wangdp123 (240)

If you want to compare samples in the context of RNA-Seq, have you thought about heatmaps or PCA? In the former case, you can also build a distance matrix-based tree. See here. Note that the normalized units used there are not FPKMs, so you will need to normalize your raw counts as described in the tutorial.

Also keep in mind that it's not the same kind of tree that you can make, for example, from sequences using ML or maximum parsimony methods :)

modified 3.0 years ago • written 3.0 years ago by grant.hovhannisyan (2.0k)
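For readers who want to see what a "distance matrix-based tree" amounts to, here is a minimal sketch using only the Python standard library. It is not the workflow from the tutorial mentioned above: the sample names and expression values are made up, the distance (1 minus Pearson correlation on log2(x + 1) values) and the average-linkage (UPGMA) clustering are just one common choice, and the helper names (`distance_matrix`, `upgma`) are hypothetical. The result is a clustering dendrogram of expression profiles, not a sequence-based phylogeny.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def distance_matrix(samples):
    """samples: dict of sample name -> normalized expression values
    (same genes, same order). Returns {frozenset({a, b}): distance}."""
    logged = {name: [math.log2(v + 1.0) for v in vals]
              for name, vals in samples.items()}
    return {frozenset((a, b)): 1.0 - pearson(logged[a], logged[b])
            for a, b in combinations(logged, 2)}


def upgma(dist, names):
    """Average-linkage clustering. Returns a Newick string describing the
    topology only (branch lengths are omitted to keep the sketch short)."""
    sizes = {name: 1 for name in names}   # cluster label -> number of samples
    dist = dict(dist)                      # work on a copy
    while len(sizes) > 1:
        # Merge the two closest clusters.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        del dist[frozenset((a, b))]
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            dac = dist.pop(frozenset((a, c)))
            dbc = dist.pop(frozenset((b, c)))
            dist[frozenset((merged, c))] = (na * dac + nb * dbc) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes)) + ";"


# Made-up normalized expression values for five genes in four samples.
samples = {
    "liver_A": [100, 5, 50, 0, 20],
    "liver_B": [110, 4, 55, 1, 18],
    "brain_A": [3, 200, 10, 80, 25],
    "brain_B": [2, 180, 12, 90, 22],
}
tree = upgma(distance_matrix(samples), list(samples))
print(tree)
```

The Newick string can be pasted into any tree viewer. With real data you would normally restrict the input to the most variable genes and use a properly normalized matrix rather than toy numbers like these.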

Hi, thanks for this. I have looked through the log2, rlog and VST normalization methods mentioned in this workflow. But my data are from different species, so in this case I believe some calculation is needed to normalize for gene length. Any thoughts on this?

written 3.0 years ago by wangdp123 (240)
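As an illustration of the gene-length normalization raised in this reply, the sketch below computes TPM (transcripts per million), one widely used length-normalized unit. It is only a sketch under assumptions not stated in the thread: the counts and lengths are invented, the function name `tpm` is hypothetical, and it assumes you already have a set of one-to-one orthologs with a length for each gene in each species. Length normalization alone does not remove other between-species effects.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM.

    counts     -- raw read counts per gene
    lengths_bp -- gene (or transcript) lengths in base pairs, same order
    """
    # Step 1: reads per kilobase, which removes the gene-length effect.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    # Step 2: rescale so the values in each sample sum to one million.
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Invented example: the same three orthologs in two species. The orthologs
# have different lengths in each species, so each sample is normalized
# with its own species-specific lengths.
species_a = tpm(counts=[100, 200, 300], lengths_bp=[1000, 2000, 3000])
species_b = tpm(counts=[150, 150, 600], lengths_bp=[1500, 1500, 6000])

print(species_a)
print(species_b)
```

In both invented samples every gene has the same number of reads per kilobase, so after normalization the three orthologs get equal TPM values even though their raw counts differ.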

I think you are right about normalizing by gene length. But I am still wondering what your ultimate goal is. If your goal is a robust phylogeny, then it would be much better (and I'd say the only proper way) to reconstruct it from assembled transcriptomes, i.e. from sequence, as is done for genomes, and not from gene expression levels, because gene expression is a relative measure that depends on multiple biological and technical factors. But if you just want to do a basic quality control of your samples, to check whether they group as expected (e.g. samples from the same tissue and treatment cluster together), then a simple PCA or heatmap (with a simple distance-matrix tree) should be enough.

written 2.9 years ago by grant.hovhannisyan (2.0k)

