Question: Draw the phylogeny from RNA-Seq data
4 months ago by
wangdp12360 wrote:

Hi there,

I am trying to use RNA-Seq data from a large number of samples to draw the phylogenetic tree based on FPKM values in order to show the relationship among all samples. (The samples include various species, tissues and treatments.)

Are there any elegant tools implemented for this purpose?

Many thanks,



modified 4 months ago by Biostar ♦♦ 20 • written 4 months ago by wangdp12360

If you want to compare samples in the context of RNA-Seq, have you thought about heatmaps or PCA? In the former case, you can also build a distance matrix-based phylogenetic tree. See here. Note that the normalized units used there are not FPKMs, so you will need to normalize your raw counts as described in the tutorial.

Also keep in mind that it's not the same kind of tree that you can make, for example, from sequences using ML or maximum parsimony methods :)

modified 4 months ago • written 4 months ago by grant.hovhannisyan300
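To make the "distance matrix-based tree" idea concrete, here is a minimal sketch in plain Python. It is not taken from the linked tutorial: the function names, the log2(x + 1) transform, the 1 − Pearson correlation distance and the average-linkage (UPGMA) merging are all assumptions chosen for illustration, and the input is assumed to be already-normalized expression values, one list per sample, with genes in the same order. In practice you would more likely use an established clustering routine from R or SciPy.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Log-transform expression values; the pseudocount avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise this divides by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def upgma(profiles):
    """Cluster samples by average linkage on a 1 - Pearson distance.

    profiles: dict mapping sample name -> list of normalized expression values.
    Returns the tree as nested 2-tuples of sample names.
    """
    names = list(profiles)
    logged = [log2_transform(profiles[name]) for name in names]

    # Each cluster id maps to (subtree, number of samples in it).
    clusters = {i: (names[i], 1) for i in range(len(names))}

    # Pairwise distances, keyed by (smaller id, larger id).
    dist = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            dist[(i, j)] = 1.0 - pearson(logged[i], logged[j])

    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(dist, key=dist.get)
        del dist[(i, j)]
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)

        # Distance from the merged cluster to every remaining cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)

        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1

    (tree, _size), = clusters.values()
    return tree


# Toy values, made up purely for illustration.
toy_profiles = {
    "liver_A": [100, 5, 50, 0],
    "liver_B": [110, 4, 55, 1],
    "brain_A": [3, 200, 10, 80],
    "brain_B": [2, 180, 12, 90],
}
print(upgma(toy_profiles))
```

The result groups samples by similarity of expression profile only, so it is a clustering dendrogram for sanity-checking sample relationships, not an evolutionary tree.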

Hi, thanks for this. I have looked through the log2, rlog and VST normalization methods mentioned in this workflow. But my data are from different species, so in this case I believe some calculation is needed to normalize for gene length. Any thoughts on this?

written 4 months ago by wangdp12360
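One common way to account for gene length is to convert raw counts to TPM (transcripts per million), using the gene lengths of the species each sample comes from. The sketch below is only an illustration of that arithmetic; the function name `tpm` and its arguments are hypothetical, and it assumes genes have already been matched across species (for example as one-to-one orthologs), which this thread does not cover.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts:     per-gene read counts for the sample.
    lengths_bp: per-gene lengths in base pairs, taken from the annotation of
                the species this sample belongs to (same gene order as counts).
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")

    # Step 1: reads per kilobase, which removes the gene-length effect.
    rates = [count / (length / 1000.0) for count, length in zip(counts, lengths_bp)]

    # Step 2: rescale so the values of one sample sum to one million,
    # which removes the sequencing-depth effect.
    total = sum(rates)
    if total == 0:
        raise ValueError("sample has no reads assigned to any gene")
    return [rate / total * 1e6 for rate in rates]


# Toy example: the second gene has twice the reads but is twice as long,
# so both genes end up with the same TPM.
print(tpm([10, 20], [1000, 2000]))
```

Because each sample is divided by its own species' gene lengths, the same ortholog can be compared across species even when its length differs between them. Whether TPM is adequate for the downstream analysis is a separate question.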

I think you are right about the normalization by gene length. But I am still wondering what your ultimate goal is. If your goal is to obtain a robust phylogeny, then it would be much better (and I'd say the only proper way) to reconstruct the phylogeny from assembled transcriptomes, i.e. based on sequence, as is done for genomes, and not from gene expression levels, because gene expression is a relative measure and depends on multiple biological and technical factors. But if you just want to do a basic quality control of your samples to check whether you see the expected grouping (e.g. samples from the same tissue and treatment cluster together), then a simple PCA or heatmap (with a simple distance-matrix tree) should be enough.

written 4 months ago by grant.hovhannisyan300