11 weeks ago by
USA / Europe / Brazil
It makes perfect sense that the dendrograms would be different for the following 2 main reasons:
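- FPKM and DESeq2's normalisation methods are different: FPKM scales only by
gene length and total mapped reads, which does not sufficiently adjust for
differences in library size and composition across your samples; DESeq2's
median-of-ratios size factors do
- log (base 2) is not the same transformation as the regularised log (rlog)
transformation of DESeq2, which also shrinks the values of low-count genes
towards each gene's average across samples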
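To make the first point concrete, here is a minimal sketch contrasting total-count scaling (the "per million" part of FPKM) with the median-of-ratios idea behind DESeq2's size factors. It is a simplified illustration, not DESeq2's own code; the function names and the toy counts are made up for this example.

```python
import math
from statistics import median

def total_count_factors(counts):
    """Scale each sample by its total read count, relative to the mean total."""
    n_samples = len(counts[0])
    totals = [sum(gene[j] for gene in counts) for j in range(n_samples)]
    mean_total = sum(totals) / n_samples
    return [t / mean_total for t in totals]

def median_of_ratios_factors(counts):
    """Size factor per sample: median ratio to the per-gene geometric mean."""
    n_samples = len(counts[0])
    # Use only genes with non-zero counts in every sample, so logs are defined.
    usable = [gene for gene in counts if all(c > 0 for c in gene)]
    log_geo_means = [sum(math.log(c) for c in gene) / n_samples for gene in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(gene[j]) - lgm
                      for gene, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy matrix (hypothetical values): rows are genes, columns are samples.
# Sample 2 matches sample 1 except for one very highly expressed gene.
toy_counts = [
    [100, 100],
    [200, 200],
    [300, 300],
    [400, 4000],
]

tc = total_count_factors(toy_counts)
mor = median_of_ratios_factors(toy_counts)
print("total-count factors:     ", tc)
print("median-of-ratios factors:", mor)
```

In this toy case a single highly expressed gene inflates the total-count factor for sample 2, so every other gene would look lower in that sample after scaling, whereas the median-of-ratios factors stay equal because most genes are unchanged.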
In addition, looking at your dendrograms, I do not see good separation between control and treated. For your FPKM data, I see a 'flat' and 'structureless' dendrogram, which is what I'd expect from logged FPKM. On the other hand, I see a well-structured dendrogram for your rlog data, which is what I'd expect, knowing how this transformation behaves.
Something to keep in mind: when clustering, Euclidean distance is best suited to data that are approximately normally distributed with stable variance, such as rlog-transformed counts. If you want to cluster FPKM data, you should probably at least use 1 minus Pearson/Spearman correlation as the distance.
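As a rough illustration of why the choice of distance matters, the sketch below compares Euclidean distance with 1 minus Pearson correlation on three small expression profiles. The profiles and function names are invented for the example and are not taken from your data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_distance(x, y):
    """1 minus Pearson correlation: 0 for identical shape, 2 for opposite shape."""
    return 1.0 - pearson(x, y)

def euclidean_distance(x, y):
    """Straight-line distance, sensitive to overall shifts in magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical log-scale expression profiles across four genes.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [3.0, 4.0, 5.0, 6.0]  # same pattern as sample_a, shifted upwards
sample_c = [4.0, 3.0, 2.0, 1.0]  # reversed pattern

d_ab_corr = correlation_distance(sample_a, sample_b)
d_ac_corr = correlation_distance(sample_a, sample_c)
d_ab_euc = euclidean_distance(sample_a, sample_b)
d_ac_euc = euclidean_distance(sample_a, sample_c)

print("correlation distance a-b, a-c:", d_ab_corr, d_ac_corr)
print("Euclidean distance   a-b, a-c:", d_ab_euc, d_ac_euc)
```

Here the correlation distance treats the shifted profile as essentially identical and the reversed one as maximally different, while Euclidean distance rates the two as almost equally far from sample_a. In R, a matrix of such distances can be passed to `hclust` via `as.dist(1 - cor(x))`, where `x` has samples in columns.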