Question: Comparing FPKM values from cufflinks and cuffnorm
6
6.1 years ago by
bow790
Netherlands
bow790 wrote:

I am trying to calculate an expression value that are comparable in one sample (so I can compare gene A, gene B, etc in the same sample) and across samples (so I can compare gene A in sample 1 with gene A in sample 2, and so on). My samples are from single-cell RNA sequencing and I've been using the cufflinks suite to estimate abundance. I have a question about the FPKM values generated by cufflinks and cuffnorm.

First I tried estimating abundance using only cufflinks and checking the resulting fpkm_tracking table. This is done for a small subset of my complete samples (2 samples, library sizes of about 30 million reads and 20 million reads respectively). Curious, I then also tried to estimate abundance using cuffnorm, using the same 2 samples. I plotted the resulting FPKM values (on the X axis) against the FPKM values from Cufflinks (on the Y axis) for each sample. I was surprised to find that for both samples, most FPKM values were scaled down.

Here's one example from a gene:

• with cufflinks, sample A: 222.349
• with cuffnorm, sample A: 31.376
• with cufflinks, sample B: 122.469
• with cuffnorm, sample B: 17.333

I see this in several other genes as well, but not all. Here's one that almost does not change at all:

• with cufflinks, sample A: 29.4418
• with cuffnorm, sample A: 26.976
• with cufflinks, sample B: 8.0248
• with cuffnorm, sample B: 8.649

Looking at the mean and median of FPKMs, these are the numbers:

• cufflinks, sample A mean & median: 23.59 & 0.1073
• cuffnorm, sample A mean & median: 17.4 & 0.04278
• cufflinks, sample B mean & median: 24.2 & 0.01044
• cuffnorm, sample B mean & median:  20.94 & 0 (yes, zero)

There it is apparent that there is a global downscaling of the FPKM values.

I tried calculating the Spearman correlation coefficient, and I got 0.9722 for sample A and 0.968 for sample B, which I understand means the samples are quite similar in rank and are positively correlated (P-values are 0 for both samples).

My questions are:

1. What's happening here? Why are all my expression values in cuffnorm downscaled relative to cufflinks?
2. I may be stretching cufflinks / cuffnorm for my analysis, but my goal here is to use a unit of expression abundance comparable across samples and across genes. In lieu of the actual relative molar concentration, what do you recommend using?

I'm on version 2.2.1 (4237) of the Cufflinks suite, running on Ubuntu 12.04 64 bit, by the way.

modified 6.0 years ago by Biostar ♦♦ 20 • written 6.1 years ago by bow790
1
6.1 years ago by
Devon Ryan96k
Freiburg, Germany
Devon Ryan96k wrote:

At least part of the reason is that cufflinks produces "classic" FPKMs, whereas cuffnorm likely uses normalized values in computing FPKMs (the documentation doesn't mention this, but given that that's what cuffdiff defaults to...). In general, the latter are far more useful, since FPKMs in and of themselves aren't very comparable (have a bit more rRNA or a high expressing gene in one sample? Congrats, your FPKMs are now near useless).

Aside from that, you'll likely have to ask the cufflinks authors (don't expect a speedy reply). Cuffnorm hasn't been described in detail anywhere that I've seen, so no one who hasn't gone through the source code would know the answer to this.

Indeed. Though when plotting the results against each other, depending on whether I use bias correction or not on Cufflinks, I could get either a linear scaling down or not. If I use the `-u` and `-b` flag on Cufflinks, Cuffnorm seems to only normalize linearly. This is not the case when I don't use those two flags. They mentioned in the Cuffnorm documentation that they normalize by geometric mean per DESeq's method. I know that in DESeq calculates a size factor, though not sure whether this factor is used to simply scale the counts.

Anyway, I have posted the question to their mailing list without any replies so far :(...

1

The joys of undocumented programs :)

Don't expect a reply for quite a while. It's not unusual for months to go by before one of the authors replies (if they ever do). This is one of the reasons I would recommend using something else if possible.

Thanks for your insights Devon - always find your thoughts useful here.

As a follow up to the FPKM comparison problem when you have different amounts of rRNA for example.. What if when you are running cufflinks to assemble your transcriptomes you use a rRNA, tRNA, and chrM mask file to mask these genes entirely. This should prevent them from being incorporated in to your transcriptome. Do you suppose this would then help to make cufflinks FPKM values comparable between samples?

When you run cuffDiff - this would presumably allow you to compare FPKM values between any samples included in the calculations.