Question: Question about FPKM sum
4 weeks ago
lyrae wrote:

Recently someone told me that, in most cases, the total gene expression of each sample in an RNA-seq experiment should be similar.

I summed the FPKM values of all genes for each sample in my RNA-seq results, but the FPKM sums are quite different.

            sample1      sample2      sample3      sample4      sample5      sample6
fpkm_sum    479064.4143  499710.9913  559446.653   389748.5528  376967.4209  396236.5871

Then I downloaded some FPKM results from TCGA (the samples all belong to one project). The results look like this:

            sample1      sample2      sample3      sample4      sample5      sample6
fpkm_sum    416526.1892  288256.1748  330630.1418  502633.3157  381001.8546  284932.5021

I'm confused now: should the FPKM sums of the samples in an RNA-seq experiment be similar or not? Please tell me.

Thanks a lot.

rna-seq fpkm
4 weeks ago
h.mon wrote:

This subject is more complicated than it seems, so I will only touch on it here and point to further resources at the end. First, there is no reason to expect the total expression of genes to be similar across samples. There are many possible sources of difference: different treatments, amounts of material isolated, tissues, and so on. However, one step toward meaningful statistical analysis is to normalize expression between samples, and that would make the total expression similar across samples, or even identical.

When RPKM / FPKM was first introduced, it seemed to normalize samples to more or less the same total expression, but this is not really the case: the sum of RPKM / FPKM values differs from sample to sample, sometimes greatly, so the values are not directly comparable between samples. This is the main reason TPM (transcripts per million), which by construction sums to one million in every sample, is preferred over RPKM / FPKM.
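To make the point concrete, here is a minimal Python sketch using the standard FPKM definition (count × 10^9 / (gene length × total mapped fragments)) and the usual rescaling of FPKM to TPM. The gene lengths and read counts are made-up toy numbers, not taken from the question or from TCGA, and the function names are just for illustration.

```python
def fpkm(counts, lengths_bp):
    """FPKM per gene: count * 1e9 / (gene length in bp * total mapped fragments)."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm_from_fpkm(fpkm_values):
    """Rescale FPKM so that the values of one sample sum to one million."""
    total = sum(fpkm_values)
    return [f * 1e6 / total for f in fpkm_values]


# Toy data (hypothetical): four genes, same sequencing depth in both samples.
lengths = [500, 1000, 2000, 4000]   # gene lengths in bp
sample_a = [100, 100, 100, 700]     # most fragments come from the longest gene
sample_b = [700, 100, 100, 100]     # most fragments come from the shortest gene

fpkm_a = fpkm(sample_a, lengths)
fpkm_b = fpkm(sample_b, lengths)
tpm_a = tpm_from_fpkm(fpkm_a)
tpm_b = tpm_from_fpkm(fpkm_b)

print("FPKM sums:", sum(fpkm_a), sum(fpkm_b))
print("TPM sums: ", sum(tpm_a), sum(tpm_b))
```

Both toy samples have the same number of mapped fragments, yet their FPKM sums differ severalfold because the fragments fall on genes of different lengths. The TPM sums are one million in both, up to floating-point rounding.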

Further reading:

What the FPKM? A review of RNA-Seq expression units

Estimating number of transcripts from RNA-Seq measurements

Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples

