Different results for differentially expressed genes in RNA-seq analysis with cuffdiff and R::genefilter
4.8 years ago

Hi everyone,

I have run into a problem with an RNA-seq analysis.

My data: 4 experimental conditions with 3 biological replicates each, 12 samples in total.

When I used cuffdiff, I got a p-value and a q-value for every gene (the underlying model seems to be a beta negative binomial distribution).

Then I used the R function genefilter::rowFtests to calculate a Welch t-test p-value for every gene based on its FPKM values.

In the end, every gene had two different p-values, one from cuffdiff and one from genefilter.

The numbers of differentially expressed genes reported by cuffdiff and genefilter are quite different.

My questions are:

  1. Which result should I trust?

  2. Is the Welch t-test suitable for identifying differentially expressed genes in RNA-seq data?

  3. When I use FPKM values from Cufflinks in other analyses, for example with the R package genefilter, should I normalize the FPKM matrix first?

Thank you very much

Best regards,

Junfeng Shi

RNA-Seq cuffdiff genefilter • 1.7k views
4.8 years ago
Amitm ★ 2.1k

Hi, are you sure you are working with FPKM values? If so, such data are not suitable for a t-test (or similar tests), because FPKM values cannot be approximated well by a normal distribution, which the variants of the t-test assume.

That is the reason cuffdiff uses a beta negative binomial distribution. Count data and FPKM data are usually approximated by a Poisson or a negative binomial distribution.
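To make the distribution point concrete, here is a minimal sketch of how the variance grows with the mean under the two count models. It uses the common mean/dispersion form of the negative binomial; the dispersion value of 0.25 is made up purely for illustration and is not estimated from any data.

```python
# Sketch: how variance scales with the mean under two count models.
# DISPERSION is a hypothetical value chosen only for illustration.

def poisson_variance(mean):
    """Poisson model: the variance equals the mean."""
    return float(mean)


def negative_binomial_variance(mean, dispersion):
    """Negative binomial (mean/dispersion form): var = mean + dispersion * mean^2."""
    return mean + dispersion * mean ** 2


DISPERSION = 0.25

for mean in (10, 100, 1000):
    pois = poisson_variance(mean)
    nb = negative_binomial_variance(mean, DISPERSION)
    print(f"mean={mean:>5}  Poisson var={pois:>9.1f}  NB var={nb:>10.1f}")
```

In both models the variance depends on the expression level, so highly and lowly expressed genes do not share one noise level, and with only 3 replicates per condition a per-gene variance estimate is very unstable.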

If you have replicates and gene-level differential expression is your goal, I would suggest getting count data from your RNA-seq BAM files and using one of the well-established packages such as DESeq, edgeR, or limma (voom method), all available from the Bioconductor repository.

There is no particular advantage in choosing FPKM values if only gene-level differential expression is desired. On the other hand, if you want transcript-level quantification, tools like Cufflinks -> Cuffdiff or StringTie -> Ballgown can give you differential expression using FPKM values.

So please check which methods are appropriate for the data type you have.
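For reference, this is a rough sketch of how an FPKM value relates to a raw count, using the standard definition (fragments per kilobase of transcript per million mapped fragments). The function name and the numbers are hypothetical, and Cufflinks estimates FPKM with its own model rather than this simple formula.

```python
# Sketch of the standard FPKM definition; all inputs below are made-up examples.

def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Same hypothetical 2 kb gene in two libraries of different sequencing depth.
shallow = fpkm(fragment_count=500, gene_length_bp=2000, total_mapped_fragments=20_000_000)
deep = fpkm(fragment_count=1000, gene_length_bp=2000, total_mapped_fragments=40_000_000)

print(shallow, deep)
```

Both libraries give the same FPKM even though one is backed by twice as many fragments. FPKM is already scaled by gene length and library size, and that scaling discards the count magnitude that count-based packages rely on.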


Thank you very much. I have learned a lot from your answer. I need to reconsider the methods I am currently using in my analysis. Thanks a lot.

