Question

How Are The Test_Stat, P-Value, Q-Value Calculated In The Cuffdiff Output

1


11.9 years ago

nepalis ▴ 10

I am looking at the output of cuffdiff. It has these columns: test_id, gene_id, gene, locus, sample_1, sample_2, status, value_1, value_2, log2(fold_change), test_stat, p_value, q_value, significant. Can anybody explain how test_stat, p_value and q_value are calculated? And what does the p-value mean when I have only one replicate?

Also, in some cases value_1 or value_2 is 0, which makes the fold change infinite. How should I handle these cases? I tried substituting the 0s with a small value, 0.001, but some real values are as small as 0.0000160004. If I replace the 0s with 0.0000000001 instead, the fold change becomes far higher than in all the other cases. I don't think I can ignore these genes, because a gene that is not expressed at all in one experiment but is expressed in the other must be important.

Thanks in advance for the help!

cuffdiff rnaseq • 8.3k views

updated 11.9 years ago by David Westergaard ★ 1.5k • written 11.9 years ago by nepalis ▴ 10

score 3 · Answer 1 · 2013-08-27

3


11.9 years ago

David Westergaard ★ 1.5k

I don't know the cuffdiff software in depth, but I'll try to answer from my knowledge of statistics.

If you only have one replicate, you (technically) can calculate a p-value, but it has no real meaning: with a single sample per condition the variance between replicates cannot be estimated, so the test has no statistical power. Reporting such a p-value would be misleading. As for the calculations themselves, look into the cuffdiff FAQ.
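On the q_value column specifically: as far as the Cufflinks manual describes it, q_value is the FDR-adjusted p-value, and the significant column is decided after Benjamini-Hochberg correction for multiple testing. Below is a minimal sketch of that adjustment on a plain list of p-values. The function name `benjamini_hochberg` is made up for illustration, and this is not cuffdiff's own code, so details such as which tests are included in the correction may differ.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Illustrative sketch only; not taken from the cuffdiff source.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone: q at rank k is the minimum of p * m / rank over
    # all ranks >= k.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


if __name__ == "__main__":
    # Made-up p-values, only to show the call.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A gene would then be called significant when its q-value is at or below the chosen FDR threshold. Note that the adjustment cannot rescue p-values that were meaningless to begin with, as in the one-replicate case.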

As for zero counts, the thread "Cuffdiff DE Significance Of Zero FPKM Values" is probably relevant.
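One common way to keep fold changes finite is to add the same pseudocount to both values before taking the ratio, instead of replacing only the zeros. Here is a minimal sketch of the idea. The function name and the default pseudocount of 1.0 are assumptions chosen for illustration, not anything cuffdiff does, and the ratio is taken as value_2 over value_1, which is how the cuffdiff log2(fold_change) column is usually read.

```python
import math


def log2_fold_change(value_1, value_2, pseudocount=1.0):
    """log2 of (value_2 + pseudocount) / (value_1 + pseudocount).

    Hypothetical helper for illustration; the pseudocount is an
    arbitrary choice and should be reported alongside the results.
    """
    return math.log2((value_2 + pseudocount) / (value_1 + pseudocount))


if __name__ == "__main__":
    # Gene absent in sample 1, expressed at FPKM 3 in sample 2.
    print(log2_fold_change(0.0, 3.0))                      # moderate, finite
    # The same gene with a tiny pseudocount: the result is dominated
    # by the pseudocount rather than by the data.
    print(log2_fold_change(0.0, 3.0, pseudocount=1e-10))
```

The second call shows the problem described in the question: the smaller the substituted value, the larger the fold change, so the number reflects the substitution more than the biology. A larger pseudocount shrinks fold changes for lowly expressed genes toward zero, which is usually the safer bias. Genes that are zero in one condition can also simply be listed separately as on/off cases.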