Question

High p-values generated from CuffDiff

8.6 years ago

onspotproductions ▴ 150

I am doing differential expression analysis on two samples: a control and an over-expression vector. I am running cuffdiff directly with the GTF file for the human genome, as we are only interested in gene-level expression changes, not transcript-level changes. Some analysis was previously done on the same raw data using the same method, and its output files match our latest runs almost perfectly in terms of log2 fold change. However, in that initial analysis (not done by us) the p-values are very low (e.g. 10^-5), whereas our new analysis gives higher p-values (e.g. 0.01 for the same gene). I am unsure what could be causing this difference, or whether there is a cuffdiff setting I am missing. There were no replicates for the sequencing runs.

RNA-Seq differential expression cufflinks • 2.8k views

updated 8.6 years ago by Satyajeet Khare ★ 1.6k • written 8.6 years ago by onspotproductions ▴ 150

If you had no replicates this time but did have them previously, that could explain the difference (the data are less robust). However, a p-value of 0.01 is totally good imho! I usually select < 0.05, so it would be in range!

8.6 years ago by Matteo Schiavinato ★ 3.7k

score 0 · Answer 1 · 2016-12-02

8.6 years ago

Satyajeet Khare ★ 1.6k

Can you check the exact command line for the previous run? You will find it in the run.info file in the cuffdiff output directory. I suspect the previous command was slightly different, for example in the dispersion method or the number of replicates required.
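If both run.info files can be obtained, the comparison suggested here could be sketched as below. This is a minimal sketch, assuming run.info holds one tab-separated key/value pair per line (for example `cmd_line` and `version`); the file contents shown are hypothetical and the helper names `parse_run_info` and `diff_run_info` are made up for illustration.

```python
def parse_run_info(text):
    """Parse tab-separated key/value lines into a dict (assumed run.info layout)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("\t")
        info[key] = value.strip()
    return info


def diff_run_info(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical contents; real files would be read with open(path).read().
old_text = "cmd_line\tcuffdiff -o old_out genes.gtf ctrl.bam oe.bam\nversion\t2.0.2\n"
new_text = "cmd_line\tcuffdiff -o new_out genes.gtf ctrl.bam oe.bam\nversion\t2.2.1\n"

changes = diff_run_info(parse_run_info(old_text), parse_run_info(new_text))
for key, (before, after) in changes.items():
    print(f"{key}: {before!r} -> {after!r}")
```

Any key that shows up in `changes`, such as a different version or an extra option on the command line, is a candidate explanation for the differing p-values.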

I know there were no replicates, as we are using the exact same data. I will have to try a different dispersion method, but unfortunately we have been unable to obtain the run.info file.

8.6 years ago by onspotproductions ▴ 150

If the previous commands were the same as the current and all the input files are also the same, then it is strange to get different results. Is there a difference in the version of any tools used in your analysis earlier and now? Furthermore, q-value (adjusted p-value) is preferred over p-value. I would like to add that without replicates the statistical methods may not give you confidence about your findings.

8.6 years ago by Persistent LABS ▴ 750
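To illustrate why the q-value matters in the comment above, here is a minimal sketch of one common multiple-testing adjustment, the Benjamini-Hochberg procedure. It is not taken from Cuffdiff's source, the function name `benjamini_hochberg` is made up for illustration, and the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Invented p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
for p_val, q_val in zip(p, q):
    print(f"p = {p_val:.3f}  q = {q_val:.3f}")
```

With thousands of genes tested, the adjustment is much stronger than in this four-gene toy case, so a raw p-value of 0.01 can easily end up with a q-value above 0.05.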

The tool versions used at the time were much older, and even a different dispersion method doesn't make a difference.

8.6 years ago by onspotproductions ▴ 150