Cuffdiff: 0 expression and NOTEST
8.8 years ago
mbio.kyle ▴ 380

I am using cuffdiff to perform differential expression analysis on RNA-seq data. We have two conditions, each with three replicates. Our pipeline completes successfully and the data seem sane, but there is one anomaly we are still wondering about.

Here is an example:

gene locus status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
SNORD111B chr16:70557690-70611571 NOTEST 0 4114.4 inf 0 1 1 no

There is 0 expression in condition 1 and very high expression in condition 2, yet this gene was given the status NOTEST and marked as non-significant. Why is this the case?

Thanks,
Kyle

RNA-Seq cuffdiff • 5.4k views
8.8 years ago

The statistical test requires some expression in both samples. An expression of zero could be an anomaly or a biological situation. If you trust the sequencing and the bio-sample, then your answer is clearly that the gene is very much differentially expressed.

You don't get a meaningful p-value because no test was run; that's what NOTEST means. The program can't estimate variance or allocate reads in a condition where there aren't any.

If you are specifically studying SNORDs, then you'll want to investigate why this one had zero reads. If you aren't, then you should ignore and filter out this result. Perhaps your library protocol cannot capture small nucleolar RNAs, and the condition with high expression is actually the error.

I know I've seen microRNAs turn up anomalously in samples that had been size-selected. In those cases the microRNA could not have been sequenced, so either the aligner made a mistake or the gene is overlapped by a larger gene.
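To make the "NOTEST is not the same as non-significant" point concrete, here is a minimal Python sketch that splits a results table into tested and untested rows. It assumes a tab-separated file with the same column names as the header quoted in the question; the function name `split_by_status` and the inline example data are only for illustration and are not part of cuffdiff.

```python
import csv
import io

# Example input in the column layout quoted in the question.
# Assumption: the real output file is tab-separated with this header.
EXAMPLE = (
    "gene\tlocus\tstatus\tvalue_1\tvalue_2\tlog2(fold_change)"
    "\ttest_stat\tp_value\tq_value\tsignificant\n"
    "SNORD111B\tchr16:70557690-70611571\tNOTEST\t0\t4114.4\tinf\t0\t1\t1\tno\n"
)


def split_by_status(handle):
    """Return (tested, untested) lists of rows, keyed on the status column."""
    tested, untested = [], []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Only rows with status OK were actually tested; everything else
        # should be reported as "not tested", not as "not significant".
        (tested if row["status"] == "OK" else untested).append(row)
    return tested, untested


tested, untested = split_by_status(io.StringIO(EXAMPLE))
for row in untested:
    print(row["gene"], row["status"],
          "value_1 =", row["value_1"], "value_2 =", row["value_2"])
```

Keeping the untested rows in a separate list, rather than dropping them, makes it easy to go back and inspect zero-expression genes like this one by hand.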

8.8 years ago
mbio.kyle ▴ 380

Hello Karl,

Thanks for getting back to me. Firstly, the example I chose was just a random one; we are not focusing on snoRNAs but on global changes. Your answer was very helpful and cleared things up, until I noticed this result in another of our differential expression experiments:

AJAP1    chr1:4715104-4843851    NEG    POS    OK    0    6.91445    inf    -nan    5.00E-005    0.0081375    yes

In this case there was 0 expression detected in the negative condition, yet a p-value was generated and the gene was marked as significant. The cuffdiff Google group doesn't seem to be very active, unfortunately. I cannot think of anything that would cause this example to be marked significant while the other is not.


You said this is another experiment. Different subject and sample count? There is definitely a different number of columns in the result. Before, the test status was "NOTEST" and now it's "OK". Don't take NOTEST as a non-significant finding; it just wasn't tested. Here again the test statistic is "-nan", which is another failure to process. It looks like a different version of the program made some different assumptions and judgements.
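For a side-by-side look at the two rows, here is a rough Python sketch that parses each one according to its field count and reports what the two have in common. The column names for the 12-field row are my assumption (the two extra fields look like sample labels); `parse_row` and `describe` are made-up helper names.

```python
import math

# Column layouts keyed by field count. The 10-column layout is the header
# quoted in the question; the 12-column one is an assumption that the two
# extra fields (NEG, POS) are sample labels.
LAYOUTS = {
    10: ["gene", "locus", "status", "value_1", "value_2",
         "log2_fold_change", "test_stat", "p_value", "q_value", "significant"],
    12: ["gene", "locus", "sample_1", "sample_2", "status", "value_1", "value_2",
         "log2_fold_change", "test_stat", "p_value", "q_value", "significant"],
}

ROW_A = "SNORD111B chr16:70557690-70611571 NOTEST 0 4114.4 inf 0 1 1 no"
ROW_B = "AJAP1 chr1:4715104-4843851 NEG POS OK 0 6.91445 inf -nan 5.00E-005 0.0081375 yes"


def parse_row(line):
    """Split a whitespace-separated row and label fields by field count."""
    fields = line.split()
    return dict(zip(LAYOUTS[len(fields)], fields))


def describe(row):
    """Summarise the fields that matter for the zero-expression question."""
    return {
        "gene": row["gene"],
        "status": row["status"],
        "zero_in_condition_1": float(row["value_1"]) == 0.0,
        "infinite_fold_change": math.isinf(float(row["log2_fold_change"])),
        # float() parses "-nan", so a NaN test statistic can be detected.
        "test_stat_is_nan": math.isnan(float(row["test_stat"])),
    }


for line in (ROW_A, ROW_B):
    print(describe(parse_row(line)))
```

Both rows have zero expression in the first condition and an infinite fold change; they differ in status, and only the second has a NaN test statistic.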

I can't speak to how cuffdiff makes decisions. You'll have to wait for help from their support group, or use a tool you do understand.


The sample count was the same, as was the program version. I think I might have pasted a modified example originally, but the two results should be comparable.

But thanks very much for your help! Accepting and upvoting.
