Forum: How to convincingly illustrate and discuss negative results in NGS
7.8 years ago

Hi everyone,

In statistics, and in science in general, it is always harder to convincingly show a lack of effect than a significant difference. In low-throughput experiments, one can always report p-values and show that the difference is not statistically significant, but the situation is more complex (at least to me) in genome-wide studies. For instance, with RNA-seq expression data, even if there is no biological difference between two conditions, some genes will usually come out as significantly differentially expressed simply because thousands of genes are tested. In such a case, one cannot just say "we didn't see any significant differences between conditions X and Y".
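To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes independent tests, unadjusted p-values and no true effect anywhere, which is a simplification for real expression data; the function names are made up for illustration.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability of one or more false positives (assumes independent tests)."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # 1 test, 20 tests, and a hypothetical genome-wide screen of 20,000 genes
    for n in (1, 20, 20_000):
        print(n, expected_false_positives(n, 0.05), prob_at_least_one(n, 0.05))
```

The gap between the single-test case and the genome-wide case is what makes a bare "nothing was significant" statement hard to carry over from low-throughput experiments.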

How would you illustrate and discuss such a case? Do you have any examples of publications that address this issue?

Here are some ideas:

  • Discuss that there are fewer DEGs between conditions X and Y than between X and Z (where there is an effect that has been biologically confirmed). However, I find this a bit weak.
  • Discuss that there are obviously no global differences between X and Y besides some differences that might be anecdotal.
  • Show an MA plot/volcano plot and let readers decide for themselves.

Best,

Carlo

RNA-seq DEG • 2.8k views

A possible (but maybe also not strongly convincing) way would be to perform GO/KEGG enrichment and conclude that there are "no meaningful" differentially expressed genes.
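For anyone unsure what an enrichment test boils down to, here is a minimal sketch of a one-sided hypergeometric test for a single GO/KEGG term. The function name and the example numbers are hypothetical, and a real analysis would test many terms and therefore need its own multiple-testing correction.

```python
from math import comb


def hypergeom_enrichment_p(n_universe: int, n_in_term: int, n_deg: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) if n_deg genes were drawn at random from the universe."""
    upper = min(n_in_term, n_deg)  # the overlap cannot exceed either set
    total = comb(n_universe, n_deg)
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 tested genes, 200 annotated to the term,
    # 50 DEGs, 2 of which carry the annotation.
    print(hypergeom_enrichment_p(20_000, 200, 50, 2))
```

If none of the tested terms comes out enriched, that supports the "no meaningful DEGs" reading, but only for effects that follow functional annotation.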


Thank you for your input. Yes, that could be an interesting point in some cases. However, sometimes genes are deregulated based on their "genomic features" (position on chromosomes, presence of introns, nearby ncRNAs, ...) rather than their biological function, and there is no functional link (GO/KEGG) between the DEGs even if there is a true effect.


Use a more stringent multiple-testing adjustment so no tests are significant after adjustment. Just kidding, don't do that. :)


Yep, I thought the same thing ^^

More seriously, let's assume that we can't change the stringency because we want to keep the same parameters across the full study (that includes more conditions than X and Y).


If there is no difference in X & Y (and if that has no bearing/influence on the conclusion(s) of the study) why not report the fact as is?


Perhaps you could compare condition X vs X (e.g., use 6 biological replicates for a 3 vs 3 comparison) to demonstrate that variation plus a large number of genes invariably identifies some genes with differential expression. Or you could compare mixed (XY vs XY) or even randomly sampled data to make the same point.
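A toy version of that X vs X comparison could look like the sketch below. It is only an illustration under strong assumptions: simulated Gaussian log-expression values instead of real counts, a plain equal-variance t-test instead of the count-based models RNA-seq tools use, and invented function names and parameters.

```python
import math
import random
import statistics


def t_cdf_df4(t: float) -> float:
    """Closed-form CDF of Student's t distribution with 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - (t * t / u) / 12.0)


def pooled_t_pvalue_3v3(a, b) -> float:
    """Two-sided p-value of an equal-variance t-test for two groups of three."""
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2.0
    se = math.sqrt(pooled_var * (1.0 / 3.0 + 1.0 / 3.0))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return 2.0 * (1.0 - t_cdf_df4(abs(t)))


def null_comparison(n_genes: int = 5000, alpha: float = 0.05, seed: int = 0) -> int:
    """Count genes with p < alpha when all 6 replicates come from one condition."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        # Six simulated replicates of the same condition (assumed mean 8, sd 1)
        values = [rng.gauss(8.0, 1.0) for _ in range(6)]
        if pooled_t_pvalue_3v3(values[:3], values[3:]) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    print(null_comparison())  # number of "DEGs" found with no true difference
```

With real data, the same idea means running the actual pipeline on the 3 vs 3 split of replicates and reporting that count next to the X vs Y count.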


If you start off with the null hypothesis "I will not detect a difference in mRNA expression between these two animals", then multiple-testing adjustment is probably the least of your worries.


Replicate using orthogonal assays, such as qPCR for RNA-Seq. There are much larger issues with RNA-Seq than just this.

7.8 years ago

I think the most straightforward way is the volcano plot. Regarding multiple testing, you can try to illustrate the fact that there are no significant differences by plotting the distribution of p-values and computing the false discovery rate (e.g. with this package).
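To make the p-value distribution and FDR idea concrete, here is a minimal sketch in plain Python. It uses the Benjamini-Hochberg procedure as one common FDR method, which is my choice for illustration and not necessarily what the package mentioned above does; the p-values are simulated, not real data.

```python
import random


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values per equal-width bin on [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(1)
    null_p = [rng.random() for _ in range(10_000)]  # simulated null p-values
    print(pvalue_histogram(null_p))
    print(sum(q < 0.05 for q in benjamini_hochberg(null_p)))
```

A roughly flat histogram is what you expect when nothing is going on; a spike near zero suggests real signal.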

The other idea is to get more samples from Gene Expression Omnibus (GEO) or the Sequence Read Archive (SRA). You can then normalize the RNA-Seq data from your study together with other published datasets, for example using Cuffnorm. The idea is to show that while you do not observe any differences between your conditions, there is still a meaningful difference between your samples and previously published ones, for example up-regulation of known tissue-specific genes (i.e. a sort of positive control).

PS. The biological background of the experiment should play a huge role here: in some setups the biologist would expect to find just a handful of genes to be differentially expressed.

PPS. Increasing the number of replicates and performing gene set enrichment analysis can also help to clarify things a bit. The number of differentially expressed genes chosen under some arbitrary cutoff is not the best measure to quantify global differences between samples.
