On "wings" in volcano plots...
11 months ago
kalavattam ▴ 190

Apparent "wings" in volcano plots (for example, as seen here) are evidence for the relationship between fold change and p-value when expression is low in one condition and there are few replicates. If one wishes to remove or minimize these, that can be done by pre-filtering the counts matrix (for example, excluding rows with partial or complete row sums/means/etc. below some user-defined threshold) and/or performing shrinkage for the effect size estimates.

My main question is this: Is it a problem/mistake to not remove or otherwise minimize "wings"?

Follow-up questions:

  • If it is necessary to address these "wings", is it suitable to do shrinkage without counts-matrix filtering or vice versa?
  • Any advice on setting thresholds for counts-matrix filtering in a non-arbitrary way?
DEG volcano DGE statistics DE

The statistics for these genes are often unreliable. If the difference in mean expression is driven by a few outliers, it is best to filter those genes out; that saves you from spurious calls. Is filtering a problem?

Thanks, no, filtering is not a problem. Do you have a preferred way to go about it? Maybe something as straightforward as this?

#!/usr/bin/env Rscript

# Keep genes whose total count across all samples is at least 10
counts_matrix <- counts_matrix[rowSums(counts_matrix) >= 10, , drop = FALSE]

In reading over things, it seems edgeR::filterByExpr() could be useful here.
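For reference, here is a minimal sketch of how edgeR::filterByExpr() is typically called. The simulated counts, the two-group layout, and the object names (counts_matrix, group, keep, filtered) are assumptions made up for illustration, not anything from a real dataset.

```r
library(edgeR)

set.seed(1)

# Hypothetical data: 1000 genes x 6 samples, 3 replicates per condition.
# Genes 1-500 are simulated with low counts, genes 501-1000 with higher counts.
counts_matrix <- matrix(
    rnbinom(1000 * 6, mu = rep(c(2, 50), each = 500), size = 5),
    nrow = 1000,
    dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:6))
)
group <- factor(rep(c("A", "B"), each = 3))

# filterByExpr() returns a logical vector with one entry per gene;
# the cutoffs are derived from library sizes and the smallest group size
keep <- filterByExpr(counts_matrix, group = group)
filtered <- counts_matrix[keep, , drop = FALSE]

table(keep)
```

Because the cutoff is expressed in counts per million and scaled by the smallest group size, it adapts to sequencing depth and design, which is one way to make the threshold less arbitrary than a fixed row sum.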

Do you recommend doing shrinkage too?

I've found shrinkage mildly difficult to explain to end users and generally forgo it, but providing an lfcThreshold during testing can be really useful for filtering as well.

You could manually filter as you describe or use the edgeR approach, which generally works quite well. If you use DESeq2, its independent filtering sets the adjusted p-value to NA for genes that it considers too lowly expressed to reasonably test (though it feels pretty conservative in what it considers "expressed enough").
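A rough sketch of what testing against an lfcThreshold looks like in DESeq2, using the package's own simulated example data. The dataset size, the betaSD value, and the threshold of 1 (on the log2 scale) are arbitrary choices for illustration.

```r
library(DESeq2)

set.seed(1)

# Simulated example dataset shipped with DESeq2: 1000 genes, 6 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 6, betaSD = 1)
dds <- DESeq(dds)

# Default test: null hypothesis is log2 fold change == 0
res_default <- results(dds)

# Threshold test: null hypothesis is |log2 fold change| <= 1
res_lfc <- results(dds, lfcThreshold = 1, altHypothesis = "greaterAbs")

# Genes whose adjusted p-value was set to NA (e.g. by independent filtering)
sum(is.na(res_default$padj))

# Number of calls at padj < 0.05 under each null hypothesis
sum(res_default$padj < 0.05, na.rm = TRUE)
sum(res_lfc$padj < 0.05, na.rm = TRUE)
```

The threshold test asks whether the fold change is credibly larger than the cutoff, which is a different question from filtering a results table by fold change after the fact.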

Thanks, yeah, we're relying on the independent filtering performed by DESeq2 but still see "wings" in some of our plots.

If that doesn't remove them, try

keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep,]

Change the thresholds of 10 counts in at least 3 samples as desired.
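To see how this per-sample rule differs from a plain row-sum cutoff, here is a small base R sketch on a made-up 4-gene matrix (the gene names and counts are hypothetical, chosen only to show the contrast):

```r
# Hypothetical counts: 4 genes x 6 samples (3 per condition)
counts_matrix <- rbind(
    outlier_gene = c(0, 0, 0, 0, 0, 40),      # one large count, otherwise zero
    low_gene     = c(2, 3, 1, 2, 4, 3),       # uniformly low
    one_sided    = c(0, 0, 1, 12, 15, 11),    # expressed in one condition only
    high_gene    = c(80, 95, 70, 60, 90, 85)  # well expressed everywhere
)

# Rule 1: total count across all samples is at least 10
keep_total <- rowSums(counts_matrix) >= 10

# Rule 2: at least 10 counts in at least 3 samples
keep_samples <- rowSums(counts_matrix >= 10) >= 3

data.frame(keep_total, keep_samples)
```

In this toy case the row-sum rule keeps all four genes, while the per-sample rule drops the single-outlier gene and the uniformly low gene. The gene expressed in only one condition passes both rules, so filtering of this kind targets unreliable low-count genes rather than every gene that could land in a "wing".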

I like filterByExpr as it is automated and generally works rather well. Shrinkage is mainly for visualization and ranking, so it is not critical but nice to have. Since shrinkage is not part of the testing procedure (unless you use lfcShrink with an lfcThreshold), the main filter should be the p-value anyway.
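A minimal sketch of shrinkage with DESeq2's lfcShrink(), again on the package's simulated example data. The choice of type = "normal" is an assumption made only because it needs no extra packages; "apeglm" and "ashr" are the other available estimators and require their respective packages to be installed.

```r
library(DESeq2)

set.seed(1)

# Simulated example dataset shipped with DESeq2: 1000 genes, 6 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 6, betaSD = 1)
dds <- DESeq(dds)

# Unshrunken results
res <- results(dds)

# Shrunken log2 fold changes for the B vs A comparison;
# the coefficient name comes from resultsNames(dds)
res_shrunk <- lfcShrink(dds, coef = "condition_B_vs_A", type = "normal")

# Compare the spread of effect sizes before and after shrinkage
summary(abs(res$log2FoldChange))
summary(abs(res_shrunk$log2FoldChange))
```

Plotting res_shrunk$log2FoldChange instead of res$log2FoldChange on the x-axis of the volcano plot should pull in the extreme, low-count fold changes, while the significance calls still come from the p-values.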
