Question

SNV calling: what is the relative cost of FP and FN?

7.5 years ago

NikTuzov • 0

In relative terms, what is the cost of false-positive (FP) vs false-negative (FN) variant calls? For example, if it's far more important to avoid FPs (I'm not sure that's the case), then one should be able to say something like "I'd rather have K FN cases if it helps me avoid one FP case", where K > 1. What is the value of K?

If there are any papers that discuss this issue explicitly, please let me know.

SNP • 1.7k views


score 0 · Answer 1 · 2016-10-25

7.5 years ago

harold.smith.tarheel ★ 4.9k

The tradeoff between FPs and FNs will vary by dataset and variant-calling pipeline, and can be determined by precision-recall or ROC curves.
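As a minimal sketch of how such a curve is built, assume each candidate call carries a quality score and a truth label from a validation set. The data and the function name `pr_curve` are hypothetical and do not come from any specific variant-calling pipeline.

```python
def pr_curve(calls):
    """Return (threshold, precision, recall) for each distinct score.

    `calls` is a list of (score, is_true_variant) pairs. A call is kept
    when its score is >= the threshold.
    """
    total_true = sum(1 for _, is_true in calls if is_true)
    points = []
    # Walk thresholds from strictest to most permissive.
    for threshold in sorted({score for score, _ in calls}, reverse=True):
        kept = [is_true for score, is_true in calls if score >= threshold]
        tp = sum(kept)            # kept calls that are real variants
        fp = len(kept) - tp       # kept calls that are not
        precision = tp / (tp + fp)
        recall = tp / total_true if total_true else 0.0
        points.append((threshold, precision, recall))
    return points


# Hypothetical scored calls: (quality score, validated as a true variant?)
example_calls = [
    (60, True), (50, True), (40, False),
    (30, True), (20, False), (10, False),
]

for threshold, precision, recall in pr_curve(example_calls):
    print(f"score >= {threshold}: precision={precision:.2f} recall={recall:.2f}")
```

Each printed row is one point on the precision-recall curve; lowering the threshold can only raise recall, usually at the price of precision.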


Measures like that assume that having a single FP is exactly as bad as having a single FN. Is that the case?

Reply • 7.5 years ago by NikTuzov • 0

No, they don't. They indicate the trade-off between TPs and FPs at various thresholds. The 'cost' of identifying the first 10% of TPs may be only 0.01% FPs, whereas the cost of the last 1% may be >90% FPs.

And your tolerance of that cost depends upon your application. If you're using high-density SNPs for mapping, then recalling 10% of the TPs with few FPs may be more than sufficient. If you're identifying candidate mutations in an inbred system (e.g., C. elegans) that are easy to validate by independent criteria (such as RNAi), then it's far more important to maximize the number of TPs for validation at the expense of FPs.
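To make the application-dependent tolerance concrete, here is a hedged sketch that picks a score threshold by minimizing a weighted cost `k * FP + FN`, where `k` plays the role of the K in the question. The cost function, the function name `best_threshold`, and the scored calls are all illustrative assumptions, not an established method from any variant caller.

```python
def best_threshold(calls, k):
    """Return (threshold, cost, fp, fn) minimizing k * FP + FN.

    `calls` is a list of (score, is_true_variant) pairs; calls with
    score >= threshold are kept. `k` is the assumed cost of one FP
    relative to one FN.
    """
    total_true = sum(1 for _, is_true in calls if is_true)
    best = None
    for threshold in sorted({score for score, _ in calls}):
        tp = sum(1 for s, is_true in calls if s >= threshold and is_true)
        fp = sum(1 for s, is_true in calls if s >= threshold and not is_true)
        fn = total_true - tp      # true variants filtered out
        cost = k * fp + fn
        # Strict comparison: ties keep the more permissive threshold.
        if best is None or cost < best[1]:
            best = (threshold, cost, fp, fn)
    return best


# Hypothetical scored calls: (quality score, validated as a true variant?)
scored_calls = [
    (60, True), (50, True), (40, False),
    (30, True), (20, False), (10, False),
]

# FPs are expensive (k = 5): prefer a strict threshold.
print(best_threshold(scored_calls, k=5))
# FPs are cheap to rule out (k = 0.2): prefer a permissive threshold.
print(best_threshold(scored_calls, k=0.2))
```

With a large `k` the chosen threshold moves up and true variants are sacrificed to avoid FPs; with a small `k`, as in the easy-validation scenario, it moves down to keep more TPs.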

Reply • 7.5 years ago by harold.smith.tarheel ★ 4.9k