SNV calling: what is the relative cost of FP and FN?
7.5 years ago
NikTuzov • 0

In relative terms, what is the cost of FP vs FN variants? E.g. if it's far more important to avoid FP (I'm not sure it's the case), then one should be able to say something like "I'd rather have K FN cases if it helps me avoid one FP case", where K > 1. What is the value of K?

If there are any papers that discuss this issue explicitly, please let me know.

SNP • 1.7k views
7.5 years ago

The tradeoff between FPs and FNs will vary by dataset and variant-calling pipeline, and can be determined by precision-recall or ROC curves.
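
For illustration, here is a minimal sketch of how such a precision-recall curve could be computed from calls that have been labelled against a truth set. The function name, the `(score, is_true)` input layout, and the toy numbers are assumptions made for this example, not part of any particular pipeline; it also assumes every call has a distinct quality score.

```python
def precision_recall_points(calls, n_true):
    """Return (threshold, precision, recall) for each quality threshold.

    calls  : list of (quality_score, is_true_variant) pairs (hypothetical layout)
    n_true : number of true variants in the truth set, called or not
    """
    points = []
    tp = fp = 0
    # Walk from the most to the least confident call, which is the same as
    # lowering the quality threshold one call at a time.
    for score, is_true in sorted(calls, key=lambda c: c[0], reverse=True):
        if is_true:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / n_true
        points.append((score, precision, recall))
    return points


# Toy data: five calls, four true variants in total (one is never called).
toy_calls = [(60, True), (50, True), (40, False), (30, True), (20, False)]
pr_points = precision_recall_points(toy_calls, n_true=4)
for threshold, precision, recall in pr_points:
    print(f"Q>={threshold}: precision={precision:.2f} recall={recall:.2f}")
```

Each row shows what lowering the threshold gains in recall and what it costs in precision.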

Measures like that assume that having a single FP is exactly as bad as having a single FN. Is that the case?

No, they don't. They indicate the TP/FP ratios at various thresholds. The 'cost' of identifying the first 10% of TPs may be only 0.01% FPs, whereas the cost of the last 1% may be >90% FPs.

And your tolerance of that cost depends upon your application. If you're using high-density SNPs for mapping, then recalling 10% of the TPs with few FPs may be more than sufficient. If you're identifying candidate mutations in an inbred system (e.g., C. elegans) that are easy to validate by independent criteria (such as RNAi), then it's far more important to maximize the number of TPs for validation at the expense of FPs.
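
One way to make that application-specific tolerance concrete is to pick your own K and see which threshold it implies. The sketch below assumes a simple linear cost, K * FP + FN, meaning one FP is as bad as K FNs; that cost model, the function name, and the toy data are assumptions for illustration and are not taken from any paper or tool.

```python
def cheapest_threshold(calls, n_true, k):
    """Return (threshold, cost) minimising k * FP + FN.

    calls  : list of (quality_score, is_true_variant) pairs (hypothetical layout)
    n_true : number of true variants in the truth set
    k      : how many FNs you would accept to avoid one FP (assumed linear cost)
    A threshold of None means "call nothing at all".
    """
    # Baseline: no calls made, so every true variant is a false negative.
    best_threshold, best_cost = None, float(n_true)
    tp = fp = 0
    for score, is_true in sorted(calls, key=lambda c: c[0], reverse=True):
        if is_true:
            tp += 1
        else:
            fp += 1
        cost = k * fp + (n_true - tp)  # FN = true variants not yet recovered
        if cost < best_cost:
            best_threshold, best_cost = score, cost
    return best_threshold, best_cost


# Toy data: five calls, four true variants in total (one is never called).
example_calls = [(60, True), (50, True), (40, False), (30, True), (20, False)]

# FP-averse setting (K = 5): stop before the first false positive.
strict = cheapest_threshold(example_calls, n_true=4, k=5)
# Validation-friendly setting (K = 0.2): accept a FP to recover another TP.
lenient = cheapest_threshold(example_calls, n_true=4, k=0.2)
print(strict[0], lenient[0])
```

With the same calls, the FP-averse setting stops at a higher threshold than the validation-friendly one, so K is something you choose for your application rather than a constant of the data.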
