Question

What is more important for Structural Variant Calling? Recall or Precision?

23 days ago

Luke • 0

I am relatively new to structural variant calling so I apologise in advance if I seem unaware of certain things.

I am currently trying to call structural variants with multiple callers (Manta, smoove, and GRIDSS) and merging their individual outputs using SURVIVOR. I am then comparing the merged output against a truth set called HG002_v0.6.

I want to find out whether any structural variants in the samples I am calling have a stronger link to, or impact on, certain genes than others, while also filtering out as many false positives as possible.

I have also tried filtering the individual VCF files using the annotations added by duphold (with the filtering recommended on its GitHub page), keeping only PASS calls, and removing structural variants shorter than 50 bp.
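To make the PASS and 50 bp steps concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the exact commands used: the function names are hypothetical, SV length is taken from the `SVLEN` INFO field with a fallback to `END - POS`, and records with no resolvable length (for example BND breakends) are kept rather than dropped. The duphold-based filtering is not reproduced here.

```python
import re

def sv_length(info, pos):
    """Return the absolute SV length from SVLEN, else END - POS, else None."""
    m = re.search(r"(?:^|;)SVLEN=(-?\d+)", info)
    if m:
        return abs(int(m.group(1)))
    m = re.search(r"(?:^|;)END=(\d+)", info)
    if m:
        return abs(int(m.group(1)) - pos)
    return None  # e.g. BND records that carry neither field

def keep_record(line, min_len=50):
    """True if a VCF data line is PASS and at least min_len bp long."""
    fields = line.rstrip("\n").split("\t")
    pos, filt, info = int(fields[1]), fields[6], fields[7]
    if filt != "PASS":  # assumption: "." (unfiltered) is treated as not PASS
        return False
    length = sv_length(info, pos)
    # Assumption: keep records whose length cannot be determined.
    return length is None or length >= min_len

def filter_vcf_lines(lines, min_len=50):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_record(line, min_len):
            yield line
```

`filter_vcf_lines` accepts any iterable of lines, so it can be fed an open uncompressed VCF file handle and the result written straight to a new file.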

For GRIDSS, I have also annotated the calls using the simple_event_annotation.R script supplied in the GRIDSS GitHub repository, and used SVTyper for genotyping so that the GRIDSS VCF file can be merged with SURVIVOR.

In your opinion, which metric is more important for judging whether the structural variant calling is tuned properly: a higher recall or a higher precision? Currently I am achieving ~15-20% recall and >90% precision.
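For reference, here is a minimal sketch of how the two metrics relate to benchmark counts. The function name is hypothetical, and the counts in the example are made-up values chosen only to fall in the range quoted above; they are not results from this analysis. F1 is included simply as the usual single-number summary of both metrics.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from benchmark counts.

    tp: calls that match a truth-set variant
    fp: calls with no match in the truth set
    fn: truth-set variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: 1000 truth-set SVs, 200 calls, 180 of them correct.
p, r, f1 = precision_recall_f1(tp=180, fp=20, fn=820)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
# prints: precision=0.90 recall=0.18 F1=0.30
```

With counts like these, the high precision says that most reported calls are real, while the low recall says that most truth-set variants are never reported at all.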

structural-variants merging precision-recall