CNV evaluate confidence
17 months ago
Daria ▴ 10

Hello! I am calling CNVs with the Circular Binary Segmentation (CBS) algorithm, following this article: https://jeremy9959.net/Blog/cbs-fixed/. How can I evaluate the quality of the CNVs I find? Are there any metrics, and formulas for them? So far I have found the derivative log ratio spread (DLRS) and the Z-score.
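For reference, DLRS is usually described as the spread of the differences between adjacent log2 ratios, scaled by sqrt(2) so that it estimates the per-probe noise. Below is a minimal sketch under that assumption. The function name `dlrs` is hypothetical, and some tools use a robust IQR-based estimate instead of the plain standard deviation used here.

```python
import math
from statistics import stdev

def dlrs(log_ratios):
    """Derivative log ratio spread: stdev of adjacent differences / sqrt(2).

    Assumes log_ratios is ordered by genomic position along one sample.
    """
    if len(log_ratios) < 3:
        raise ValueError("need at least 3 log ratios")
    # Differences between consecutive probes/bins cancel out slow trends
    # and segment-level shifts, leaving mostly probe-to-probe noise.
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    return stdev(diffs) / math.sqrt(2)

if __name__ == "__main__":
    print(dlrs([0.0, 0.1, -0.1, 0.05, 0.0, -0.05]))
```

A lower value means a less noisy profile. Any pass/fail cutoff depends on the platform and is not implied by this sketch.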

Confidence CNVs CBS • 497 views
17 months ago
cmdcolin ★ 3.8k

There are some simple statistics defined by duphold: https://github.com/brentp/duphold

Quote from the README:


    DHFC: fold-change for the variant depth relative to the rest of the chromosome the variant was found on
    DHBFC: fold-change for the variant depth relative to bins in the genome with similar GC-content.
    DHFFC: fold-change for the variant depth relative to Flanking regions.

It also adds GCF to the INFO field indicating the fraction of G or C bases in the variant.

After annotating with duphold, a sensible way to filter to high-quality variants is:

    bcftools view -i '(SVTYPE = "DEL" & FMT/DHFFC[0] < 0.7) | (SVTYPE = "DUP" & FMT/DHBFC[0] > 1.3)' $svvcf

In our evaluations, DHFFC works best for deletions and DHBFC works slightly better for duplications. For genomes/samples with more variable coverage, DHFFC should be the most reliable.
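To make the flanking fold-change idea concrete, here is a rough sketch of how such a ratio could be computed from a per-base depth list, together with the same thresholds as the bcftools expression quoted above. This is not duphold's implementation. The function names, the use of medians, and the 1000 bp default flank are all assumptions made for illustration.

```python
from statistics import median

def flank_fold_change(depth, start, end, flank=1000):
    """Median depth in [start, end) divided by median depth of both flanks.

    depth is a per-base coverage list for one chromosome (hypothetical input).
    """
    inside = list(depth[start:end])
    left = list(depth[max(0, start - flank):start])
    right = list(depth[end:end + flank])
    flanks = left + right
    if not inside or not flanks:
        raise ValueError("empty variant interval or no flanking bases")
    flank_median = median(flanks)
    if flank_median == 0:
        raise ValueError("flanking depth is zero")
    return median(inside) / flank_median

def passes_filter(svtype, dhffc, dhbfc):
    """Mirror the quoted filter: DEL needs DHFFC < 0.7, DUP needs DHBFC > 1.3."""
    if svtype == "DEL":
        return dhffc < 0.7
    if svtype == "DUP":
        return dhbfc > 1.3
    return False  # other SV types are not selected by that expression

if __name__ == "__main__":
    # Toy example: 30x coverage with a 50 bp region at 15x.
    depth = [30] * 100 + [15] * 50 + [30] * 100
    fc = flank_fold_change(depth, 100, 150, flank=50)
    print(fc, passes_filter("DEL", fc, 1.0))
```

In the toy example the region sits at half the flanking depth, which is the pattern a heterozygous deletion would be expected to show in a diploid sample.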