Recommendations for CNV calling algorithms/programs to benchmark
6.2 years ago
seraphya • 0

I want to benchmark the time and resource use, and the breakpoint accuracy, of CNV callers on NGS data from multiple individuals at 30x-100x coverage, looking at specific CNVs.

A few questions I have:

Is there a specific read mapper I should use to create the BAM file, or should I try a few for each SV/CNV caller?

Are there any specific algorithms/programs I should test? The only one I know I will test for sure is CNVnator as that is what has been used until now.

Anything else I should consider?

CNV Assembly • 4.6k views

Hi, I am doing a very similar project, comparing algorithms for detecting CNVs. As my standard practice I use the BWA aligner. For CNV detection I have had good results with oncoCNV, CNVkit and Pindel. I have not tried CNVnator; if you can share your experience with it, everybody would appreciate it.

6.2 years ago
Garan ▴ 690

I'm guessing you're after germline CNV callers since you've mentioned CNVnator. I've included some suggestions below for read-depth based callers including ExomeDepth which is the one I've used the most (reasonably easy to use since it's an R package). I'd have a look at Ximmer if you're interested in comparing CNV callers since it provides a standardised framework for comparing callers out of the box.

I guess you could try read-depth callers; split-read callers, which look for breakpoints (although these are more for WGS than targeted panels / exomes); or read-pair callers, which look for missing / moved mate pairs. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4394692/ There are also assembly-based callers and callers that use a combination of the above techniques. The approach you use will also depend on the library and the size of CNV you're looking for.

Generally I've only compared CNV callers after a pipeline that uses BWA for alignment and then GATK best practices, since a couple of the callers actually use parts of the GATK suite (like XHMM). Some CNV callers like CANVAS (https://github.com/Illumina/canvas) are optimised for their own workflow (in this case ISAAC).
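For the time / resource side of the benchmark, one option is to run every caller on the same BAM through a small wrapper that records wall time, CPU time and peak memory. The sketch below is only an illustration: `run_and_measure` is a name I made up, the command in the example is a placeholder rather than a real caller invocation, and it relies on Python's `resource` module, so it assumes a Unix-like system.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run one command and report wall time, CPU time and peak memory of child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    proc = subprocess.run(cmd, check=False)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "returncode": proc.returncode,
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        # ru_maxrss is the largest child seen so far by this Python process
        # (kilobytes on Linux, bytes on macOS), so start a fresh Python
        # process for each caller you want to measure.
        "max_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    # Placeholder command (assumption); substitute the real caller command line.
    stats = run_and_measure([sys.executable, "-c", "print('caller finished')"])
    print(stats)
```

Running each caller several times and reporting the spread is probably worth it, since disk caching can change the wall time between the first and later runs.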

Read-Depth based callers

BCFTools https://samtools.github.io/bcftools/howtos/cnv-calling.html

CoNIFER http://conifer.sourceforge.net/ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409265/

ExomeDepth https://cran.r-project.org/web/packages/ExomeDepth/index.html http://plagnol-lab.blogspot.co.uk/2013/11/faq-and-clarifications-for-exomedepth-r.html

XHMM http://atgu.mgh.harvard.edu/xhmm/tutorial.shtml http://www.cell.com/ajhg/fulltext/S0002-9297(12)00417-X XHMM was used by ExAC to call their CNVs

Read-pair

ULYSSES https://github.com/gillet/ulysses

Breakdancer https://github.com/genome/breakdancer

Split-read

Pindel https://github.com/genome/pindel

Frameworks

Ximmer https://github.com/ssadedin/ximmer https://www.biorxiv.org/content/early/2018/02/06/260927 Framework for running multiple CNV callers together and calculating sensitivity etc. Comes with ExomeDepth, XHMM, cn.MOPS and CoNIFER

GATK4 germline CNV caller https://software.broadinstitute.org/gatk/best-practices/workflow?id=11148 Not sure if this is available yet but should be ready soon - ideal if you want a full GATK best practice pipeline

I've mainly used ExomeDepth on targeted panels and found it okay with some tweaks and heavy filtering for false positives.
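If you end up scoring callers yourself rather than through a framework, the comparison against a truth set can be sketched roughly as below. This is a minimal illustration and not how Ximmer or any particular caller does it: the `CNV` class, the function names and the 50% reciprocal-overlap threshold are all my own assumptions, and the coordinates in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    kind: str   # "DEL" or "DUP"


def reciprocal_overlap(a, b):
    """Shared length as a fraction of the longer of the two events (0.0 if none)."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def evaluate(calls, truth, min_overlap=0.5):
    """Return (sensitivity, precision, breakpoint errors) for one caller."""
    matched_calls, matched_truth, breakpoint_errors = set(), set(), []
    for ci, call in enumerate(calls):
        # Pair each call with the truth event it overlaps best.
        best, best_ti = 0.0, None
        for ti, known in enumerate(truth):
            overlap = reciprocal_overlap(call, known)
            if overlap > best:
                best, best_ti = overlap, ti
        if best_ti is not None and best >= min_overlap:
            matched_calls.add(ci)
            matched_truth.add(best_ti)
            known = truth[best_ti]
            # Distance in bp between called and true start / end.
            breakpoint_errors.append(
                (abs(call.start - known.start), abs(call.end - known.end))
            )
    sensitivity = len(matched_truth) / len(truth) if truth else 0.0
    precision = len(matched_calls) / len(calls) if calls else 0.0
    return sensitivity, precision, breakpoint_errors


if __name__ == "__main__":
    truth = [CNV("chr1", 1000, 5000, "DEL"), CNV("chr2", 20000, 30000, "DUP")]
    calls = [CNV("chr1", 1200, 5100, "DEL"), CNV("chr3", 100, 900, "DEL")]
    print(evaluate(calls, truth))  # (0.5, 0.5, [(200, 100)])
```

The overlap threshold matters a lot for read-depth callers on exomes or panels, where the called boundaries can only be as precise as the target intervals, so it is worth reporting results at more than one threshold.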


Can Breakdancer be applied to paired-end whole-exome sequencing data?


GATK4 GermlineCNVCaller is available now. It'd be great to see how it stacks up against some of the older methods out there.
