Question

What Are The 'Copy Number Detection' Tools Out There For Exome Capture NGS Data?

12.5 years ago

Prateek ★ 1.0k

Do you know of any CNV detection tools for paired-end exome NGS data, using either the coverage (window-based) method or the paired-end mapping (clustering-based) method? I am aware it's a tough problem to solve. I have looked at some tools for whole-genome data but couldn't find one for exome data.

I would also welcome discussion of how existing tools could be repurposed for exome data through post-processing (for example, ignoring exon boundaries).

Finally, please feel free to point out tools for structural variants (inversions, translocations etc.) too.

cnv copynumber next-gen sequencing variant structural • 23k views

Ram · Answer 1 · 2011-11-16

Take a look at the supplementary information from the 1000G paper located here.

They use something like 15-17 algorithms, including read-pair analysis (RP), read-depth analysis (RD), split-read analysis (SR), and sequence assembly (AS).

Those are broken down in Tables 2A and 2B of the supplement.

In brief:

Read depth: Event-wise testing, CNVnator

Read pair: Spanner, PEMer, BreakDancer

Split read: Mosaik, Pindel

PD read pair/read depth: Spanner, Genome STRiP

There is also a 1000 Genomes tutorial on structural variants by Jan Korbel:

Video:

Slides: http://www.genome.gov/Pages/Research/DER/1000GenomesProjectTutorials/StructuralVariants-JanKorbel.pdf

A bit dated, but it can get you started.

Ram · Answer 2 · 2012-07-31

11.8 years ago

SBinson ▴ 110

cn.MOPS works well for this task.

Another vote for cn.mops. I wish I had known about it when I supplied my original answer.

Reply written 11.2 years ago by User 59 13k

And another vote for cn.mops. A big bonus for me is that it even works with small datasets (5-7 samples).

Reply written 11.0 years ago by ron_veg ▴ 50

cn.mops performed very well for detecting CNVs in free circulating cancer DNA.

Reply written 10.3 years ago by okko.clevert ▴ 240

cn.mops works very well for analyzing exome sequencing data from cancer genomes.

Reply written 10.3 years ago by sepp.hochreiter • 0

I've had great luck with cn.mops, and it's relatively easy to use, even for an R newbie. Also, Günter (the software's author) is very helpful and responsive!

Reply written 9.4 years ago by steven_friedenberg ▴ 10

Ram · Answer 3 · 2011-11-02

12.5 years ago

Chris Miller 22k

VarScan is one such program (see "VarScan, Using The Copycaller").
ExomeCNV is another: http://bioinformatics.oxfordjournals.org/content/early/2011/08/09/bioinformatics.btr462

score 8 · Answer 4 · 2011-11-02

There are several strategies for finding structural variants (SVs) in genomic or exome NGS data. First, with paired-end data, you can examine the distribution of insert sizes between read pairs and infer SVs from pairs with unusual insert sizes. Second, you can scan the genome or exome for regions with unusually high or low coverage. This is the only approach that lets you estimate copy number (I don't know how accurate that is). Third, you can use reads that are split during mapping, since these may span SV breakpoints. Finally, de novo assembly followed by traditional comparative genomics approaches can also help with SV discovery. Of course, you can combine all of these approaches and keep the candidates with the highest confidence.
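To make the first strategy concrete, here is a minimal sketch (not taken from any of the tools mentioned in this thread) of flagging read pairs with unusual insert sizes. The function name `flag_discordant_pairs`, the `n_mads` cutoff, and the toy insert sizes are all assumptions made for illustration. Real callers also cluster the flagged pairs by position and orientation before reporting an SV.

```python
from statistics import median


def flag_discordant_pairs(insert_sizes, n_mads=5.0):
    """Return indices of read pairs whose insert size is far from the library median.

    Hypothetical helper: the cutoff of 5 scaled MADs is an arbitrary choice.
    """
    med = median(insert_sizes)
    # Median absolute deviation (MAD) is robust to the outliers we want to find.
    mad = median(abs(x - med) for x in insert_sizes)
    # 1.4826 rescales the MAD to be comparable to a standard deviation
    # for normally distributed data; fall back to 1.0 if the MAD is zero.
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [i for i, x in enumerate(insert_sizes) if abs(x - med) / scale > n_mads]


# Toy library with a ~300 bp insert size and two suspicious pairs.
sizes = [298, 305, 310, 295, 302, 300, 2500, 299, 304, 40]
discordant = flag_discordant_pairs(sizes)
print(discordant)  # [6, 9]
```

Pairs that map much farther apart than expected are consistent with a deletion between the mates, and pairs that map much closer together are consistent with an insertion. With exome data, few pairs span any given breakpoint, so this signal is sparse.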

I have heard that CNVnator is a pretty good coverage-based tool for genomic data, but I'm not sure whether it will perform well on exome data. Considering the size and distribution of exons, the split-read method seems attractive. My personal experience is with a genomic data set: we assembled the genomic reads de novo, used a traditional tool such as MUMmer to identify the SVs, and verified them with coverage-based approaches. It worked quite well, but I don't know how de novo assembly would perform for exome data (I have heard that the Trinity pipeline is emerging as a good tool for de novo assembly of transcriptome or exome data).
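For the coverage-based strategy on exome data, one common idea is to compare per-exon read depth in a test sample against a reference sample instead of sliding a window along the genome. The sketch below only illustrates that idea and is not how CNVnator or any other tool named here is implemented. The function names, the pseudocount, the gain/loss thresholds, and the toy depths are assumptions.

```python
import math
from statistics import median


def log2_ratios(test_depths, ref_depths, pseudocount=0.5):
    """Median-centered log2(test/reference) depth ratio for each capture target."""
    raw = [
        math.log2((t + pseudocount) / (r + pseudocount))
        for t, r in zip(test_depths, ref_depths)
    ]
    # Centering on the median removes the overall sequencing-depth difference,
    # assuming that most targets are copy-number neutral.
    center = median(raw)
    return [x - center for x in raw]


def call_targets(ratios, gain=0.4, loss=-0.6):
    """Label each target using hypothetical thresholds for a diploid sample."""
    return ["gain" if x > gain else "loss" if x < loss else "neutral" for x in ratios]


# Toy example: the test sample is sequenced about twice as deep as the
# reference, and the third exon has lost one of its two copies.
ref = [100, 200, 150, 80, 120]
test = [200, 400, 150, 160, 240]
ratios = log2_ratios(test, ref)
calls = call_targets(ratios)
print(calls)  # ['neutral', 'neutral', 'loss', 'neutral', 'neutral']
```

A single-copy loss in a diploid sample gives a ratio near log2(1/2) = -1, and a single-copy gain gives a ratio near log2(3/2), roughly 0.58. Capture efficiency varies a lot between exons, so in practice the reference is usually built from several samples processed the same way.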

There is a nice review in Nature Reviews Genetics that covers everything I mentioned and much more: http://www.nature.com/nrg/journal/v12/n5/full/nrg2958.html

score 5 · Answer 5 · 2011-11-03

Another vote for ExomeCNV. There's also CNAseg and CNV-seq (although I'm not sure how appropriate they are for exome data). I've also seen CNVnator mentioned on SEQanswers in relation to this question, but I think Chris's point about variable depth means this is certainly a trickier proposition than for WGS.

EDIT:

I've also just seen an abstract for another Bioconductor package, exomeCopy, which is based on an HMM approach.

score 3 · Answer 6 · 2014-11-27

Are you looking for CNVs in a population, or disease-causing copy number alterations in individual tumor or constitutional samples?

For the former, most of the answers already posted here, including cn.MOPS, will do.

For the latter, particularly tumor samples, CNVkit is a program I wrote recently that performs well.

There are lots of these tools tailored for slightly different purposes, and it's a good idea to look for recent papers that independently benchmark several of them at once.

score 2 · Answer 7 · 2013-04-16

11.1 years ago

fromer ▴ 20

We've written the XHMM software for calling CNVs from exomes: http://atgu.mgh.harvard.edu/xhmm/

Our paper describing this was published last year in AJHG: http://www.cell.com/AJHG/abstract/S0002-9297%2812%2900417-X

Can you say how many BAM files would be required to have reliable calling?

Reply written 11.1 years ago by Ryan D ★ 3.4k

Ram · Answer 8 · 2014-11-26

9.4 years ago

chongchu.cs ▴ 10

We have a tool for genotyping insertions and deletions from WGS data: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0113324

I'd be interested in testing your algorithm, but will it also work for exome data? If not, I'll look at the source code to see how it works.

By the way, I'm analysing wildtype/tumor samples.

Reply written 8.6 years ago by djtilyon • 0