General Question: Human CNV/Structural Variant Algorithms Using Next-Gen Data Cannot Reach Consensus
9.1 years ago
Bioscientist ★ 1.7k

This is a general question about the human CNV/structural variant field (with next-gen sequencing data, NOT arrays).

As shown in the 1000 Genomes Project, different groups have developed different algorithmic approaches to identify structural variants (mainly three: paired-end, read-depth and split-read).

However, the results from these approaches barely overlap with each other (of course they have different strengths; split-read, say, is powerful for small indels), and the false positive rate seems quite high (or we simply don't know the false positive rate, because there is no alternative approach to validate small structural variants the way array CGH is used for large ones).
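To make "barely overlap" concrete, here is a minimal sketch of how two call sets might be compared using reciprocal overlap. The `SVCall` record, the function names and the 50% threshold are illustrative assumptions, not any caller's actual output format and not necessarily the criterion used by the 1000 Genomes Project.

```python
from typing import List, NamedTuple


class SVCall(NamedTuple):
    """Hypothetical minimal SV record (half-open coordinates)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    svtype: str  # e.g. "DEL", "DUP"


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if the calls do not overlap."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def concordant_calls(set_a: List[SVCall], set_b: List[SVCall],
                     min_ro: float = 0.5) -> List[SVCall]:
    """Calls in set_a matched by at least one call in set_b at >= min_ro reciprocal overlap."""
    return [a for a in set_a
            if any(reciprocal_overlap(a, b) >= min_ro for b in set_b)]
```

This brute-force comparison is quadratic in the number of calls, which is fine for a quick sanity check. Note that a strict reciprocal threshold will by itself penalize callers with imprecise breakpoints, so part of the apparent disagreement may come from the matching rule rather than from the calls.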

Put simply, I don't trust even the mainstream, widely used tools like BreakDancer and CNVnator (I have relatively more confidence only in Pindel, because it provides nucleotide-resolution breakpoints). Do you trust them?

If not, then what should we do? Carry out some post-processing or filtering to reduce the potential false positives? For example, adjust the read-depth threshold for read-depth-based approaches, or restrict attention to calls supported by uniquely mapping discordant read pairs for paired-end-based approaches?
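The kind of post-filter I have in mind could look roughly like the sketch below, written for deletion candidates. Everything in it is an assumption for illustration: the `CandidateSV` record, the use of mapping quality as a proxy for unique mapping, and the threshold values, none of which come from BreakDancer, CNVnator or Pindel.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateSV:
    """Hypothetical deletion candidate with its supporting evidence."""
    name: str
    support_mapqs: List[int]  # MAPQ of each discordant read pair supporting the call
    depth_ratio: float        # observed / expected read depth across the region


def passes_filters(call: CandidateSV,
                   min_mapq: int = 30,
                   min_unique_pairs: int = 4,
                   max_del_depth_ratio: float = 0.7) -> bool:
    """Keep a deletion only if both the paired-end and read-depth evidence agree."""
    # Count supporting pairs that map confidently (a proxy for unique mapping).
    unique_pairs = sum(1 for q in call.support_mapqs if q >= min_mapq)
    # A real deletion should also show reduced depth relative to expectation.
    return unique_pairs >= min_unique_pairs and call.depth_ratio <= max_del_depth_ratio


def filter_calls(calls: List[CandidateSV]) -> List[CandidateSV]:
    """Apply the default filters to a list of candidates."""
    return [c for c in calls if passes_filters(c)]
```

Requiring both signals trades sensitivity for specificity, so calls in repetitive regions, where few pairs map uniquely, would mostly be dropped.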

Or do we need to develop our own code for our specific research? What software do you use (say, CNVnator or BreakDancer)?

Personally, I would say that once sequencing is powerful enough to produce accurate, sufficiently long reads, we can say goodbye to these mapping-based methods, because we could simply assemble all the reads without the problems caused by repetitive sequences in the human genome.

cnv structural next-gen sequencing • 2.7k views
9.1 years ago
Dm Church ▴ 30

Calling structural variants is indeed challenging, and software is being developed and tweaked all the time. This is why dbVar tries to capture (as best it can) the experimental evidence that went into the variant calls. The repository also allows for collections of studies, so you can start doing meta-analyses and comparisons of different studies and methods.


Thanks. But about dbVar: I think a comparison with dbVar rests on the hypothesis that human structural variants are mostly common SVs, so that we expect the SVs we identify to be found in the database. What if most of our SVs are rare ones? Has this hypothesis been proven?

