Question: Human CNV/Structural Variant Calling Algorithms Using Next-Gen Data Cannot Reach Consensus
Bioscientist wrote, 7.5 years ago:

This is a general question about the human CNV/structural variant field (using next-gen data, NOT arrays).

As shown in the 1000 Genomes Project, different groups have developed different algorithmic approaches to identify structural variants (mainly three strategies: paired-end, read-depth and split-read).

However, results from these approaches barely overlap with each other (of course they have different strengths; split-read, for example, is powerful for small indels), and the false-positive rate seems quite high (or we simply don't know the false-positive rate, because we cannot use an orthogonal approach to validate small structural variants the way we use array CGH for large ones).
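
To make the "barely overlap" point concrete, here is a minimal sketch (in Python) of the 50% reciprocal-overlap criterion often used when comparing SV call sets from two tools; the call coordinates below are invented placeholders, not real data.

```python
# Minimal sketch: 50% reciprocal overlap between two hypothetical CNV call sets.
# Calls are (chrom, start, end) tuples; coordinates below are invented examples.

def reciprocal_overlap(a, b, min_frac=0.5):
    """True if calls a and b overlap by >= min_frac of BOTH their lengths."""
    if a[0] != b[0]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return (overlap / (a[2] - a[1]) >= min_frac and
            overlap / (b[2] - b[1]) >= min_frac)

# Hypothetical calls from a read-depth tool and a paired-end tool
read_depth_calls = [("chr1", 100_000, 150_000), ("chr2", 500_000, 520_000)]
paired_end_calls = [("chr1", 110_000, 145_000), ("chr7", 30_000, 60_000)]

shared = [a for a in read_depth_calls
          if any(reciprocal_overlap(a, b) for b in paired_end_calls)]
print(f"{len(shared)}/{len(read_depth_calls)} read-depth calls confirmed by paired-end")
```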

In simple words, I don't trust even the mainstream, widely used tools like BreakDancer and CNVnator (I have somewhat more confidence in Pindel, because it provides nucleotide-resolution breakpoints). Do you trust them?

If not, then what should we do? Carry out some post-processing or filtering to reduce the potential false positives? For example, adjust the read-depth threshold for read-depth-based approaches, or restrict our attention to calls supported by uniquely mapping discordant read pairs for paired-end-based approaches?
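
As an illustration of the second idea, here is a rough sketch using pysam that counts high-mapping-quality discordant read pairs over a candidate deletion and keeps the call only if enough such pairs support it; the BAM path, region and thresholds are assumptions for illustration, not a validated pipeline.

```python
# Rough sketch: keep a candidate deletion only if it is supported by enough
# uniquely mapping (high-MAPQ) discordant read pairs. File name, region and
# thresholds are hypothetical.
import pysam

def count_discordant_support(bam_path, chrom, start, end,
                             min_mapq=30, min_insert=1000):
    """Count reads with MAPQ >= min_mapq whose pair is discordant
    (not a proper pair, or with an unusually large insert size)."""
    n = 0
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam.fetch(chrom, start, end):
            if read.is_unmapped or read.mate_is_unmapped or read.is_duplicate:
                continue
            if read.mapping_quality < min_mapq:
                continue  # skip ambiguously mapped reads
            if not read.is_proper_pair or abs(read.template_length) > min_insert:
                n += 1
    return n

# Hypothetical candidate call from a paired-end caller
support = count_discordant_support("sample.bam", "chr1", 1_200_000, 1_250_000)
if support >= 4:      # arbitrary cutoff; tune on your own data
    print(f"keep call: {support} supporting discordant reads")
else:
    print(f"filter call: only {support} supporting discordant reads")
```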

Or do we need to develop our own code for our specific research? What software do you use (say, CNVnator or BreakDancer)?

Personally, I would say that once sequencing is powerful enough to accurately produce sufficiently long reads, we can say goodbye to these mapping-based methods, because we will simply be able to assemble all the reads, without the problems caused by repetitive sequences in the human genome.

Dm Church wrote, 7.5 years ago:

Calling structural variants is indeed challenging, and the software is being developed and tweaked all the time. This is why dbVar (http://www.ncbi.nlm.nih.gov/dbvar) tries to capture, as best as it can, the experimental evidence that went into the variant calls. The repository also allows for collections of studies, so you can start doing meta-analyses and comparisons of different studies and methods.
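
As a rough illustration of such a comparison, here is a sketch that checks what fraction of one's own calls overlap regions exported from dbVar; the file names, the chrom/start/end column layout and the simple any-overlap criterion are assumptions for illustration only.

```python
# Sketch: what fraction of our SV calls are already catalogued in dbVar?
# "dbvar_regions.tsv" stands in for a tab-delimited export from dbVar with
# chrom/start/end columns; both file names are hypothetical.
from collections import defaultdict
import csv

def load_regions(path):
    by_chrom = defaultdict(list)
    with open(path) as fh:
        for chrom, start, end in csv.reader(fh, delimiter="\t"):
            by_chrom[chrom].append((int(start), int(end)))
    return by_chrom

def overlaps_any(call, regions):
    chrom, start, end = call
    return any(s < end and start < e for s, e in regions.get(chrom, []))

dbvar = load_regions("dbvar_regions.tsv")
our_calls = load_regions("our_calls.tsv")

calls = [(c, s, e) for c, ivals in our_calls.items() for s, e in ivals]
known = sum(overlaps_any(c, dbvar) for c in calls)
print(f"{known}/{len(calls)} calls overlap a dbVar record")
```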

Bioscientist replied:

Thanks. But regarding dbVar: I think comparing our calls against dbVar rests on the hypothesis that human structural variants are mostly common SVs, so that the SVs we identify should already be in the database. What if most of our SVs are rare ones? Has this hypothesis been proved?
