Question: cnvkit example data
4.1 years ago by
United States
laxvid10 wrote:

Where can I get the BAM files used in the CNVkit examples cited in

I'd like to compare another CNV caller's performance against CNVkit. -Lax

cnv bam cnvkit • 2.0k views
ADD COMMENTlink modified 2.3 years ago by Biostar ♦♦ 20 • written 4.1 years ago by laxvid10
4.1 years ago by
Eric T.2.6k
San Francisco, CA
Eric T.2.6k wrote:

The test samples are from Shain et al. 2015, Nature Genetics. Since these sequences are protected patient information, the BAM files were submitted to dbGaP; there is a delay of a few months before they appear online. However, those weren't ideal samples for testing copy number calling anyway -- desmoplastic melanoma genomes are dominated by somatic SNVs, not large-scale copy number alterations.

A better dataset for testing variant callers, both SNV and CNV, has become available recently: "An open access pilot freely sharing cancer genomic data from participants in Texas". I recommend running your benchmarks with these samples instead so that you can freely share your complete analysis. CNVkit has changed significantly since the version I benchmarked in the paper, so in any case you'll need to re-run the latest version of each caller (including CNVkit) to get representative results.
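
As a rough illustration of how two callers' results could be compared once both have been re-run on the same samples, here is a minimal Python sketch that scores concordance by reciprocal overlap. It is not part of CNVkit or of any caller; the tuple layout, the function names and the 50% overlap threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple

# Hypothetical call record: (chromosome, start, end, call), call is "gain" or "loss".
Call = Tuple[str, int, int, str]


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Return the smaller of the two overlap fractions (0.0 if no overlap)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def concordance(calls_a: List[Call], calls_b: List[Call],
                min_overlap: float = 0.5) -> float:
    """Fraction of calls in calls_a matched by a same-direction call in calls_b.

    The 0.5 default is an assumed threshold, not a recommended value.
    """
    if not calls_a:
        return 0.0
    matched = sum(
        1
        for a in calls_a
        if any(a[3] == b[3] and reciprocal_overlap(a, b) >= min_overlap
               for b in calls_b)
    )
    return matched / len(calls_a)
```

Swapping the two arguments gives the fraction of the second caller's calls that the first one recovers, so it is worth reporting both directions.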

ADD COMMENTlink written 4.1 years ago by Eric T.2.6k

Can CNVKit be used for copy number germline mutation detection?

ADD REPLYlink written 4.0 years ago by win840

Yes, but the resolution is fairly coarse, especially on targeted panels, so detection of smaller CNVs (e.g. below 1 Mb) is less accurate. This is less of a concern in cancer cases, where somatic copy number alterations tend to affect entire genes or larger chromosomal regions.
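
To make the size caveat concrete, here is a small Python sketch that flags segments shorter than a cutoff so they can be reviewed more cautiously. It assumes a tab-separated segment table with a header row containing `chromosome`, `start`, `end` and `log2` columns, as in CNVkit's segment output; the function name is hypothetical, and the 1 Mb default is only the example figure mentioned above, not a validated threshold.

```python
import csv
import io
from typing import List, Tuple


def small_segments(cns_text: str,
                   max_size: int = 1_000_000) -> List[Tuple[str, int, int, float]]:
    """Return (chromosome, start, end, log2) for segments shorter than max_size.

    cns_text is the full text of a tab-separated segment table with a header.
    The column names used here are an assumption about the input layout.
    """
    reader = csv.DictReader(io.StringIO(cns_text), delimiter="\t")
    flagged = []
    for row in reader:
        start, end = int(row["start"]), int(row["end"])
        if end - start < max_size:
            flagged.append((row["chromosome"], start, end, float(row["log2"])))
    return flagged
```

For a file on disk, the text can be read with `open(path).read()` and passed in; segments returned by this function are the ones where a confirmatory assay is most worth considering.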

For germline cases, if you only have targeted/exome sequencing data then CNVkit is worth running to get some copy number information, but a clinic with access to the original sample or extracted DNA should consider running another assay (e.g. SNP array, FISH, qPCR) in parallel if possible.

ADD REPLYlink written 4.0 years ago by Eric T.2.6k
3.9 years ago by
San Diego
Akliao0 wrote:

The Texas site is perfect. However, it is WES. Does anyone know of a good Illumina amplicon CNV dataset?

ADD COMMENTlink written 3.9 years ago by Akliao0

It would be best to ask this in a separate question.

ADD REPLYlink written 3.9 years ago by WouterDeCoster44k

I don't know of any that are publicly available, but try SRA or dbGaP. Targeted amplicon sequencing (TAS) seems to be concentrated in smaller clinics, where making the sequencing data widely available (e.g. IRB approval) is not the primary concern; bigger studies that are conducted with this intent are usually WES, WGS or at least hybrid capture with a broader panel. But if you find a good public TAS dataset, please let me know!

ADD REPLYlink written 3.9 years ago by Eric T.2.6k