Is there a tool to detect copy number variants in a single targeted-sequencing sample?
All the tools I know (GATK, ControlFREEC, VarScan) require a mate file.
What have you got? - a single FASTQ file relating to single-end next generation sequencing?
I have a BAM file, from which I obtained two VCF files after calling germline and somatic variants with GATK.
It seems that bcftools cnv could sort me out. I would need to add two things to my VCF: the B-allele frequency (which I can compute from the AD field) and the LRR (I don't know what that is yet).
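For what it's worth, here is a minimal sketch of getting a B-allele frequency out of the AD field. It assumes a biallelic site where AD is "ref depth,alt depth" (as GATK writes it), and the function name `baf_from_sample` is just something made up for illustration, not part of bcftools or GATK.

```python
from typing import Optional


def baf_from_sample(format_field: str, sample_field: str) -> Optional[float]:
    """Return alt / (ref + alt) from the AD entry of one VCF sample column.

    Assumes a biallelic site, i.e. AD = "ref_depth,alt_depth".
    Returns None when AD is absent, missing ('.'), or the total depth is zero.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    ad = dict(zip(keys, values)).get("AD")
    if ad is None:
        return None
    parts = ad.split(",")
    if len(parts) < 2 or "." in parts:
        return None
    ref_depth, alt_depth = int(parts[0]), int(parts[1])
    total = ref_depth + alt_depth
    if total == 0:
        return None
    return alt_depth / total


# Example: FORMAT column "GT:AD:DP", sample column "0/1:12,8:20"
print(baf_from_sample("GT:AD:DP", "0/1:12,8:20"))
```

At multi-allelic sites the AD field has more than two values, so this sketch would only look at the first alternate allele.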
bcftools cnv does not seem to work since we need to have 2 samples to compute LRR (Log R Ratio).
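To make the two-sample requirement concrete: the Log R Ratio is, roughly, the log2 of an observed signal over the signal expected for a normal (two-copy) state. A rough sketch for sequencing depth is below; using read depth in place of array intensity and the name `log_r_ratio` are my own assumptions, not how bcftools cnv computes anything internally.

```python
import math
from typing import Optional


def log_r_ratio(observed_depth: float, expected_depth: float) -> Optional[float]:
    """log2(observed / expected); None if either depth is not positive.

    expected_depth is the depth you would expect for a normal diploid
    region, which is exactly what a single sample cannot provide on its own.
    """
    if observed_depth <= 0 or expected_depth <= 0:
        return None
    return math.log2(observed_depth / expected_depth)


print(log_r_ratio(50, 100))   # half the expected depth
print(log_r_ratio(100, 100))  # depth as expected
```

Without a second sample (or some other baseline) there is nothing sensible to use as `expected_depth`, which is why the tool cannot run here.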
ControlFREEC can calculate copy number from a single sample, but what is your sample? - whole genome sequencing, exome sequencing, or target sequencing?
It is target sequencing.
ControlFREEC can call copy number for this data. Here is a sample config file that you'll need for this type of data:
Thanks Kevin ! The program generates the sample.cnp and GC_profile.cnp files, but then stops with the following error:
Error: zero reads in windows with the GC-content around 0.35 with interval 0.01, will try again with 0.04
Error: zero reads in windows with the GC-content around 0.35 with interval 0.04
Unable to proceed
Try to rerun the program with higher number of reads
I tried to decrease the window size to 500 but I got the same error.
Note, I am only interested in one chromosome (my BAM contains only reads mapped to one chromosome). I supplied a captureRegions path though.
Do you have any idea which parameters I should adjust?
How large is your target region?
It is about 10e6 bp.
You may need to add a new chunk of code to your config file, like this:
Also, with that, set window=0 in the [general] chunk.
Be aware of the other option, readCountThreshold=10, which you may need to adjust. 10 is already low, though.
I also decreased the following parameters and still get the same error:
window=1 #(window=0 does not work)
Alternatively, I need to calculate the approximate copy number (and the exons involved) of a specific gene. Would it be possible to do this "manually" from a BAM or VCF file?
Realistically, I am not confident that it is possible to accurately determine copy number for just one gene - how would you know the level of coverage that is reflective of normal, deleted, or amplified DNA, when read coverage, even in a 'normal' piece of DNA, fluctuates a lot based on GC content, sequence similarity elsewhere in the genome, etc? Even when we have genome-wide data, copy number callers disagree a lot.
Why not just pick a few exons, design some primers, obtain a normal reference hgDNA, and then do qPCR (determining copy number via the delta-delta Ct method)?
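In case it helps, the delta-delta Ct arithmetic can be sketched as follows. This assumes roughly 100% amplification efficiency for both primer pairs (so one cycle is a two-fold change), a reference gene known to be present in two copies, and a normal DNA with two copies of the target; the function name and the Ct values are invented for illustration.

```python
def copy_number_ddct(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_normal: float,
    ct_ref_normal: float,
    normal_copies: float = 2.0,
) -> float:
    """Estimate target copy number in a sample relative to a normal DNA.

    delta Ct       = Ct(target) - Ct(reference gene), per DNA
    delta-delta Ct = delta Ct(sample) - delta Ct(normal)
    copy number    = normal_copies * 2 ** (-delta-delta Ct)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_normal = ct_target_normal - ct_ref_normal
    ddct = delta_sample - delta_normal
    return normal_copies * 2 ** (-ddct)


# Made-up Ct values: the target comes up one cycle later in the sample
# than in the normal DNA, relative to the reference gene.
print(copy_number_ddct(26.0, 24.0, 25.0, 24.0))
```

With real data you would run technical replicates and average the Ct values before plugging them in.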
Good idea. I will give qPCR a try.