Question

CNVkit cnv_ztest with low log2 values

3.8 years ago

merella.stefania ▴ 20

Dear all, I am using CNVkit to analyse exome sequencing data from germline samples in order to find new copy number variants. I followed the recommendations I could find for germline samples; my pipeline steps are:

1) create a reference using all samples

cnvkit.py batch --normal ALL_BAM/*.bam --targets Exome_SureSelect_QXTV7_forCNVkit.bed --fasta hs37d5.fa --access access-5k-mappable.hg19.bed --output-reference reference.cnn

2) batch command using the previously created reference

cnvkit.py batch ALL_BAM/*.bam -r reference.cnn -d results/

3) add ci column

cd results/
for i in *.cnr ; do cnvkit.py segmetrics -s `basename ${i%%.cnr}`.cn{s,r} --ci ; done

4) call command

for i in *segmetrics* ; do cnvkit.py call $i --filter ci -m clonal --center mode -o call_cnvkit/`basename ${i%%segmetrics*}`.call.cns; done

5) cnv_ztest

for i in *.cnr; do cnv_ztest.py $i -t -s call_cnvkit/`basename ${i%%.cnr}..call.cns` -o cnv_ztest/`basename ${i%%.cnr}`.ztest.cnr; done

I am now looking at the cnv_ztest results, but I do not understand the log2 values reported in the output files: they are strongly negative in all samples (roughly between -21 and -8). Is there something I am missing, or is there an error in my pipeline? I also had a look at this thread, but because I am working with germline samples I did not use --drop-low-coverage as suggested there. Do you have any idea what is going wrong? Thanks in advance!

Stefania
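
A quick way to see how widespread the very negative values are is to summarise the log2 column of each file. The following is only a minimal sketch: it assumes a tab-separated table with a header row that contains a column named log2 (as in .cnr files), and both the function name log2_summary and the default cutoff of -5 are hypothetical choices made for illustration.

```shell
# Summarise the log2 column of a tab-separated table with a header row.
# Prints: number of bins, minimum log2, maximum log2, bins below the cutoff.
# The cutoff (default -5) is an arbitrary illustrative threshold.
log2_summary() {
    file=$1
    cutoff=${2:--5}
    awk -F'\t' -v cutoff="$cutoff" '
        NR == 1 {
            # Locate the log2 column by name rather than by position
            for (c = 1; c <= NF; c++) if ($c == "log2") col = c
            if (!col) { print "no log2 column found" > "/dev/stderr"; exit 1 }
            next
        }
        {
            v = $col + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            if (v < cutoff) low++
            n++
        }
        END {
            if (n > 0)
                printf "bins=%d min=%.2f max=%.2f below_cutoff=%d\n", n, min, max, low
        }
    ' "$file"
}
```

It could be run over the outputs with something like: for f in cnv_ztest/*.ztest.cnr; do echo "$f"; log2_summary "$f"; done. Comparing the result with the same summary on the original .cnr files would show whether the low values are already present before the z-test step.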

cnvkit germline exome sequencing cnv_ztest • 1.1k views

updated 3.8 years ago by GenoMax 142k • written 3.8 years ago by merella.stefania ▴ 20

Hi Stefania,

are you sure that the same target enrichment kit was used for both cases and controls?

3.8 years ago by German.M.Demidov ★ 2.9k

Hi German, yes, the target enrichment kit is the same for all samples.

3.8 years ago by merella.stefania ▴ 20

Very negative log2 values indicate homozygous deletions: they arise where coverage is very low. For some reason your tumor samples have no coverage in these regions.
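
To see why such values point to essentially zero coverage, the log2 ratio can be converted back to an approximate copy number. This is a rough sketch that assumes a pure sample and a diploid baseline, so that copy number is about ploidy × 2^log2; the helper name log2_to_cn is made up for this example.

```shell
# Convert a log2 coverage ratio to an approximate absolute copy number.
# Assumes a pure sample; ploidy defaults to 2 (diploid baseline).
log2_to_cn() {
    awk -v l="$1" -v ploidy="${2:-2}" 'BEGIN { printf "%.4f\n", ploidy * 2 ^ l }'
}

log2_to_cn 0     # neutral region: 2 copies
log2_to_cn -1    # single-copy loss: 1 copy
log2_to_cn -8    # far below one copy, i.e. essentially no reads
```

A value of -8 already corresponds to well under 1% of the expected coverage, and -21 is lower still, so values in that range reflect bins with essentially no reads rather than any realistic copy number state.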

Many recurrent homozygous deletions across samples are very unlikely. Some genes, such as CDKN2A, are well known for being homozygously deleted, but even so I would not expect more than 20% of samples to carry such a deletion.

3.8 years ago by German.M.Demidov ★ 2.9k