Question: best way to separate copy number variations from VCF files
8 months ago, soleimani_homa wrote:


I am interested in finding copy number variations in my samples. I have raw VCF files. I have looked at previous questions but have not found a clear answer. Is there a walker in GATK to find CNVs (duplications or deletions) from raw VCF files?

Hope to hear from you soon.

Regards, Homa

Tags: cnvs, snp
8 months ago, WouterDeCoster wrote:

What about using grep? I'd use something like:

cat <(grep '^#' myvariants.vcf) <(grep '<DEL>\|<DUP>' myvariants.vcf) > cnvs.vcf

But I'm not sure what your VCF looks like. The first grep takes the header lines; the second grep searches for variant lines containing either <DEL> or <DUP> (the match is case-sensitive).
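
As a variation on the grep above, the match can be limited to the ALT column, so that <DEL> or <DUP> appearing elsewhere in a record (for example in INFO) does not pull the line in. This is only a sketch: the function name extract_cnvs is made up for illustration, and it assumes a tab-delimited VCF that encodes CNVs as the symbolic alleles <DEL> and <DUP>. If the file contains no such alleles, only the header lines come out.

```shell
# Hypothetical helper (name invented for this example).
# Prints header lines (starting with '#') plus any record whose
# ALT field (column 5) contains the symbolic allele <DEL> or <DUP>.
# Assumes a tab-delimited, uncompressed VCF.
extract_cnvs() {
    awk -F'\t' '/^#/ || $5 ~ /<DEL>|<DUP>/' "$1"
}
```

It would be called the same way as the one-liner, e.g. extract_cnvs myvariants.vcf > cnvs.vcf, with the file names being placeholders.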


Thanks a lot for the reply.

Since my VCF files were produced by GATK, I would prefer to stay with GATK. Do you have any suggestions for separating the CNVs from the VCF file using GATK?

written 8 months ago by soleimani_homa

Please do not make the mistake of overcomplicating things. This is a simple pattern-extraction task. Even if you use a GATK filtering tool (if one exists; I don't know), it will do the exact same thing, just wrapped in a GATK_filter_whatever.jar. The suggested solution is perfectly fine.

modified 8 months ago • written 8 months ago by ATpoint