Question: best way to separate copy number variations from VCF files
4 months ago by
soleimani_homa0 wrote:

Hi

I am interested in finding copy number variations in my samples. I have raw VCF files. I have looked at previous questions but have not found a clear answer. Is there a walker in GATK to find CNVs (duplications or deletions) from raw VCF files?

Hope to hear from you soon.

Regards Homa


SnpSift?

written 4 months ago by cpad011211k
4 months ago by
Belgium
WouterDeCoster37k wrote:

What about using grep? I'd use something like:

cat <(grep '^#' myvariants.vcf) <(grep '<DEL>\|<DUP>' myvariants.vcf) > cnvs.vcf

But I'm not sure what your VCF looks like. The first grep keeps the header lines; the second grep searches for records containing either <DEL> or <DUP>. Note that grep is case-sensitive here, so the pattern must match the capitalization used in your file.
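
A slightly stricter variant of the same idea, as a minimal sketch: match only the ALT column (column 5) with awk, so that a <DEL> or <DUP> string elsewhere in a record cannot trigger a match. The file name example.vcf and its records are made up for illustration. Symbolic alleles such as <DEL> and <DUP> are defined in the VCF specification, but whether your caller actually writes them is an assumption you should check against your own file.

```shell
# Build a tiny, made-up VCF for illustration (columns are tab-separated).
printf '##fileformat=VCFv4.2\n' > example.vcf
printf '##ALT=<ID=DEL,Description="Deletion">\n' >> example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> example.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n' >> example.vcf
printf 'chr1\t200\t.\tT\t<DEL>\t60\tPASS\tSVTYPE=DEL;END=900\n' >> example.vcf
printf 'chr2\t300\t.\tC\t<DUP>\t70\tPASS\tSVTYPE=DUP;END=1500\n' >> example.vcf

# Keep every header line, plus records whose ALT field (column 5)
# contains <DEL> or <DUP>.
awk -F'\t' '/^#/ || $5 ~ /<DEL>|<DUP>/' example.vcf > cnvs.vcf

cat cnvs.vcf
```

On this toy input the SNV at chr1:100 is dropped, and the header plus the two symbolic-allele records are kept. Unlike the process-substitution form, this runs in any POSIX shell.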


Thanks a lot for the reply.

Since my VCF files were produced by GATK, I would prefer to stay within GATK. Do you have any suggestions for separating the CNVs from the VCF file using GATK?

written 4 months ago by soleimani_homa0

Please do not make the mistake of overcomplicating things. This is a simple pattern-extraction task. Even if you use a GATK filtering tool (if that exists, I don't know) it will do the exact same thing, just wrapped in a GATK_filter_whatever.jar. The suggested solution is perfectly fine.
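
Because this is plain pattern extraction, the result depends entirely on whether the pattern occurs in the file. One quick check, sketched here on a made-up file (sample.vcf and its records are hypothetical), is to tally the distinct ALT values before filtering:

```shell
# Build a tiny, made-up VCF for illustration (columns are tab-separated).
printf '##fileformat=VCFv4.2\n' > sample.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> sample.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n' >> sample.vcf
printf 'chr1\t200\t.\tT\t<DEL>\t60\tPASS\tSVTYPE=DEL\n' >> sample.vcf
printf 'chr2\t300\t.\tC\t<DUP>\t70\tPASS\tSVTYPE=DUP\n' >> sample.vcf
printf 'chr3\t400\t.\tG\t<DEL>\t65\tPASS\tSVTYPE=DEL\n' >> sample.vcf

# Skip header lines, take the ALT column (column 5),
# and count how often each distinct value occurs.
grep -v '^#' sample.vcf | cut -f5 | sort | uniq -c | sort -rn > alt_counts.txt

cat alt_counts.txt
```

If no symbolic alleles such as <DEL> or <DUP> show up in the tally for your own file, the grep suggested above will return only the header lines.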

modified 4 months ago • written 4 months ago by ATpoint14k