Question: best way to separate copy number variations from VCF files
23 months ago by
soleimani_homa0 wrote:

Hi

I am interested in finding copy number variations in my samples. I have raw VCF files. I have looked at the previous questions but have not found a clear answer. Is there a walker in GATK to find CNVs (duplications or deletions) from raw VCF files?

Hope to hear from you soon.

Regards Homa

cnvs snp • 872 views
modified 23 months ago by WouterDeCoster44k • written 23 months ago by soleimani_homa0

snpsift?

written 23 months ago by cpad011214k
23 months ago by
Belgium
WouterDeCoster44k wrote:

What about using grep? I'd use something like:

cat <(grep '^#' myvariants.vcf) <(grep '<DEL>\|<DUP>' myvariants.vcf) > cnvs.vcf

But I'm not sure what your VCF looks like. The first grep takes the header lines; the second grep searches for variant lines containing either <DEL> or <DUP> (the match is case-sensitive).
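If the plain grep turns out to be too permissive (it matches <DEL> or <DUP> anywhere on a line, including INFO fields or header descriptions), a slightly stricter variant could look like the sketch below. It assumes an uncompressed, tab-delimited VCF in which CNVs are encoded as symbolic alleles in the ALT column (field 5); the function name extract_cnvs is made up for illustration.

```shell
# Sketch only: keep every header line, plus records whose ALT column
# (tab-separated field 5) contains a symbolic <DEL> or <DUP> allele.
# Assumes an uncompressed VCF; extract_cnvs is a hypothetical helper name.
extract_cnvs() {
    awk -F'\t' '
        /^#/                { print; next }   # header lines pass through once
        $5 ~ /<(DEL|DUP)>/  { print }         # match in the ALT column only
    ' "$1"
}

# Usage (file names are placeholders):
#   extract_cnvs myvariants.vcf > cnvs.vcf
```

Because headers and records are handled in a single pass, the original record order is preserved and no header line is written twice. Subtyped alleles such as <DUP:TANDEM> would not match this pattern and would need the expression to be widened.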


Thanks a lot for the reply.

Since my VCF files were produced by GATK, I would prefer to stay within GATK for the next steps. Do you have any suggestions for separating the CNVs from the VCF file using GATK?

written 23 months ago by soleimani_homa0

Please do not make the mistake of overcomplicating things. This is a simple pattern-extraction task. Even if a GATK filtering tool exists for this (I don't know whether one does), it would do exactly the same thing, just wrapped in a GATK_filter_whatever.jar. The suggested solution is perfectly fine.
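To check what such a pattern extraction would actually pick up, one could first tally the symbolic alleles present in the file. The snippet below is a rough sketch under the assumption that the VCF is uncompressed and tab-delimited; count_symbolic_alts is a hypothetical helper name, and records whose ALT field lists several alleles are only counted when the field starts with an angle bracket.

```shell
# Sketch only: count records per symbolic ALT allele (e.g. <DEL>, <DUP>).
# Prints "count<TAB>allele", sorted by allele name.
# count_symbolic_alts is a hypothetical helper name.
count_symbolic_alts() {
    awk -F'\t' '
        !/^#/ && $5 ~ /^</ { n[$5]++ }                    # ALT starts with "<"
        END { for (a in n) print n[a] "\t" a }
    ' "$1" | sort -k2
}

# Usage (file name is a placeholder):
#   count_symbolic_alts myvariants.vcf
```

An empty result would suggest that the file contains no symbolic CNV records for the pattern to find.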

modified 23 months ago • written 23 months ago by ATpoint38k