Remove variants in VCF with the same call in all but the reference
16 months ago

I have a VCF file with individuals mapped to a reference. In many cases, all of the individuals have the same allele call, although it differs from the reference. I want to filter this VCF so that it keeps only sites that are variable among my samples. Is there a straightforward way to do this? In the example below, I want to keep the site at position 3416 and discard all the others:

Chromosome  72  .   T   G   .   PASS    .   GT  1   1   1   1   1
Chromosome  1993    .   T   C   .   PASS    .   GT  1   1   1   1   1
Chromosome  3416    .   C   T   .   PASS    .   GT  0   0   1   1   0
Chromosome  4190    .   G   T   .   PASS    .   GT  1   1   1   1   1
16 months ago
guillaume.rbt ▴ 830

You can achieve that with SnpSift filter: http://snpeff.sourceforge.net/SnpSift.html

For example, to filter out all positions where all 5 of your samples are variant:

cat variants.vcf | java -jar SnpSift.jar filter "!(isVariant( GEN[0] ) & isVariant( GEN[1] ) & isVariant( GEN[2] ) & isVariant( GEN[3] ) & isVariant( GEN[4] ) )" > filtered.vcf
16 months ago
bcftools view -i 'AC!=5' input.vcf
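
Both commands above hard-code the number of samples (5). As a rough sketch that does not depend on the sample count, a plain awk filter can compare every sample column with the first one and keep a record only when at least one differs. The function name `filter_variable_sites` is hypothetical. The sketch assumes a tab-delimited VCF whose FORMAT column is just GT, as in the example in the question. Note that it treats a missing genotype (`.`) as a distinct call, and that it also drops sites where every sample matches the reference.

```shell
# Hypothetical helper (not part of bcftools or SnpSift).
# Assumption: tab-delimited VCF, FORMAT is GT only, samples start at column 10.
filter_variable_sites() {
  awk -F'\t' '
    /^#/ { print; next }              # pass header lines through unchanged
    {
      variable = 0
      for (i = 11; i <= NF; i++) {    # compare each sample with the first sample
        if ($i != $10) { variable = 1; break }
      }
      if (variable) print             # keep only sites where samples disagree
    }
  ' "$1"
}
```

It could then be used as `filter_variable_sites variants.vcf > filtered.vcf`. If the sample fields carry more than the genotype (for example GT:DP), the comparison would need to look at the first colon-separated subfield only.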