Using bcftools isec to filter for unique variants
2.0 years ago
ThePresident ▴ 180

I have a set of 11 vcf files derived from variant calling through snippy.

ref.vcf contains variants that I consider as background, i.e. they should all appear in the remaining vcf files.

I am interested in the unique variants in sample1.vcf through sample10.vcf; in other words, I want to report all variants that are not found in ref.vcf.

It appears that the bcftools isec tool should do that, but I am not sure how to set it up, since I'm interested in unique variants rather than common ones.

Would this: bcftools isec -n =11 ref.vcf.gz sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz etc. produce a file with the common variants as well as the unique variants from each of the sample VCF files?

Hope this is clear - thanks in advance.

Note: I have already parsed through the posts here and did not find a suitable answer for filtering unique (as opposed to common) variants.

vcf variants bcftools vcftools • 1.0k views

OK - I think I might have found a reasonable solution. I can run: bcftools isec -n -11 ref.vcf.gz sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz etc. and take a look at the sites.txt file. I am interested in all variants that have the following signature:

CP009851    64300   A   G   01111111111
CP009851    79654   C   G   01111111111
CP009851    80035   C   T   01111111111
CP009851    80043   C   G   01111111111
CP009851    146418  C   G   01111111111

The "01111111111" indicates that a variant is not found in the first file (ref.vcf.gz) but is found in all the others. Not the prettiest way, but it gets the job done.
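For anyone who wants to pull those rows out automatically rather than by eye, here is a rough sketch using awk on sites.txt. It assumes the five-column layout shown above (CHROM, POS, REF, ALT, presence mask) and that ref.vcf.gz was the first file on the command line. The toy rows and the output file names are made up for illustration only.

```shell
# Toy sites.txt in the layout shown above; the rows are invented for the demo.
# The mask has one character per input file, with ref.vcf.gz first.
printf 'CP009851\t64300\tA\tG\t01111111111\n'  > sites.txt
printf 'CP009851\t70000\tT\tC\t11111111111\n' >> sites.txt
printf 'CP009851\t79654\tC\tG\t00100000000\n' >> sites.txt

# Variants absent from ref.vcf.gz (mask starts with 0), in any sample(s).
awk '$5 ~ /^0/' sites.txt > not_in_ref.txt

# Variants absent from ref.vcf.gz and present in all ten samples.
awk '$5 == "01111111111"' sites.txt > in_all_samples_only.txt
```

The first filter is the looser one: it keeps anything missing from the background file, which is closer to "all variants not found in ref.vcf" than the exact-signature match.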


