Using bcftools isec to filter for unique variants
21 months ago
ThePresident ▴ 180

I have a set of 11 vcf files derived from variant calling through snippy.

ref.vcf contains variants that I consider as background, i.e. they should all appear in the remaining vcf files.

I am interested in the variants unique to sample1.vcf through sample10.vcf; in other words, I want to report all variants not found in ref.vcf.

It appears that the bcftools isec tool should do that, but I am not sure how to set it up, since I'm interested in unique variants rather than common ones.

Would this: bcftools isec -n =11 ref.vcf.gz sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz etc produce a file with the common variants as well as the unique variants from each of the sample.vcf files?

Hope this is clear - thanks in advance.

Note: I have already parsed through the posts here and did not find a suitable answer for filtering unique (as opposed to common) variants.

vcf variants bcftools vcftools • 901 views

OK - I think I might have found a reasonable solution. I can run: bcftools isec -n -11 ref.vcf.gz sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz etc and take a look at the sites.txt file. I am interested in all variants that have the following signature:

CP009851    64300   A   G   01111111111
CP009851    79654   C   G   01111111111
CP009851    80035   C   T   01111111111
CP009851    80043   C   G   01111111111
CP009851    146418  C   G   01111111111

The "01111111111" bitmap has one digit per input file, in command-line order. It indicates that a variant is absent from the first file (ref.vcf.gz) but present in all ten of the others. Not the prettiest way, but it gets the job done.
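For anyone who would rather not pick these rows out by eye, here is a minimal sketch of filtering on the bitmap column with awk. It assumes sites.txt is tab-delimited with the bitmap in column 5, as in the rows above. The file names sites_demo.txt, not_in_ref.txt and in_all_samples_only.txt are just placeholders, and the last two demo rows are made up for illustration (they are not from my data).

```shell
# Build a tiny stand-in for sites.txt so the filter can be tried without bcftools.
# The first three rows are copied from the output above; the last two are
# hypothetical: one background variant (in all 11 files) and one private to sample2.
printf '%s\t%s\t%s\t%s\t%s\n' \
  CP009851 64300 A G 01111111111 \
  CP009851 79654 C G 01111111111 \
  CP009851 80035 C T 01111111111 \
  CP009851 100   A T 11111111111 \
  CP009851 200   G C 00100000000 \
  > sites_demo.txt

# Any site absent from the first file (ref): bitmap starts with 0.
awk -F'\t' '$5 ~ /^0/' sites_demo.txt > not_in_ref.txt

# Stricter: absent from ref but present in all 10 samples.
awk -F'\t' '$5 == "01111111111"' sites_demo.txt > in_all_samples_only.txt
```

On a real run, sites_demo.txt would be replaced with the sites.txt that isec writes. As far as I can tell from the bcftools manual, isec also accepts an exact bitmap via -n~ (e.g. -n~01111111111) and has a -C/--complement option for sites present in the first file but missing from the others, so it may be possible to skip the awk step entirely; I have not tested either on this data set.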

TP
