bcftools filter: apply to sub-regions
5 days ago
qwzhang0601 ▴ 80

Hello:

I am trying to filter a single-sample VCF file to reduce false-positive (FP) variants. The panel contains some known challenging genes, so I want to filter less strictly in the challenging regions and apply more filtering parameters everywhere else. My current script takes three steps to produce the final VCF. Is there a way to add a "regions" constraint to the filtering expression itself, so that I can get the final VCF in a single step? Something like this:

$ bcftools filter -e '(FS>50 | FMT/AF[0:0] < 0.15) & TYPE="snp" & regions NOT_IN ${file_challenge_regions}'

Below is my current script.

#BED file of regions that are challenging for variant calling, where less filtering is applied
file_challenge_regions=challenge_regions.bed

#step 1 (non-challenge regions): first remove variants with DP < 15, then, for SNPs outside "challenge_regions.bed", remove those with FS > 50 or AF < 0.15

$ bcftools filter -e 'FMT/DP[0] < 15' normalized.vcf.gz | bcftools filter -e '(FS>50 | FMT/AF[0:0] < 0.15) & TYPE="snp"' -T ^${file_challenge_regions} -Ov -o nonchallenge.final.vcf

#step 2 (challenge regions): only remove variants with DP < 15
$ bcftools filter -e 'FMT/DP[0] < 15' -T ${file_challenge_regions} normalized.vcf.gz -Ov -o challenge.final.vcf

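#step 3: combine variants from challenge and non-challenge regions (concat -a needs bgzip-compressed, indexed inputs)
$ bgzip -f nonchallenge.final.vcf && bcftools index -t nonchallenge.final.vcf.gz
$ bgzip -f challenge.final.vcf && bcftools index -t challenge.final.vcf.gz
$ time bcftools concat -a --rm-dups=none -Ov -o final.vcf nonchallenge.final.vcf.gz challenge.final.vcf.gz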
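
One possible way to collapse the three steps above into a single pass, offered as an untested sketch rather than a confirmed answer: tag every record that falls inside the challenging regions with an INFO flag using `bcftools annotate --mark-sites`, then refer to that flag inside the filter expression. The flag name `CHALLENGE` and the function name `one_pass_filter` are hypothetical. The sketch assumes the BED file is coordinate-sorted, that `bgzip` and `tabix` are installed, and that the installed bcftools version accepts a BED file for `-a` together with `-c CHROM,FROM,TO`.

```shell
#!/bin/sh
# Sketch only: CHALLENGE is a made-up INFO flag name, not a bcftools built-in.

# Exclude low-depth records everywhere; exclude SNPs failing FS/AF only
# where the CHALLENGE flag is absent (i.e. outside the challenging regions).
filter_expr='FMT/DP[0] < 15 | ((FS>50 | FMT/AF[0:0] < 0.15) & TYPE="snp" & CHALLENGE=0)'

one_pass_filter() {
    in_vcf=$1
    regions_bed=$2
    out_vcf=$3

    # bcftools annotate expects a bgzip-compressed, tabix-indexed annotation file.
    bgzip -c "$regions_bed" > "$regions_bed.gz" &&
    tabix -f -p bed "$regions_bed.gz" &&
    # Mark sites overlapping the BED intervals with the CHALLENGE flag,
    # then apply the combined expression in one filter call.
    bcftools annotate -a "$regions_bed.gz" -c CHROM,FROM,TO -m +CHALLENGE "$in_vcf" |
        bcftools filter -e "$filter_expr" -Ov -o "$out_vcf"
}

# Usage, with the file names from the script above:
# one_pass_filter normalized.vcf.gz challenge_regions.bed final.vcf
```

Because the flag is set by interval overlap while `-T` matches on position, indels at region boundaries may be handled slightly differently from the three-step version, so comparing the two outputs on a test sample would be a sensible check.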

Thanks
