vcftools: SNP filtering
6.9 years ago

I'm trying to filter my SNP vcf file. I'm looking for various filters that can be applied.

I've tried the command below to remove SNPs with a genotype quality (GQ) below 50, but the number of SNPs remains the same.

vcftools --vcf input.vcf --recode --recode-INFO-all --out output.vcf --minGQ 50.00

How can I fix this? Also, I would like to know what other filters I can use.

Thank you

vcftools SNP filter Tool • 7.6k views
6.9 years ago
Dan D 7.2k

To list here the possible filters you can pass into VCFTools would be an exercise in redundancy. The documentation does a good job of laying them all out.

You're specifying the genotype quality filter correctly--are you saying that you're getting the same number of lines in your output file? If you're not sure, here's a quick way to check the total number of lines in the input and output files:

wc -l input.vcf

wc -l output.vcf
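
Keep in mind that `wc -l` also counts the `##` meta-information lines and the `#CHROM` header line, so it can be clearer to count only the variant records. Here is a minimal sketch of that. The helper name `count_records` and the toy VCF are made up for illustration, not taken from your data. Also, as far as I know, vcftools treats `--out` as a prefix and appends `.recode.vcf` when `--recode` is used, so the file to count is probably `output.vcf.recode.vcf` rather than `output.vcf`.

```shell
# count_records: hypothetical helper that counts the non-header lines of a VCF
count_records() {
    # -v inverts the match and -c prints the count; VCF header lines start with '#'
    grep -vc '^#' "$1"
}

# Toy VCF with two header lines and two variant records (illustration only)
toy=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$toy"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n' >> "$toy"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\tGT:GQ\t0/1:99\n' >> "$toy"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\tGT:GQ\t0/1:10\n' >> "$toy"

count_records "$toy"   # counts the two records, not the four lines wc -l would report
```

Running the same function on the input and on the recoded output gives a like-for-like comparison of the number of sites.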

You should also check whether the "GQ" tag is actually in your VCF file. Depending on what software was used to generate the VCF, there might not be Genotype Quality data for all sites, which is what VCFTools requires.
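
One way to check this is to tabulate the FORMAT column (the ninth column of each record) and see whether every distinct FORMAT string contains GQ. This is only a sketch: the helper name `format_tags` and the toy file are hypothetical.

```shell
# format_tags: hypothetical helper that lists each distinct FORMAT string with its count
format_tags() {
    # drop header lines, keep column 9 (FORMAT), then count identical strings
    grep -v '^#' "$1" | cut -f9 | sort | uniq -c
}

# Toy VCF in which one site carries GQ and the other does not (illustration only)
toy=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$toy"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n' >> "$toy"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\tGT:GQ\t0/1:99\n' >> "$toy"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\tGT:DP\t0/1:12\n' >> "$toy"

format_tags "$toy"
```

If any line of the output lacks GQ, those sites have no genotype quality for the filter to act on.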

EDIT: try changing --minGQ to --minQ, which filters on the site-level QUAL column rather than on the per-genotype GQ value.
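
Some background on why the line count may not change: the vcftools manual lists --minGQ and --minDP as genotype filters, so failing genotypes are marked as missing and the site itself stays in the file. To see whether the filter did anything, you can count missing genotype calls before and after. A rough sketch, assuming GT is the first FORMAT subfield (as the VCF specification requires when GT is present). The helper name `count_missing_gt` and the toy file are hypothetical.

```shell
# count_missing_gt: hypothetical helper that counts sample fields whose GT is missing
count_missing_gt() {
    awk -F'\t' '
        !/^#/ {
            for (i = 10; i <= NF; i++) {      # sample columns start at column 10
                split($i, f, ":")             # GT is assumed to be the first subfield
                # matches ".", "./." and ".|." (and longer all-missing genotypes)
                if (f[1] ~ "^[.]([/|][.])*$") n++
            }
        }
        END { print n + 0 }                   # n + 0 prints 0 when nothing matched
    ' "$1"
}

# Toy VCF with two samples; one genotype out of four is missing (illustration only)
toy=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$toy"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> "$toy"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\tGT:GQ\t0/1:99\t1/1:80\n' >> "$toy"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\tGT:GQ\t./.:10\t0/1:75\n' >> "$toy"

count_missing_gt "$toy"
```

If the count goes up after filtering while the number of sites stays the same, the filter is working at the genotype level. Adding --max-missing to the vcftools command is then one option for dropping sites with too many missing genotypes.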


I did a line count before and after, and it gives me the same number of SNPs.
I tried the same with --minDP (depth), and the number of SNPs remains constant there as well.
I used bcftools to generate the VCF. I also checked for the GQ tag, and it exists.


Would you be willing to post a dropbox (or google drive, or something similar) link to your VCF file?
