vcftools: SNP filtering
9.9 years ago
Parimala Devi ▴ 100

I'm trying to filter my SNP vcf file. I'm looking for various filters that can be applied.

I've tried the command below to remove SNPs with a genotype quality (GQ) < 50, but the number of SNPs remains the same.

vcftools --vcf input.vcf --recode --recode-INFO-all --out output.vcf --minGQ 50.00

How can I fix this? Also, I would like to know what other filters I can use.

Thank you

SNP vcftools • 9.1k views
9.9 years ago
Dan D 7.4k

To list here the possible filters you can pass into VCFTools would be an exercise in redundancy. The documentation does a good job of laying them all out.

You're specifying the genotype quality filter correctly--are you saying that you're getting the same number of lines in your output file? If you're not sure, here's a quick way to check the total number of lines in the input and output files:

wc -l input.vcf

wc -l output.vcf
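Note that wc -l also counts the header lines (those starting with #), which are the same in both files. A minimal sketch that counts only the variant records is below; the count_variants helper and the tiny sample VCF are made up for illustration so the snippet runs on its own.

```shell
# Hypothetical helper: count data records, i.e. lines that do not start with '#'.
count_variants() {
    grep -vc '^#' "$1"
}

# Tiny made-up VCF so the sketch is self-contained: 2 header lines, 2 records.
sample_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$sample_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$sample_vcf"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\n' >> "$sample_vcf"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\n' >> "$sample_vcf"

# Prints the number of variant records in the sample file.
count_variants "$sample_vcf"
```

On real data you would call count_variants on the input and on the recoded output. Keep in mind that --out takes a prefix, so with --out output.vcf the recoded file should be named output.vcf.recode.vcf.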

You should also check whether the "GQ" tag is actually in your VCF file. Depending on what software was used to generate the VCF, there might not be Genotype Quality data for all sites, which is what VCFTools requires.
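One way to check this is to look at the FORMAT column (column 9) of each record. The sketch below is only an illustration: count_gq_records is a hypothetical helper and the sample records are made up.

```shell
# Hypothetical helper: report how many records list GQ in the FORMAT column (field 9).
count_gq_records() {
    awk -F'\t' '
        /^#/ { next }                     # skip header lines
        {
            total++
            n = split($9, keys, ":")      # FORMAT keys are colon-separated
            for (i = 1; i <= n; i++) {
                if (keys[i] == "GQ") { with_gq++; break }
            }
        }
        END { printf "%d of %d records have GQ\n", with_gq, total }
    ' "$1"
}

# Made-up VCF: the first record carries GQ, the second does not.
gq_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$gq_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n' >> "$gq_vcf"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\tGT:GQ\t0/1:55\n' >> "$gq_vcf"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\tGT:PL\t0/1:20,0,30\n' >> "$gq_vcf"

count_gq_records "$gq_vcf"
```

If the two numbers differ on your own file, some sites lack GQ and a GQ-based filter has nothing to act on there.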

EDIT: try changing --minGQ to --minQ
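The reasoning behind the change: --minGQ acts on the per-sample GQ values in the genotype columns, and as far as I understand VCFtools handles failing genotypes by marking them as missing rather than dropping the site, so the record count would not change. --minQ instead filters whole sites on the QUAL column (column 6). A rough awk sketch of what site-level filtering does is below; it is an illustration, not a replacement for VCFtools, and filter_min_qual and the sample data are made up.

```shell
# Hypothetical helper: keep header lines plus records whose QUAL (field 6)
# is numeric and at least the given minimum.
filter_min_qual() {
    # $1 = VCF path, $2 = minimum QUAL
    awk -F'\t' -v min_q="$2" '/^#/ || ($6 != "." && $6 + 0 >= min_q + 0)' "$1"
}

# Made-up VCF with one high-QUAL and one low-QUAL site.
qual_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$qual_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$qual_vcf"
printf 'chr1\t100\t.\tA\tG\t60\t.\t.\n' >> "$qual_vcf"
printf 'chr1\t200\t.\tC\tT\t20\t.\t.\n' >> "$qual_vcf"

# Only the site with QUAL 60 survives a threshold of 50.
filter_min_qual "$qual_vcf" 50
```

With VCFtools itself the equivalent call would look like: vcftools --vcf input.vcf --minQ 50 --recode --recode-INFO-all --out output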


I did a line count before and after, and it gives me the same number of SNPs.
I tried the same with --minDP (depth), and the number of SNPs remains constant there as well.
I used bcftools to generate the VCF. I also checked for the GQ tag, and it exists.


Would you be willing to post a dropbox (or google drive, or something similar) link to your VCF file?


Thanks! Answer updated.
