Filtering on the minor allele in VCFtools
8.4 years ago
outlier95 ▴ 30

Wondering how I can get the number of informative sites in a .vcf file using VCFtools. By informative I mean at least two samples share a variant. Any suggestions? Thanks.

vcftools snps • 4.9k views

Not sure about VCFtools, but if you are up for trying something new, the Variant Effect Predictor (VEP) gives you MAF information for data in .vcf files. Look at the filtering options available for VEP, which include frequency (i.e. MAF).


GATK indicates they do this for their best practices, and then makes the reader scavenge around the internet to find out how to do it instead of pointing to a resource. Shame.

8.4 years ago

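Using vcffilterjs: https://github.com/lindenb/jvarkit/wiki/VCFFilterJS. The expression below counts, for each variant, the samples whose genotype is het or hom-var, and accepts the variant when fewer than two samples carry it. With -F INFORMATIVE, variants that are not accepted (that is, those carried by at least two samples) are kept and get INFORMATIVE in the FILTER column. Then extract the FILTER column and count the lines containing INFORMATIVE.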

cat input.vcf |\
java -jar dist/vcffilterjs.jar -F INFORMATIVE -e 'function accept(v) { var f=0,i;for(i=0;i<v.getNSamples();++i) {var g=v.getGenotype(i); f+=(g.isHomVar() || g.isHet()?1:0);} return f<2;}accept(variant);' |\
grep -v "^#" | cut -f 7 | grep -c INFORMATIVE

8.4 years ago
Adam ★ 1.0k
vcftools --gzvcf vcf_file --mac 2 --stdout --recode | fgrep -v '#' | wc -l
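
Note that --mac counts alleles rather than samples, so a site where a single sample is homozygous for the alternate allele can also pass --mac 2. If "informative" strictly means that at least two samples carry the variant, a per-sample count may be closer to that definition. A minimal awk sketch, assuming an uncompressed VCF with GT as the first FORMAT field; the function name count_shared_sites is made up for illustration, and carriers of any non-reference allele are pooled at multiallelic sites:

```shell
# Hypothetical helper (not part of VCFtools): count sites where at least
# two samples carry a non-reference allele.
# Assumes an uncompressed VCF with GT as the first FORMAT subfield.
count_shared_sites() {
  awk -F'\t' '
    /^#/ { next }                      # skip meta and header lines
    {
      carriers = 0
      for (i = 10; i <= NF; i++) {     # sample columns start at column 10
        split($i, fld, ":")            # GT is the first subfield
        # a carrier has at least one called allele other than 0
        if (fld[1] ~ /[1-9]/) carriers++
      }
      if (carriers >= 2) n++
    }
    END { print n + 0 }                # print 0 when no site qualifies
  ' "$1"
}
```

Usage would be something like `count_shared_sites input.vcf`; for a gzipped file, decompress first (for example `gzip -dc input.vcf.gz > input.vcf`).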

MOD-EDIT: OP has opened a new question for this here: Identifying private and shared SNPs using VCFtools


If I remove singletons and private doubletons via --singletons and --positions using VCFtools, then take the difference in the number of SNPs before and after filtering, should that amount to the number of informative SNPs (per my definition)? Many thanks.
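
One way to sanity-check that arithmetic: if every variant site is either a singleton, a private doubleton, or shared by at least two samples, then the sites left after removing the first two classes are the shared (informative) ones, and the before/after difference is the number of uninformative sites. A rough awk sketch of that tally, assuming biallelic sites in an uncompressed VCF with GT as the first FORMAT field; classify_sites is a hypothetical name, and the labels only loosely mirror what VCFtools reports:

```shell
# Hypothetical helper: tally variant sites by how many samples carry the
# alternate allele. Assumes biallelic sites and GT as the first FORMAT subfield.
classify_sites() {
  awk -F'\t' '
    /^#/ { next }                              # skip meta and header lines
    {
      carriers = 0; alt = 0
      for (i = 10; i <= NF; i++) {             # sample columns start at 10
        split($i, fld, ":")                    # GT is the first subfield
        m = split(fld[1], al, "[/|]")          # split phased or unphased GT
        c = 0
        for (j = 1; j <= m; j++)
          if (al[j] ~ /^[1-9][0-9]*$/) c++     # count non-reference alleles
        if (c > 0) carriers++
        alt += c
      }
      if (carriers == 0) next                  # no sample carries the variant
      if (carriers >= 2)  shared++             # informative per the definition above
      else if (alt == 1)  singleton++          # one het carrier
      else                private_doubleton++  # one carrier, two alt copies
    }
    END {
      printf "singleton=%d private_doubleton=%d shared=%d\n",
             singleton, private_doubleton, shared
    }
  ' "$1"
}
```

In this tally the informative count is the `shared` value, which equals the number of variant sites minus the other two classes.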
