vcftools Tajima's D calculation
23 months ago
biogirl ▴ 200

Hi all,

I'm running vcftools on a vcf (obv, ha) to calculate Tajima's D in windows of 10,000 bp:

vcftools --vcf in.vcf --out tajimasd --TajimaD 10000

The log file states that, after filtering, it kept 342087 out of a possible 342087 sites (so all of them). Yet the output file has a lot of 'nan' values (meaning no SNPs in that bin), and the bins that do have Tajima's D calculated contain only a handful of SNPs (max 8, certainly not adding up to >300k SNPs).
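To put a number on the mismatch, the per-window SNP counts in the output can be summed and compared with the "kept" count from the log. This is a minimal sketch, assuming the output is a tab-separated table with a header containing `N_SNPS` and `TajimaD` columns; the function name `summarize_tajima` and the tiny example table are made up for illustration.

```python
import csv
import io
import math


def summarize_tajima(handle):
    """Count windows, NaN windows and total SNPs in a Tajima's D table.

    Assumes a tab-separated file with a header row that includes
    the columns N_SNPS and TajimaD (an assumption about the output layout).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    windows = 0
    nan_windows = 0
    total_snps = 0
    for row in reader:
        windows += 1
        total_snps += int(float(row["N_SNPS"]))
        # float() accepts "nan" and "-nan", so both are counted here
        if math.isnan(float(row["TajimaD"])):
            nan_windows += 1
    return {"windows": windows, "nan_windows": nan_windows, "total_snps": total_snps}


# Made-up three-window table, for illustration only
example = (
    "CHROM\tBIN_START\tN_SNPS\tTajimaD\n"
    "1\t0\t0\tnan\n"
    "1\t10000\t8\t-0.52\n"
    "1\t20000\t3\t1.10\n"
)
summary = summarize_tajima(io.StringIO(example))
print(summary)
```

On a real run, the `io.StringIO(example)` handle would be replaced by `open(...)` on the actual output file. If `total_snps` comes out far below the number of sites the log says were kept, sites are being dropped somewhere between filtering and the window calculation.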

Any ideas as to why I'm getting no SNPs in the majority of my bins?

Thanks

vcftools tajimasd selection • 2.9k views
5 months ago
grey ▴ 20

I believe this is because vcftools --TajimaD doesn't like missing genotype calls (./.) and will not calculate Tajima's D for sites that have them.

See: vcftools --TajimaD does not start at 0 and misses variants
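One way to check whether missing calls could explain the gap is to count how many sites in the VCF have at least one missing genotype. This is a rough sketch, assuming an uncompressed, tab-separated VCF with GT as the first FORMAT subfield; the function name `count_sites_with_missing` and the example records are made up for illustration, and it says nothing about what vcftools does internally.

```python
import io


def count_sites_with_missing(handle):
    """Return (total_sites, sites_with_any_missing_genotype) for a VCF.

    Assumes plain-text, tab-separated VCF lines where GT is the first
    subfield of each sample column.
    """
    total = 0
    with_missing = 0
    for line in handle:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample columns on this line
        total += 1
        for sample in fields[9:]:
            gt = sample.split(":", 1)[0]
            alleles = gt.replace("/", "|").split("|")
            if "." in alleles:  # catches ./., .|., 0/. and a bare .
                with_missing += 1
                break
    return total, with_missing


# Made-up records, for illustration only
example_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n"
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t./.\n"
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t1/1\n"
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT:DP\t.:0\t0|1\n"
)
total_sites, missing_sites = count_sites_with_missing(io.StringIO(example_vcf))
print(total_sites, missing_sites)
```

If most sites turn out to carry a missing call, that would be consistent with the explanation above. In that case, restricting the input to fully genotyped sites first (vcftools has a `--max-missing` filter, where 1 means no missing data allowed) and re-running would show whether the per-window SNP counts change.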
