Hi, I'm trying to calculate LD for a population. According to this post: http://sourceforge.net/p/vcftools/mailman/message/33219601/
the author says that it should be fine for haploid individuals if their genotype is 1 or "1|.":
Apologies. The announcement was slightly misinformative. The change allowed one to calculate r^2 with genotypes of the form '1|.', but not actually for haploids like '1'. I've just committed a change to allow the second case, so please give it a go and drop me a line if there are any issues.
So I tried calculating LD on my vcf file like so:
vcftools --vcf HA.diploid.vcf --hap-r2 --out test
But I got only nan in my output for the value of r^2. I then converted my VCF to have 1|. for the genotype field instead of just 1 (looks something like this):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT M-Ait14.22010_CAAGACTA_C5E9DANXX_6_20140930B_20140930.bam.mpileup.all.snp.vcfsamp.vcf M-Ait9.22003_GTCTGTCA_C5E9DANXX_6_20140930B_20140930.bam.mpileup.all.snp.vcfsamp.vcf M-ELH20.18511_ATCCTGTA_C41F2ACXX_6_20140423B_20140423.bam.mpileup.all.snp.vcfsamp.vcf M-ELH27.18516_CCTCTATC_C41F2ACXX_6_20140423B_20140423.bam.mpileup.all.snp.vcfsamp.vcf M-ARB0.18513_GCTAACGA_C41F2ACXX_6_20140423B_20140423.bam.mpileup.all.snp.vcfsamp.vcf M-Set0.22009_CATACCAA_C5E9DANXX_6_20140930B_20140930.bam.mpileup.all.snp.vcfsamp.vcf 1 306 . G T . PASS AC=1;AN=1;SF=9 GT .|. .|. .|. 1|. .|. .|. 1 375 . A G . PASS AC=5;AN=5;SF=1,2,3,9,15 GT 1|. 1|. .|. 1|. 1|. 1|. 1 424 . C A,T . PASS AC=3,0;AN=3;SF=2,6,9,14,15,16,19 GT .|. 1|. .|. 1|. .|. 1|. 1 465 . G A . PASS AC=1;AN=1;SF=5,15 GT .|. .|. .|. .|. .|. 1|. 1 508 . T C . PASS AC=5;AN=5;SF=1,2,3,5,6,9,10,12,14,15,16,17,19,20 GT 1|. 1|. .|. 1|. 1|. 1|.
But my output is still nan:
CHR POS1 POS2 N_CHR R^2 D Dprime 1 306 375 1 nan 0 nan 1 306 465 0 nan nan nan 1 306 508 1 nan 0 nan 1 306 528 1 nan 0 nan 1 306 711 1 nan 0 nan 1 306 723 1 nan 0 nan 1 306 727 0 nan nan nan 1 306 856 0 nan nan nan 1 306 864 1 nan 0 nan
Any ideas on what I'm doing wrong? Thank you very much!