                    9.4 years ago
        hellbio
        
    
I get the error below when calculating r2 with vcftools:
vcftools --vcf GATK.QC.MAF5.recode.vcf --hap-r2
VCFtools - v0.1.12b
(C) Adam Auton and Anthony Marcketta 2009
Parameters as interpreted:
--vcf GATK.QC.MAF5.recode.vcf
--max-alleles 2
--min-alleles 2
--hap-r2
--phased
After filtering, kept 133 out of 133 Individuals
Outputting Pairwise LD (phased bi-allelic only)
Error: Insufficient sites remained after filtering
I also tried --geno-r2 together with --phased, which fails as expected because the variants are unphased:
vcftools --vcf 133Samples.GATK.QC.MAF5.recode.vcf --geno-r2  --phased
VCFtools - v0.1.12b
(C) Adam Auton and Anthony Marcketta 2009
Parameters as interpreted:
--vcf 133Samples.GATK.QC.MAF5.recode.vcf
--geno-r2
--max-alleles 2
--min-alleles 2
--phased
After filtering, kept 133 out of 133 Individuals
Outputting genotype pairwise LD (bi-allelic only) for a set of SNPs versus all others.
After filtering, kept 0 out of a possible 9792407 Sites
No data left for analysis!
Run Time = 444.00 seconds
Could anyone comment on how to calculate r2 using vcftools?
There was a bug in some of these functions in VCFtools 0.1.12. A good place to start would be to upgrade to 0.1.14 (the latest at the time of writing).
In my case, I saw that I had a non-phased individual that had been merged with the phased ones and all sites got removed as a result.
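One way to check for that situation before running --hap-r2 is to count phased ("|") and unphased ("/") genotypes per sample. The snippet below is only a rough sketch: count_phasing is a hypothetical helper name, not part of vcftools, and it assumes an uncompressed, tab-delimited VCF with GT as the first FORMAT field.

```shell
# Hypothetical helper (not part of vcftools): per-sample count of
# phased ("|") and unphased ("/") genotype calls in an uncompressed VCF.
count_phasing() {
  awk -F'\t' '
    /^#CHROM/ {                      # header line holding the sample names
      for (i = 10; i <= NF; i++) name[i] = $i
      n = NF
      next
    }
    /^#/ { next }                    # skip the remaining meta lines
    {
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")            # assumes GT is the first FORMAT field
        if (f[1] ~ /\|/) phased[i]++
        else if (f[1] ~ /\//) unphased[i]++
      }
    }
    END {
      for (i = 10; i <= n; i++)
        printf "%s\tphased=%d\tunphased=%d\n", name[i], phased[i], unphased[i]
    }
  ' "$1"
}
```

Calling count_phasing GATK.QC.MAF5.recode.vcf prints one line per sample. If any sample shows a non-zero unphased count, --hap-r2 (which implies --phased) is likely to discard those sites. For unphased data, running --geno-r2 on its own, without --phased, should be the option to try instead.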