I need to run an imputation using the Sanger Imputation Server.
I prepared my data (aligned to the reference panel) and submitted it, but I received the following e-mail:
Update from Sanger Imputation Service:
--- Aborted Job --- The genotype probability distribution in the input file does not match the reference panel frequencies well. The number of genotypes expected with low frequencies under HWE (with P<=0.1) is too big in the user data: 0.59 whereas the threshold is 0.26. For comparison, the number of these genotypes in 1000Genomes data is 0.17, the attached plot shows typical GT distributions
This is usually an indicator of REF,ALT alleles being on incorrect strand. Another frequent problem is the VCF using a different reference sequence, for example GRCh38 instead of GRCh37.
The attached graph was produced using the bcftools/af-dist plugin, check these links http://samtools.github.io/bcftools/howtos/plugin.af-dist.html http://samtools.github.io/bcftools/howtos/plugin.fixref.html
--- Help --- Please check these links for help
How can I solve this?
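
To make the strand problem mentioned in the e-mail concrete, here is a minimal Python sketch of the kind of per-site check I understand it to mean: comparing the REF and ALT alleles of a biallelic SNP against the reference genome base at that position. The function name `classify_site` and the category labels are hypothetical names of my own, not part of bcftools; the fixref plugin linked in the e-mail is the actual tool for this, and this sketch only illustrates the idea.

```python
# Hypothetical helper, for illustration only (not part of bcftools).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_site(ref_base, ref, alt):
    """Classify a biallelic SNP against the reference genome base.

    ref_base: base in the reference FASTA at the variant position
    ref, alt: REF and ALT alleles as written in the VCF
    """
    ref_base, ref, alt = ref_base.upper(), ref.upper(), alt.upper()
    # Only simple A/C/G/T SNPs with two distinct alleles are handled here.
    if any(b not in COMPLEMENT for b in (ref_base, ref, alt)) or ref == alt:
        return "unresolved"
    if ref == ref_base:
        return "ok"  # REF agrees with the reference genome
    if alt == ref_base:
        # A/T and C/G SNPs look the same on both strands, so a swap
        # cannot be told apart from a strand flip by alleles alone.
        return "ambiguous" if COMPLEMENT[ref] == alt else "swapped"
    if COMPLEMENT[ref] == ref_base:
        return "flipped"  # alleles reported on the opposite strand
    if COMPLEMENT[alt] == ref_base:
        return "flipped+swapped"
    return "unresolved"  # neither allele matches on either strand


if __name__ == "__main__":
    for site in [("A", "A", "G"), ("A", "G", "A"), ("A", "T", "C")]:
        print(site, classify_site(*site))
```

If many sites fall into anything other than "ok", that would be consistent with the strand or reference-build mismatch the e-mail describes.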