Question: Error while running BEAGLE for genotype imputation
17 months ago by
Shab86 wrote:

I am trying to run BEAGLE 4.1 for imputation. I have core exome chip data for variants on chromosome 20 in BED/BIM/FAM format, which I phased and converted to VCF format. The reference panel is also a phased VCF. All phasing was done in SHAPEIT, and the files were converted using its convert option.

But when I try to run a BEAGLE imputation with this command: java -jar beagle.jar gt=test.vcf ref=chr20.vcf impute=TRUE

I get this error: ERROR: REF field is not a sequence of A, C, T, G, or N characters at newrs11467497:126156 [D]

I am new to this and can't understand what the error is about. Can anyone please help me out?

17 months ago by
WouterDeCoster wrote:

If I remember correctly (it has been a while since I used BEAGLE), it only operates on SNPs. And indeed, the variant you run into a problem with (rs11467497) is an indel.

The error message says as much: the REF field must be a sequence of A, C, T, G, or N characters, but for this variant the reference is 'CAAA' or '-'.

I suggest prefiltering your VCF file to remove indels.
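
As a rough illustration of that prefiltering step, here is a minimal Python sketch. It assumes an uncompressed, tab-delimited VCF and keeps header lines plus records whose REF and every ALT allele are a single A, C, G, T, or N. The function names and the output file name in the usage note are made up for this example, and a dedicated VCF tool may be a better choice for real data.

```python
import re

# A single-nucleotide allele: exactly one of A, C, G, T, N (case-insensitive).
_SINGLE_BASE = re.compile(r"[ACGTN]", re.IGNORECASE)


def is_snv_record(line):
    """Return True if a VCF data line has a single-base REF and single-base ALTs."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False  # malformed record: too few columns
    ref = fields[3]
    alts = fields[4].split(",")
    return all(_SINGLE_BASE.fullmatch(allele) for allele in [ref] + alts)


def filter_snvs(lines):
    """Yield header lines unchanged and only those records that pass is_snv_record."""
    for line in lines:
        if line.startswith("#") or is_snv_record(line):
            yield line


def filter_vcf(in_path, out_path):
    """Write a copy of in_path to out_path with non-SNV records removed."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_snvs(src))
```

For example, filter_vcf("test.vcf", "test.snps.vcf") would write a SNP-only copy of the target file (the output name is hypothetical). Records coded with D/I alleles, like the one in the error message, are dropped, and so are records with a missing ALT ("."). If the offending variant is in the reference panel rather than the target file, the same filter would need to be applied there too.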

Awesome! Thanks for your answer :)

