Hello,
I am a novice in bioinformatics. I am attempting genotype imputation for a set of 8K+ samples. I am only interested in imputing chromosome 6, which will be used in downstream analysis. I am running Beagle with the following command:
java -Xss5m -Xmx16g \
-jar beagle.18May20.d20.jar \
gt=immuno_chrom6_for_imputation.vcf.gz \
ref=chr6.1kg.phase3.v5a.b37.bref3 \
map=plink.chr6.GRCh37.map \
out=immuno_chr6_imputed \
nthreads=16 \
ne=20000 \
impute=true \
seed=-99999
The data was passed down to me, and I was told that the appropriate reference genome build is hg19 (GRCh37). When I run this command, I get the following error:
ERROR: Reference and target files have no markers in common in interval: 6:63979-21294564
Is there a way to troubleshoot this? I queried my dataset by position, and the interval 63979-21294564 covers about 800 samples out of 8894.
Any help would be greatly appreciated!
Follow-up: Have you tried running an earlier Beagle version? I had input files that produced this exact error on Beagle 5.1; when I ran the same files on Beagle 4.0, everything went smoothly.
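One way to look at the error above directly is to count how many sites the target and reference share in the reported interval. The following is only a minimal sketch, not part of Beagle. It assumes the reference panel is also available as a VCF (the .bref3 file is binary and cannot be read this way), and it compares only the CHROM and POS columns, so allele mismatches are not checked. The function names and the reference VCF file name in the usage comment are hypothetical.

```python
import gzip


def normalize(chrom):
    """Drop a leading 'chr' so that '6' and 'chr6' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def read_positions(vcf_path, start, end):
    """Collect (CHROM, POS) pairs for records with start <= POS <= end."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    sites = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):  # skip meta and header lines
                continue
            chrom, pos = line.split("\t", 2)[:2]
            if start <= int(pos) <= end:
                sites.add((chrom, int(pos)))
    return sites


def summarize_overlap(target_sites, ref_sites):
    """Count shared sites, with and without the 'chr' prefix."""
    target_norm = {(normalize(c), p) for c, p in target_sites}
    ref_norm = {(normalize(c), p) for c, p in ref_sites}
    return {
        "target": len(target_sites),
        "reference": len(ref_sites),
        "shared_exact": len(target_sites & ref_sites),
        "shared_ignoring_chr_prefix": len(target_norm & ref_norm),
    }


# Example usage (the reference VCF name is a hypothetical placeholder):
# target = read_positions("immuno_chrom6_for_imputation.vcf.gz", 63979, 21294564)
# ref = read_positions("chr6.1kg.phase3.v5a.b37.vcf.gz", 63979, 21294564)
# print(summarize_overlap(target, ref))
```

If shared_exact is 0 but shared_ignoring_chr_prefix is not, the two files probably label the chromosome differently ("6" versus "chr6"). If both are 0, the positions themselves do not line up, which may point to the target coordinates being on a different genome build than the reference.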
Best