Genetic distance in cM from VCF of non-reference species to run Beagle
9 months ago
AndrMod • 0

I'm working with a resequenced genome of a non-reference species.

The VCF contains ~7 million SNPs, each with its position on its own chromosome. About 10.01% of the genotypes are missing, so I need to impute them. I eventually settled on Beagle v5 as a tool, since it can do this job even without a reference panel of phased and completely genotyped individuals.

However, Beagle also asks for a .map file with the genetic distances in cM, which is giving me a lot of trouble. The species lacks a linkage map at the SNP level, so I was thinking of computing one starting from the population recombination rate; however, I would obtain a single value, which is by no means useful for getting the different cM distances.

Indeed, when I ran Beagle with the output of PLINK 1.9, which has all the genetic distances set to 0, I got this error:

Exception in thread "main" java.lang.IllegalArgumentException: All loci in genetic map have the same genetic position [0.0]: CHROM_1

My current command line for PLINK, as suggested here, is

plink1.9 --bfile ./PEDwithMorgans_v2/CHROMnumberBhagaVentoux --cm-map ./PEDwithMorgans_v2/Bhaga_@_103chrom_v2.txt --make-bed  --recode --out ./PEDwithMorgans_v2/IndividualBhaga_v2_cms --allow-extra-chr

Also, I'm a tad confused by the usage of "genetic distance" here. Usually I assume it's a pairwise measure between different markers, but the map format clearly requires a single value per marker.
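To make the two points above concrete, here is a minimal sketch (not a validated pipeline) of how a single recombination rate could still be turned into per-marker values. It assumes the map column holds each marker's cumulative position in cM from the start of its chromosome, so that the pairwise distance between two markers is simply the difference of their positions. The function names, the example positions and the rate of 1 cM per Mb are all hypothetical placeholders; the four-column layout (chromosome, marker ID, cM, bp) follows the PLINK .map convention, and the real rate for this species would have to be estimated separately.

```python
def constant_rate_map(chrom, positions_bp, cm_per_mb=1.0):
    """Build PLINK-style map rows (chrom, id, cM, bp) under a uniform rate.

    cm_per_mb is an assumed placeholder, not a measured value.
    """
    rows = []
    for bp in sorted(positions_bp):
        # Cumulative genetic position from the chromosome start, in cM.
        cm = bp / 1_000_000 * cm_per_mb
        rows.append((chrom, ".", cm, bp))
    return rows


def write_map(rows, path):
    """Write the rows as a tab-delimited four-column map file."""
    with open(path, "w") as out:
        for chrom, marker_id, cm, bp in rows:
            out.write(f"{chrom}\t{marker_id}\t{cm:.6f}\t{bp}\n")


# Hypothetical SNP positions on one chromosome.
rows = constant_rate_map("CHROM_1", [1_000, 500_000, 2_000_000])

# The pairwise distance between two markers is the difference
# of their cumulative positions.
pairwise_cm = rows[2][2] - rows[1][2]

for row in rows:
    print(*row, sep="\t")
```

With a uniform rate the cM values are just a rescaling of the physical positions, so this only avoids the "all loci have the same genetic position" error; it does not capture real variation in recombination along the chromosome.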

Can you please point me to a useful tool to do this?

I deeply thank you in advance.

plink beagle vcf • 434 views
