Hello! I need some help. I have a VCF file that was generated against a reference genome whose chromosomes are named with Roman numerals: chrI, chrII, ... chrM, chrV, etc. This means they are sorted alphabetically rather than numerically, so my chromosomes end up in a silly order, with chrM listed in the middle, for example (why, lord!).
I've tried renaming them to plain digits and single letters (1, 2, 3, ... M, X, Y) with bcftools annotate before generating my bfiles with plink. The issue is that, because chrM is listed somewhere in the middle, the BIM file stops when it reaches M when I try to make the bfiles. This is the command I used:
bcftools norm -Ou -m -any $file.vcf.gz |
  bcftools norm -Ou -f $ref |
  bcftools annotate -Ob -x ID \
    -I +'%CHROM:%POS:%REF:%ALT' |
  plink --bcf /dev/stdin \
    --keep-allele-order \
    --const-fid \
    --allow-extra-chr \
    --make-bed \
    --chr-set 24 \
    --out $file
# I also tried adding --output-chr M
Is there a simple way to address this in the plink command? I'm also trying to figure out a way to sort my VCF so that chrM is listed last, but so far it has been a struggle and I must be thinking about this wrong! Ugh D:
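In case it helps to see what I mean by the renaming and the order I'm after, here is a minimal sketch. It assumes a small, hypothetical contig set (chrI to chrV plus chrM, with chrM in the middle as in my header) and hypothetical file names (chr_map.txt, chr_order.txt); the real genome has its own list. It only builds the "old_name new_name" map that `bcftools annotate --rename-chrs` reads and prints the target order; it does not touch the VCF itself.

```shell
#!/bin/sh
# Hypothetical contig names, listed in the alphabetical order they appear in
# the VCF header (note chrM in the middle). Replace with the real names.
# Format: "old_name new_name", one pair per line, as read by
# `bcftools annotate --rename-chrs`.
cat > chr_map.txt <<'EOF'
chrI 1
chrII 2
chrIII 3
chrIV 4
chrM M
chrV 5
EOF

# Work out the order I would like: numeric names first, anything
# non-numeric (here only M) pushed to the end with a large sort key.
# The value 999 is an arbitrary assumption, just larger than any chromosome number.
awk '{ key = ($2 ~ /^[0-9]+$/) ? $2 : 999; print key, $2 }' chr_map.txt |
  sort -k1,1n |
  cut -d' ' -f2 > chr_order.txt

cat chr_order.txt

# The rename itself would then be something like (not run here):
#   bcftools annotate --rename-chrs chr_map.txt -Oz -o renamed.vcf.gz "$file.vcf.gz"
```

With this toy list, chr_order.txt ends with M, which is the order I would like the records (and the BIM file) to follow.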