I've got .bed/.bim/.fam SNP files and am using physical bp positions to extract SNP subsets in PLINK (version 1.9). I got my coordinates from UCSC Table Browser so some of the chromosome IDs are unusual, like "6_ssto_hap7" or "Un_gl000211".
I tried using --allow-extra-chr 0 to include these, but it doesn't seem to be working, as I still get the same error:
Error: Invalid chromosome code on line 938486 of --extract range file.
(Line 938486 is where the unusual chromosome IDs start)
I've been using this command:
plink --bfile <filename> --extract range <coordinates.txt> --make-bed --out <newfile> --allow-extra-chr 0
If I delete all the coordinates with the unusual chromosome IDs, I can successfully extract SNPs.
Any suggestions to fix this?
Have you tried using --allow-extra-chr without the 0?
Have you tried looking at lines 938485, 938486 and 938487 to see whether they are just broken somehow?
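In case it helps, here is a minimal Python sketch for inspecting those lines. The function name show_lines and the file name in the usage note are placeholders, not anything PLINK provides. It returns each line's repr, so stray tabs, carriage returns or other invisible characters become visible.

```python
from itertools import islice


def show_lines(path, first, last):
    """Return (line_number, repr(line)) pairs for 1-based lines first..last."""
    # newline="" keeps the original line endings, so a Windows-style "\r\n"
    # shows up in the repr instead of being silently converted to "\n".
    with open(path, "r", newline="") as fh:
        window = islice(fh, first - 1, last)
        return [(n, repr(line)) for n, line in enumerate(window, start=first)]
```

For example, `for n, text in show_lines("coordinates.txt", 938485, 938487): print(n, text)` prints the three lines around the reported error.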
Yes, I've tried both of your suggestions. Those lines have exactly the same formatting as the others, but have IDs like "6_ssto_hap7". I tried changing one of those to just "6" and there was no error for that line.
Can you post the full .log file from your failed run? (Please make sure the version date/number is included.)
Ok, it's erroring out because the chromosome code isn't in your dataset. This is a bug: "--extract range" should just ignore that line. I'll post a fix tonight.
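To see which range-file lines are affected, a rough Python sketch like the following lists the chromosome codes in the range file that never occur in the .bim file. It assumes both files are whitespace-delimited with the chromosome code in the first column. The function names are made up for illustration, and the comparison is a literal string match, so a harmless difference such as a "chr" prefix would also be reported.

```python
def read_first_column(path):
    """Return the first whitespace-delimited field of every line (None if blank)."""
    codes = []
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            codes.append(fields[0] if fields else None)
    return codes


def codes_missing_from_bim(bim_path, range_path):
    """Map each range-file chromosome code absent from the .bim to its first line number."""
    bim_codes = {c for c in read_first_column(bim_path) if c is not None}
    missing = {}
    for lineno, code in enumerate(read_first_column(range_path), start=1):
        if code is not None and code not in bim_codes:
            # Keep only the first line on which each missing code appears.
            missing.setdefault(code, lineno)
    return missing
```

Calling `codes_missing_from_bim("mydata.bim", "coordinates.txt")` (placeholder file names) returns a dictionary such as code to first line number, which can be compared against the line number in the error message.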
Sorry, just realised I accidentally deleted "range" from --extract range in the log file when I was changing the txt file name
Bugfix is now posted.
Thanks for that. I no longer receive the error, but I know there are SNPs within some of the bp ranges with unusual chromosome IDs and they're not being extracted; only SNPs on chromosomes 1-23 are. I would still like to extract them.
I also manually manipulated the bp position files to increase the search window but now I'm getting this error:
Is there a way to ignore that these positions are invalid and continue searching anyway?
At this point, I’ll need you to send me a set of files to reproduce what you’re seeing.
For the record, the problem was a negative position value, resulting from subtracting 5000 from the original interval-start and adding 5000 to the original interval-end coordinates.
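For anyone widening intervals by hand, here is a minimal Python sketch of the same +/- 5000 bp padding that clamps the start so it cannot go negative. It assumes a whitespace-delimited range line of the form chromosome, start, end, followed by any further columns. The function name pad_range_line and the floor of 0 are my own choices for illustration; use a floor of 1 if you prefer strictly positive positions.

```python
def pad_range_line(line, pad=5000, floor=0):
    """Widen one range-file line by `pad` bp on each side, clamping the start at `floor`."""
    fields = line.split()
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    new_start = max(floor, start - pad)  # never drop below the floor
    new_end = end + pad
    # Any columns after the end position (e.g. a set ID) are passed through unchanged.
    return "\t".join([chrom, str(new_start), str(new_end)] + fields[3:])
```

With the defaults, an interval starting at 1200 gets a start of 0 rather than -3800, which avoids the negative-position error described above.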
This will remain an error in plink 1.9 and 2.0. However, today's plink 2.0 build adds --bed-border-bp/--bed-border-kb flags which perform this interval-extension for you.
Thanks for your help!