I have recently converted a VCF file containing 40 samples into Plink format using the Plink --make-bed flag. The file (name: input_data.bim) I'm left with is in the following format:
10 . 0 45265 A C
10 . 0 45402 T C
10 . 0 45781 C CA
10 . 0 46126 G A
10 . 0 46915 T C
10 . 0 47001 CAGAACACAGTAA C
My aim is to replace the . value in the second column with a dbSNP rsID by cross-referencing chromosome and position (columns 1 and 4 of the .bim). I have found this previous post a good starting point and am trying to follow the same logic, but I must be missing something.
I have my Plink files (.bim, .bed, .fam) and the dbsnp153.txt file downloaded from the UCSC Genome Browser. The download included all fields by default, but I have reduced it to the format below (filename: hg38_dbsnp153_final):
#chrom chromStart name
1 10177 rs367896724
1 10352 rs555500075
1 11007 rs575272151
1 11011 rs544419019
1 13109 rs540538026
1 13115 rs62635286
I then run the following:
sudo plink1.9 --bfile input_data --update-name hg38_dbsnp153_final --make-bed --out mydata
This results in the following duplicate ID error:
PLINK v1.90b6.16 64-bit (17 Feb 2020)            www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to mydata.log.
Options in effect:
  --bfile input_data
  --make-bed
  --out mydata
  --update-name hg38_dbsnp153_final

128894 MB RAM detected; reserving 64447 MB for main workspace.
35624 variants loaded from .bim file.
58 people (0 males, 0 females, 58 ambiguous) loaded from .fam.
Ambiguous sex IDs written to mydata.nosex .
Error: Duplicate ID '.'.
Can anyone suggest a way in which I can resolve this and assign dbsnp rsids to the currently blank second column of my .bim file?
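For context, --update-name matches variants on their existing ID (old ID in the first column of the update file, new ID in the second), so it cannot distinguish variants while every ID is '.'. A minimal sketch of the chromosome/position lookup described above, rewriting column 2 of the .bim directly, might look like the following in Python. Everything here is illustrative: the function names are hypothetical, and it assumes the position column of the lookup file is already on the same coordinate system as column 4 of the .bim (UCSC chromStart values are normally 0-based, so they may need adjusting first) and that chromosome names match ("10", not "chr10"). Alleles are not checked, and the chrom:pos:a1:a2 fallback ID is just one possible convention.

```python
def bim_keys(bim_lines):
    """Collect the (chrom, pos) pairs present in the .bim so the large
    dbSNP table can be filtered while streaming instead of held in memory."""
    keys = set()
    for line in bim_lines:
        fields = line.split()
        if len(fields) == 6:
            keys.add((fields[0], fields[3]))
    return keys


def load_rsid_lookup(lookup_lines, wanted):
    """Build {(chrom, pos): rsID} from whitespace-delimited lines of the form
    'chrom pos rsID', keeping only positions that occur in the .bim."""
    lookup = {}
    for line in lookup_lines:
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#"):
            continue  # skip the header and malformed lines
        key = (fields[0], fields[1])
        if key in wanted:
            # Assumption: if several rsIDs share a position, keep the first.
            lookup.setdefault(key, fields[2])
    return lookup


def annotate_bim(bim_lines, lookup):
    """Return .bim lines with column 2 replaced by an rsID where one is found,
    otherwise by a chrom:pos:a1:a2 placeholder so IDs stay distinct."""
    used = set()
    out = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) != 6:
            out.append(line.rstrip("\n"))  # leave unexpected lines untouched
            continue
        chrom, _old_id, cm, pos, a1, a2 = fields
        fallback = f"{chrom}:{pos}:{a1}:{a2}"
        new_id = lookup.get((chrom, pos), fallback)
        if new_id in used:
            # Same rsID already assigned (e.g. two records at one position).
            new_id = fallback
        used.add(new_id)
        out.append("\t".join([chrom, new_id, cm, pos, a1, a2]))
    return out


def rewrite_bim(bim_in, lookup_path, bim_out):
    """File-based wrapper: read bim_in, stream lookup_path, write bim_out."""
    with open(bim_in) as fh:
        bim_lines = fh.read().splitlines()
    with open(lookup_path) as fh:
        lookup = load_rsid_lookup(fh, bim_keys(bim_lines))
    with open(bim_out, "w") as fh:
        fh.write("\n".join(annotate_bim(bim_lines, lookup)) + "\n")
```

Under those assumptions, something like rewrite_bim("input_data.bim", "hg38_dbsnp153_final", "mydata.bim") would write the new .bim, which could then be paired with copies of the original .bed and .fam under the same prefix. Writing to a new file rather than overwriting input_data.bim keeps the original intact in case the coordinates turn out not to line up.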