Change rsID to chromosome:location in bim files for plink
4.9 years ago

Some of the SNPs in my dataset have no rs IDs, so the bim file has dots in place of their IDs. To avoid errors in downstream analysis, how should I replace the rs ID column with chromosome:location values instead?

genome snp plink • 8.9k views
4.9 years ago
zx8754 11k

Using awk (not tested):

awk '{if($2 == ".") {print $1,$1"_"$3,$4,$5,$6} else {print $0}}' myfile.bim > myfileClean.bim

Thanks for the answer. There is just a typo in the awk command: it should be $1"_"$4 instead of $1"_"$3 (if one wants the bp coordinates in the ID, which I think would usually be the case), and $3 is missing from the output as well:

awk '{if($2 == ".") {print $1,$1"_"$4,$3,$4,$5,$6} else {print $0}}' myfile.bim > myfileClean.bim
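
A variant of the corrected command, as a minimal sketch: it uses `:` as the separator to match the chromosome:location form asked for in the question, and writes every line tab-separated so that modified and unmodified lines share one layout. The `demo.bim` file and its three variants are made up for illustration; substitute your own file names.

```shell
# Build a tiny .bim file for demonstration (made-up data).
# Columns: chromosome, variant ID, cM position, bp position, allele 1, allele 2.
printf '1\trs123\t0\t1000\tA\tG\n1\t.\t0\t2000\tC\tT\n2\t.\t0\t3000\tG\tA\n' > demo.bim

# If the ID column is ".", overwrite it with chr:bp.
# "$1 = $1" forces awk to rebuild every line with OFS, so the output
# is tab-separated whether or not the ID was changed.
awk -v OFS='\t' '{ if ($2 == ".") $2 = $1 ":" $4; $1 = $1; print }' demo.bim > demoClean.bim

cat demoClean.bim
```

Note that two variants at the same chromosome and position would receive the same ID with this scheme, so it is worth checking for duplicates afterwards (for example with `cut -f2 demoClean.bim | sort | uniq -d`).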
4.9 years ago

This can also be done with plink 2.0's --set-all-var-ids flag.

But it's more valuable to understand the awk solution, since that opens the door to a broadly useful set of text-manipulation techniques.

2.1 years ago
jk587 • 0

Or use plink's --set-missing-var-ids option with the template @:#. Note that the : can be replaced by other characters.
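
For instance, a full invocation might look like the sketch below. In the template, `@` stands for the chromosome code and `#` for the base-pair position. The prefixes `myfile` and `myfileClean` are placeholders, and the template is quoted so that the shell passes it through unchanged.

```shell
# Assumes myfile.bed/.bim/.fam exist; writes myfileClean.bed/.bim/.fam.
# Only variants whose ID is missing (".") are renamed to chr:bp.
plink --bfile myfile --set-missing-var-ids '@:#' --make-bed --out myfileClean
```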
