Change SNPID in vcf file with awk
4 months ago
L_to_the_m ▴ 10

Hi, I have a VCF file with SNP IDs like this:

AX-14233402__rs35404821
AX-37499887__rs74704183
AX-36783275__rs11997571

I would like to change the SNP IDs to have only the IDs without the AX-... term:

rs35404821
rs74704183
rs11997571

Is there any solution for this? I tried with a gsub command, but nothing changed:

awk '{gsub(/AX*_rs/,"rs"); print}' datafile.vcf > datafile_ID.vcf
vcf SNPID • 374 views

Do not delete posts when they've been addressed. If one or more solutions worked, accept them using the green check mark.

Can you try one of these three?

$ awk -F "AX-.*__" '{print $1$2}' test.vcf 
$ awk -F "AX-.*__rs" -v OFS="rs" '{$1=$1; print}' test.vcf 
$ sed '/\tAX.*__rs/ s//\trs/' test.vcf
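
As for why the original gsub changed nothing: in /AX*_rs/ the * applies only to the preceding X, so the pattern reads "A, zero or more X, then _rs" and cannot span the "-14233402_" in between. A minimal sketch showing the difference, assuming a made-up one-line record (the sample line is hypothetical, not taken from the poster's file):

```shell
# Hypothetical single VCF-like record (tab-separated), for illustration only.
line=$(printf '1\t1000\tAX-14233402__rs35404821\tA\tG')

# Original pattern: it never matches, so the line comes back unchanged.
unchanged=$(printf '%s\n' "$line" | awk '{gsub(/AX*_rs/,"rs"); print}')

# Matching the whole prefix up to the double underscore removes it.
fixed=$(printf '%s\n' "$line" | awk '{gsub(/AX-[0-9]+__/,""); print}')

printf '%s\n%s\n' "$unchanged" "$fixed"
```

The second pattern assumes the part after "AX-" is always digits, as in the three IDs shown in the question.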
4 months ago
 awk -F '\t' '/^#/ {print;next;} {OFS="\t";gsub(/.*__rs/,"rs",$3);print}' < in.vcf 
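
It may be worth checking the result on a small test file before running this on real data. The sketch below builds a made-up VCF fragment (the temporary directory, file names, and records are all hypothetical) and uses sub anchored at the start of the ID column. That is a slightly stricter variant of the command above, under the assumption that every ID to fix starts with "AX-" and contains no underscore before the "__":

```shell
# Build a tiny hypothetical VCF fragment: one header line, two records.
tmp=$(mktemp -d)
printf '#CHROM\tPOS\tID\tREF\tALT\n'               > "$tmp/in.vcf"
printf '8\t100\tAX-14233402__rs35404821\tA\tG\n'  >> "$tmp/in.vcf"
printf '8\t200\tAX-37499887__rs74704183\tC\tT\n'  >> "$tmp/in.vcf"

# Pass header lines through; strip the "AX-...__" prefix from column 3 only.
awk 'BEGIN {FS = OFS = "\t"}
     /^#/  {print; next}
           {sub(/^AX-[^_]*__/, "", $3); print}' "$tmp/in.vcf" > "$tmp/out.vcf"

# Show the ID column of the result.
cut -f3 "$tmp/out.vcf"
```

Setting FS and OFS together in BEGIN keeps the output tab-separated after $3 is modified, and restricting the substitution to $3 leaves the other columns (INFO, for example) untouched.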