Question: Plink: how to recode SNP identifiers
jamespoweraid2 (United States) wrote, 4.1 years ago:

Hi,

I have two files I need to merge in Plink, but the SNP identifiers are different, i.e.

file1

1       rs75454623:14930:A:G    0       14930   G       A

1       rs199856693:14933:G:A   0       14933   A       G

file2

1       rs75454623:    0       14930   1       2

1       rs12354060      134304  10004   1       2

1       rs2691310       327454  46844   0       0

Is there any way in Plink that I can change the SNP IDs in the first file, so that I can merge these two files?

Thank you very much for your help.

christopher medway (Cardiff, UK) wrote, 4.1 years ago:

It seems to me that merging these PLINK files raises two issues: i) the SNP IDs differ, and ii) the allele coding differs.

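i) Assuming the rsIDs and the genome coordinates are consistent across your two files, you could strip the position and allele suffix (everything from the first colon onward) from the SNP IDs in file1, leaving only the rsIDs. This Perl one-liner reprints the map file with the suffixes stripped out; lines that contain no colon are printed unchanged.

perl -pe 's/:\S*//g' file1.txt > output.txt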
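For anyone who prefers not to run a regex over the whole line, here is a minimal Python sketch of the same idea. It assumes a whitespace-delimited .map or .bim style file whose second column is the variant ID; the function names and file paths are made up for illustration.

```python
def strip_id_suffix(line: str) -> str:
    """Return one .map/.bim line with the variant ID cut at its first ':'."""
    fields = line.split()
    if len(fields) < 2:
        # Blank or malformed line: pass it through without the newline.
        return line.rstrip("\n")
    # Column 2 holds the variant ID, e.g. rs75454623:14930:A:G -> rs75454623
    fields[1] = fields[1].split(":", 1)[0]
    return "\t".join(fields)


def strip_file(src: str, dst: str) -> None:
    """Rewrite src into dst with suffix-free variant IDs (tab-delimited)."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(strip_id_suffix(line) + "\n")


if __name__ == "__main__":
    print(strip_id_suffix("1\trs75454623:14930:A:G\t0\t14930\tG\tA"))
```

Only the ID column is touched, so a colon appearing in any other column would be left alone. If two variants share an rsID and differ only in the suffix, they will end up with identical IDs, so it is worth checking for duplicates before merging.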

ii) You can recode your file1 from ACGT to 1/2 allele coding using --recode12 in PLINK. This isn't without its problems; I think PLINK decides which allele is 1 and which is 2 based on allele frequency (which one is the more common allele). This is usually fine, but it can introduce problems when both alleles are common (minor allele frequency close to 0.5), because which allele counts as the 'minor' one becomes ambiguous and may differ between the two files.
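To make the ambiguity concrete, here is a small Python sketch of frequency-based 1/2 coding. It is not PLINK's code: it simply assumes, for illustration, that the less common allele in a sample is coded 1 and the more common one 2, with ties broken alphabetically. The cohorts are invented.

```python
from collections import Counter


def code_12(alleles: str) -> dict:
    """Map each allele of a biallelic SNP to '1' or '2' by sample frequency.

    Assumption for this sketch: '1' = less common allele, '2' = more common.
    """
    counts = Counter(alleles)
    if len(counts) != 2:
        raise ValueError("expected exactly two distinct alleles")
    # Sort by ascending count; ties are broken alphabetically.
    minor, major = sorted(counts, key=lambda a: (counts[a], a))
    return {minor: "1", major: "2"}


if __name__ == "__main__":
    cohort_a = "A" * 48 + "G" * 52  # A is slightly rarer here
    cohort_b = "A" * 53 + "G" * 47  # G is slightly rarer here
    print(code_12(cohort_a))
    print(code_12(cohort_b))
```

With allele frequencies this close to 0.5, the two cohorts assign 1 and 2 to opposite alleles, so merging on the 1/2 codes alone would silently flip genotypes for that SNP.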

This may be a solution, or it may not.


Reply from jamespoweraid2 (4.1 years ago): I am sorry, I mistakenly posted a comment as an answer and cannot delete it...
jamespoweraid2 (United States) wrote, 4.1 years ago:

Thank you Christopher for your help! I realize Plink can recode the alleles, which is great, but can it also recode the SNP IDs in file 1 so that they are in the same format as the SNP IDs in file 2, i.e. keeping only the leading rsID, and then create BED/BIM/FAM files with the SNPs in the new format?

Thank you!

christopher medway (Cardiff, UK) wrote, 4.1 years ago:

There may be a way, in PLINK, of updating the rsIDs in the first file using the rsIDs in the second file, but I am not aware of one.
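One possibility worth checking: PLINK documents an --update-name flag that renames variants from a two-column table (old ID, then new ID by default). The Python sketch below only builds such a table from .bim lines; the function name, the file names in the comments, and the assumption that the new ID is simply the part before the first colon are all illustrative.

```python
def build_rename_pairs(bim_lines):
    """Return (old_id, new_id) pairs for variant IDs that carry a ':' suffix."""
    pairs = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        old_id = fields[1]
        new_id = old_id.split(":", 1)[0]
        if new_id != old_id:
            pairs.append((old_id, new_id))
    return pairs


if __name__ == "__main__":
    # Hypothetical file names; adjust to your own data.
    with open("file1.bim") as fin:
        pairs = build_rename_pairs(fin)
    with open("rename.txt", "w") as fout:
        for old_id, new_id in pairs:
            fout.write(f"{old_id}\t{new_id}\n")
    # Then, assuming PLINK 1.9 and a binary fileset named file1:
    #   plink --bfile file1 --update-name rename.txt --make-bed --out file1_renamed
```

This keeps the original fileset untouched and lets PLINK write the renamed BED/BIM/FAM set. Variants that would collapse to the same rsID after stripping should be checked by hand first.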
