Converting SNP from chr:pos to rs number using PLINK?
5.6 years ago
dam4l ▴ 180

Hi,

I have a .txt file with multiple columns, one of which lists SNPs in chr:pos format. Is it possible to convert from chr:pos to rs number using PLINK? If so, what commands are required?

 

Thanks!

SNP plink • 18k views

Additionally, I have ~15 million SNPs.


Do you have a list with the corresponding rs numbers? For example: 1:123467932 rs123456789


No, I don't. I just have the SNPs in a .bim file with the following columns:

1    1:693731    0    693731    G    A
1    1:706992    0    706992    T    C

To rename the SNPs, you need a reference list of all known SNPs.

  1. Go to the UCSC Table Browser and download a list of all available SNPs with chr, pos and rs number.
  2. Create a file with chr:pos in column 1 and rs number in column 2.
  3. In PLINK, use --update-name to change your SNP names (--update-map is only needed if you also want to change base-pair positions). WARNING: Some positions have more than one SNP.
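
Step 2 could be sketched roughly as below. This is only a minimal sketch, assuming the Table Browser output is tab-separated with the columns chrom, chromEnd, name in that order, and that the .bim file uses bare chromosome names (1 rather than chr1). The function name `ucsc_to_update_name` and the rs ID in the demo row are made up for illustration.

```python
import io

def ucsc_to_update_name(lines):
    """Yield 'chr:pos<TAB>rsID' rows from UCSC rows (chrom, chromEnd, name)."""
    for line in lines:
        # Skip blank lines and the header line, which starts with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, chrom_end, name = line.rstrip("\n").split("\t")[:3]
        # UCSC writes 'chr1'; the .bim shown in this thread uses '1'.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        yield f"{chrom}:{chrom_end}\t{name}"

# Made-up demo input standing in for the real Table Browser download.
example = io.StringIO("#chrom\tchromEnd\tname\nchr1\t693731\trsEXAMPLE1\n")
rows = list(ucsc_to_update_name(example))
print(rows)
```

The resulting two-column file can then be given to PLINK, for example `plink --bfile mydata --update-name update_name.txt --make-bed --out mydata_rs` (all file names here are placeholders). By default --update-name reads the old ID from column 1 and the new ID from column 2, which matches the layout in step 2.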

Thanks for your reply! Which settings on the UCSC Table Browser will allow me to download this list?

  • clade: Mammal
  • genome: Human
  • assembly: Same as your SNPs
  • group: Variation
  • track: All SNPs (latest)
  • table: All SNPs
  • region: Genome
  • output format: selected fields from primary and related tables
  • output file: All-SNPs.txt
  • file type returned: plain text

Click "get output".

Under "Select Fields from...", tick:

  • chrom
  • chromEnd
  • name

Thanks so much!


Thank you so much! I followed your method to download the chromosome/position/rs# information and was ready to do the conversion with the PLINK --update-name flag, but I realized that some positions have multiple rs numbers, and PLINK stopped with an error. How can I remove the duplicates from the >100,000,000 SNPs? (I did a whole-genome imputation, so I downloaded the SNPs for the whole genome as you suggested.)


You have two possible solutions.

  1. Do not update any SNP with two or more possible rs numbers. It's not perfect, but it's a good temporary solution.
  2. Find the correct rs number by comparing alleles. On UCSC, you can download the alleles for each SNP, then choose the rs number whose alleles match yours.
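
Solution 1 might look something like the sketch below: count how often each chr:pos appears in the mapping and keep only the ones that appear exactly once. The function name `drop_ambiguous` and the rs IDs are hypothetical, and the whole mapping is held in memory, which may need rethinking for a list of more than 100 million rows.

```python
from collections import Counter

def drop_ambiguous(pairs):
    """Keep only (chr:pos, rsID) pairs whose chr:pos occurs exactly once."""
    pairs = list(pairs)
    # Count how many rs IDs claim each chr:pos.
    counts = Counter(old_id for old_id, _ in pairs)
    return [(old_id, rs) for old_id, rs in pairs if counts[old_id] == 1]

# Made-up rows: the first position has two candidate rs IDs.
pairs = [
    ("1:100002443", "rsA"),
    ("1:100002443", "rsB"),
    ("1:693731", "rsC"),
]
unique_pairs = drop_ambiguous(pairs)
print(unique_pairs)
```

Writing only the surviving pairs to the --update-name file avoids the duplicate ID error; variants that are not listed in that file simply keep their chr:pos name.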

Hi,

I'm facing a similar duplicate variant ID error after using the --update-name flag in PLINK. The error is shown below.

Error: Duplicate variant ID '1:100002443' in --update-name file.

Regarding possible solution 1 ("Do not update any SNP with two or more possible rs"), what do you mean by that? How can I avoid updating the chr:pos entries that have duplicate rs IDs?

Thank you for your help!


Hi all! I don't know if this is a more recent thing, but I was not able to download the file using the method described here: the download times out because the file is too big for the Table Browser. I found the file containing all SNPs in the UCSC download directory (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz).

[UPDATE] But actually, I used only the common SNPs table.
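
For anyone reading such a dump directly, a rough sketch follows. It assumes the first five tab-separated columns are bin, chrom, chromStart, chromEnd, name (worth checking against the table schema on UCSC before relying on it); `iter_snp_dump` and the demo file are made up for illustration. Streaming line by line avoids loading the whole file at once.

```python
import gzip
import os
import tempfile

def iter_snp_dump(path):
    """Yield (chrom, chromStart, chromEnd, name) from a gzipped UCSC SNP dump."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Assumed column layout: bin, chrom, chromStart, chromEnd, name, ...
            yield fields[1], int(fields[2]), int(fields[3]), fields[4]

# Tiny made-up file standing in for the real download.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_snp.txt.gz")
with gzip.open(demo_path, "wt") as out:
    out.write("585\tchr1\t693730\t693731\trsEXAMPLE1\t0\t+\n")

records = list(iter_snp_dump(demo_path))
print(records)
```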


Thanks a lot! I have a question. Does "position" in "CHR:POSITION" SNP IDs refer to chromosome start or end position?


I think it's the end position, but double-check in the UCSC browser by viewing your SNP of interest. Bear in mind that this rule only holds for single nucleotide variants (multiallelic variants do not obey it).
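
One way to check this per record, assuming you also download the chromStart field: UCSC tables store 0-based, half-open intervals, so a one-base variant has chromEnd equal to chromStart + 1, and chromEnd is then its 1-based position. A small sketch (the helper name `snv_position` is hypothetical):

```python
def snv_position(chrom_start, chrom_end):
    """Return the 1-based position for a one-base record, else None."""
    # A single-base interval spans exactly one position, so chromEnd
    # is the 1-based coordinate used in chr:pos IDs.
    if chrom_end - chrom_start == 1:
        return chrom_end
    # Zero-length or multi-base records need separate handling.
    return None

print(snv_position(693730, 693731))  # one-base record
print(snv_position(693730, 693730))  # zero-length record
```

Records that return None are the ones for which the "use chromEnd" rule should not be applied blindly.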


Where should I put the file with chr:pos and rsid?
