Question: Converting SNP from chr:pos to rs number using PLINK?
4.6 years ago by
dam4l170
dam4l170 wrote:

Hi,

I have a .txt file with multiple columns, one of which lists SNPs in chr:pos format. Is it possible to convert from chr:pos to rs number using PLINK? If so, what commands are required?

 

Thanks!

snp plink • 15k views
ADD COMMENTlink modified 4.4 years ago by Biostar ♦♦ 20 • written 4.6 years ago by dam4l170

Additionally, I have ~ 15 million SNPs.

ADD REPLYlink written 4.6 years ago by dam4l170

Do you have a list with the corresponding rs numbers? E.g.: 1:123467932 rs123456789

ADD REPLYlink modified 7 months ago by RamRS28k • written 4.6 years ago by Maxime Lamontagne2.2k

No, I don't. I just have the SNPs in a .bim file with the following columns:

1    1:693731    0    693731    G    A
1    1:706992    0    706992    T    C
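
For reference, the six .bim columns shown above (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2) could be read as in the following minimal sketch. The function name `read_bim_ids` is hypothetical, and the sketch assumes whitespace-separated fields as in the two example rows.

```python
def read_bim_ids(lines):
    """Map each variant ID in a .bim file to (chrom, position, allele1, allele2)."""
    variants = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 6:          # skip blank or malformed lines
            continue
        chrom, variant_id, _cm, bp, a1, a2 = fields
        variants[variant_id] = (chrom, int(bp), a1, a2)
    return variants

# The two example rows from the post above.
example = [
    "1    1:693731    0    693731    G    A",
    "1    1:706992    0    706992    T    C",
]
bim = read_bim_ids(example)
print(sorted(bim))
```

Knowing which chr:pos IDs are actually present makes it possible to filter a much larger reference list down to the ~15 million variants of interest before renaming.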
ADD REPLYlink modified 7 months ago by RamRS28k • written 4.6 years ago by dam4l170

To rename the SNPs, you need a reference list of all known SNPs with their positions.

  1. Go to the UCSC Table Browser and download a list of all available SNPs with chromosome, position and rs number.
  2. Create a file with chr:pos in column 1 and rs number in column 2.
  3. In PLINK, use the --update-map and --update-name options to rename your SNPs (in PLINK 1.9, --update-name followed by the file name is enough). WARNING: some positions have more than one SNP.
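
Step 2 could be sketched as follows. This is only an illustration, assuming a tab-separated export with three columns (chromosome, position, rs number) and an optional header line starting with `#`; the function name `write_update_name` and the rs IDs in the demo are made up.

```python
import io

def write_update_name(src, dst):
    """Convert rows of (chrom, position, rsID) into 'chr:pos<TAB>rsID' lines.

    Returns the number of mapping lines written.
    """
    written = 0
    for line in src:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):   # skip the header and blank lines
            continue
        chrom, pos, name = line.split("\t")[:3]
        if chrom.startswith("chr"):            # the .bim IDs look like '1:693731'
            chrom = chrom[3:]
        dst.write(f"{chrom}:{pos}\t{name}\n")
        written += 1
    return written

# Tiny in-memory demo; the rs IDs are placeholders, not real identifiers.
demo_in = io.StringIO(
    "#chrom\tchromEnd\tname\n"
    "chr1\t693731\trs0000001\n"
    "chr1\t706992\trs0000002\n"
)
demo_out = io.StringIO()
write_update_name(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

With PLINK 1.9, the resulting file could then be passed as, for example, `plink --bfile mydata --update-name update_name.txt --make-bed --out mydata_rs`, where all file names are placeholders.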
ADD REPLYlink modified 7 months ago by RamRS28k • written 4.6 years ago by Maxime Lamontagne2.2k

Thanks for your reply! Which settings on the UCSC Table Browser will allow me to download this list?

ADD REPLYlink modified 7 months ago by RamRS28k • written 4.6 years ago by dam4l170
  • clade: Mammal
  • genome: Human
  • assembly: Same as your SNPs
  • group: Variation
  • track: All SNPs (latest)
  • table: All SNPs
  • region: Genome
  • output format: selected fields from primary and related tables
  • output file: All-SNPs.txt
  • file type returned: plain text

Click "get output", then under "Select Fields from..." tick:

  • chrom
  • chromEnd
  • name
ADD REPLYlink modified 7 months ago by RamRS28k • written 4.6 years ago by Maxime Lamontagne2.2k

Thanks so much!

ADD REPLYlink written 4.6 years ago by dam4l170

Thank you so much! I followed your steps to download the chromosome/position/rs# information and was ready to do the conversion with the PLINK --update-name flag, but I realized that some positions have multiple rs#, and PLINK just stopped with an error. How can I remove the duplicates from the >100,000,000 SNPs? (I did a whole-genome imputation, so I downloaded the SNPs for the whole genome as you suggested.)

ADD REPLYlink modified 7 months ago by RamRS28k • written 4.4 years ago by chenchunhuichina0

You have two possible solutions.

  1. Do not update any SNP whose position has two or more possible rs numbers. It's not perfect, but it's a good temporary solution.
  2. Find the correct rs number by comparing alleles. On UCSC, you can download the alleles for each SNP and then choose the rs number whose alleles match yours.
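
Both options could be sketched roughly as below. This is a hedged illustration, not a tested pipeline: the function names are hypothetical, and the allele strings are assumed to look like "A/G" (similar to the "observed" field in the UCSC SNP tables). Both strands are tried when comparing alleles.

```python
from collections import defaultdict

def split_by_ambiguity(pairs):
    """Group (chr:pos, rsID) pairs by position.

    Returns (unique, ambiguous): `unique` maps chr:pos to its single rs ID,
    `ambiguous` maps chr:pos to the sorted list of competing rs IDs.
    """
    by_pos = defaultdict(set)
    for pos_id, rs_id in pairs:
        by_pos[pos_id].add(rs_id)
    unique = {p: next(iter(ids)) for p, ids in by_pos.items() if len(ids) == 1}
    ambiguous = {p: sorted(ids) for p, ids in by_pos.items() if len(ids) > 1}
    return unique, ambiguous

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def pick_by_alleles(bim_alleles, candidates):
    """Option 2: return the only candidate whose alleles match the .bim alleles.

    `bim_alleles` is a set such as {"G", "A"}; `candidates` maps rs ID to an
    allele string such as "A/G". Returns None if zero or several rs IDs match.
    """
    forward = set(bim_alleles)
    flipped = {a.translate(COMPLEMENT) for a in bim_alleles}
    hits = [rs for rs, observed in candidates.items()
            if set(observed.split("/")) in (forward, flipped)]
    return hits[0] if len(hits) == 1 else None

# Option 1: keep only positions with exactly one rs ID (placeholder IDs).
pairs = [("1:100002443", "rs1"), ("1:100002443", "rs2"), ("1:693731", "rs3")]
unique, ambiguous = split_by_ambiguity(pairs)
print(unique, ambiguous)
```

Writing only the `unique` entries to the --update-name file leaves the ambiguous positions with their original chr:pos IDs, which is what option 1 amounts to. Note that A/G and C/T are complements of each other, so allele matching alone cannot always separate two candidates.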
ADD REPLYlink modified 7 months ago by RamRS28k • written 4.4 years ago by Maxime Lamontagne2.2k

Hi,

I'm facing a similar duplicate variant ID error after using the --update-name flag in PLINK. The error is shown below.

Error: Duplicate variant ID '1:100002443' in --update-name file.

Regarding possible solution 1 (do not update any SNP with two or more possible rs numbers): what do you mean by that? How can I skip the update for chr:pos IDs that have duplicate rs IDs?

Thank you for your help!

ADD REPLYlink modified 7 months ago by RamRS28k • written 12 months ago by jongyun.jung10

Hi all! I don't know if this is a more recent thing, but I was not able to download the file using the method described here, because the download times out as the file is too big for the table browser. I found the file containing all SNPs in the UCSC directory ( http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz ).

[UPDATE] But actually, I used only the common SNPs table.
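
A file like this can be streamed without decompressing it to disk first. The sketch below assumes the dump is tab-separated with no header and that its first five columns are bin, chrom, chromStart, chromEnd and name; please check the table schema on UCSC before relying on that. The function name `iter_snp_table` is hypothetical, and the demo uses a two-row synthetic file rather than the real download.

```python
import gzip
import os
import tempfile

def iter_snp_table(path):
    """Yield (chrom, chromStart, chromEnd, name) from a gzipped SNP table dump."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Assumed layout: bin, chrom, chromStart, chromEnd, name, ...
            yield fields[1], int(fields[2]), int(fields[3]), fields[4]

# Demo on a synthetic file with placeholder rs IDs.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_snp.txt.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("585\tchr1\t693730\t693731\trs0000001\t0\t+\n")
        out.write("585\tchr1\t706991\t706992\trs0000002\t0\t+\n")
    demo_rows = list(iter_snp_table(demo_path))
print(demo_rows)
```

Reading line by line keeps memory use flat, which matters for a table with more than 100 million rows.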

ADD REPLYlink modified 13 months ago • written 14 months ago by rodd100

Thanks a lot! I have a question. Does "position" in "CHR:POSITION" SNP IDs refer to chromosome start or end position?

ADD REPLYlink written 14 months ago by bigfoot0

I think it's the end position, but double-check on the UCSC browser by viewing your SNP of interest. Bear in mind that this rule only works for single nucleotide variants (multiallelic variants do not obey this rule).
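
As a rough illustration of that rule: UCSC tables store a 0-based start and an exclusive end, so for a single-base variant chromEnd - chromStart is 1 and chromEnd equals the 1-based position used in chr:pos IDs. The helper below is a sketch under that assumption (the name `chr_pos_id` is hypothetical); it returns None for anything that does not span exactly one base.

```python
def chr_pos_id(chrom, chrom_start, chrom_end):
    """Return 'chr:pos' for a single-base variant, or None otherwise."""
    if chrom_end - chrom_start != 1:   # insertions, deletions, longer variants
        return None
    if chrom.startswith("chr"):        # match IDs such as '1:693731'
        chrom = chrom[3:]
    return f"{chrom}:{chrom_end}"

print(chr_pos_id("chr1", 693730, 693731))
print(chr_pos_id("chr1", 1000, 1000))
```

Variants for which the helper returns None would need to be handled separately or left with their original IDs.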

ADD REPLYlink modified 11 months ago • written 13 months ago by rodd100