Converting SNP from chr:pos to rs number using PLINK?
0
7
5.6 years ago
dam4l ▴ 180

Hi,

I have a .txt file with multiple columns, one of which lists SNPs in chr:pos format. Is it possible to convert from chr:pos to rs number using PLINK? If so, what commands are required?

Thanks!

0

Additionally, I have ~ 15 million SNPs.

0

Do you have a list with the corresponding rs numbers? For example: 1:123467932 rs123456789

0

No, I don't. I just have the SNPs in a .bim file with the following columns:

1    1:693731    0    693731    G    A
1    1:706992    0    706992    T    C

4

To change the SNP name, you need a list with all SNPs available.

1. Go to the UCSC Table Browser and download a list of all available SNPs with chromosome, position and rs number.
2. Create a file with chr:pos in column 1 and the rs number in column 2.
3. In PLINK, use the options --update-map and --update-name to rename your SNPs (in PLINK 1.9, --update-name on its own takes the file). WARNING: Some positions have more than one SNP.

3
• genome: Human
• assembly: Same as your SNPs
• group: Variation
• track: All SNPs (latest)
• table: All SNPs
• region: Genome
• output format: selected fields from primary and related tables
• output file: All-SNPs.txt
• file type returned: plain text

get output

Select Fields from...

• chrom
• chromEnd
• name
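
For what it's worth, turning that Table Browser output into the two-column file could look roughly like the sketch below. It assumes the output is tab-separated with the three selected fields in the order chrom, chromEnd, name, that header lines start with "#", and that the .bim IDs have no "chr" prefix (as in 1:693731). The function name and the rs numbers in the demo are made up for illustration.

```python
import io


def ucsc_to_update_name(lines):
    """Yield 'chr:pos<TAB>rsID' lines from rows of chrom, chromEnd, name."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the '#chrom ...' header line.
        if not line or line.startswith("#"):
            continue
        chrom, chrom_end, name = line.split("\t")[:3]
        # Drop the 'chr' prefix so the ID matches the 1:693731 style in the .bim.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        yield f"{chrom}:{chrom_end}\t{name}"


# Tiny in-memory demo; the rs numbers are placeholders, not real lookups.
demo = io.StringIO(
    "#chrom\tchromEnd\tname\n"
    "chr1\t693731\trs111\n"
    "chr1\t706992\trs222\n"
)
mapping = list(ucsc_to_update_name(demo))
print("\n".join(mapping))
```

With PLINK 1.9, a file written from these lines would typically be passed as something like `plink --bfile mydata --update-name mapping.txt --make-bed --out mydata_rs`, where the file names are hypothetical; by default --update-name reads the old ID from column 1 and the new ID from column 2.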
0

Thanks so much!

0

Thank you so much! I followed your method to download the chromosome/position/rs# information and was ready to do the conversion using PLINK's --update-name flag, but realized that some positions have multiple rs#, and PLINK just stopped with an error. How can I remove the duplicates from the >100,000,000 SNPs? (I did a whole-genome imputation, so I downloaded the SNPs for the whole genome as you suggested.)

1

You have two possible solutions.

1. Do not update any SNP with two or more possible rs numbers. It's not perfect, but it's a good temporary solution.
2. Find the correct rs number by comparing alleles. On UCSC, you can download the alleles for each SNP. Then choose the rs number whose alleles match yours.
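
Both options can be sketched in a few lines of Python. This is only an illustration under some assumptions: the mapping is available as (chr:pos, rs) pairs, the UCSC alleles come as a slash-separated string such as "A/G" (the "observed" field), and a match on either strand counts. All function names and rs numbers are made up.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def keep_unambiguous(pairs):
    """Option 1: keep only chr:pos IDs that map to exactly one rs ID."""
    rs_by_pos = defaultdict(set)
    for pos_id, rs_id in pairs:
        rs_by_pos[pos_id].add(rs_id)
    unique = {p: next(iter(rs)) for p, rs in rs_by_pos.items() if len(rs) == 1}
    ambiguous = {p: sorted(rs) for p, rs in rs_by_pos.items() if len(rs) > 1}
    return unique, ambiguous


def alleles_match(bim_alleles, observed):
    """True if both .bim alleles appear in 'observed', on either strand."""
    obs = set(observed.split("/"))
    same_strand = set(bim_alleles)
    other_strand = {a.translate(COMPLEMENT) for a in same_strand}
    return same_strand <= obs or other_strand <= obs


def pick_by_alleles(bim_alleles, candidates):
    """Option 2: return the single candidate rs ID whose alleles match, else None."""
    hits = [rs for rs, observed in candidates if alleles_match(bim_alleles, observed)]
    return hits[0] if len(hits) == 1 else None


# Placeholder data: one clean position, one exact repeat, one ambiguous position.
pairs = [
    ("1:693731", "rs111"),
    ("1:706992", "rs222"),
    ("1:706992", "rs222"),
    ("1:100002443", "rs333"),
    ("1:100002443", "rs444"),
]
unique, ambiguous = keep_unambiguous(pairs)
chosen = pick_by_alleles(("G", "A"), [("rs333", "A/G"), ("rs444", "C/G")])
print(unique, ambiguous, chosen)
```

Writing only the `unique` entries to the --update-name file leaves the ambiguous SNPs with their chr:pos names. With a whole-genome table, filtering the pairs down to the IDs present in your .bim first should keep memory use manageable. Note that the strand check cannot tell A/T or C/G SNPs apart, so those may still need a manual look.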
0

Hi,

I'm facing a similar duplicate variant ID error after using the --update-name flag in PLINK. The error is shown below.

Error: Duplicate variant ID '1:100002443' in --update-name file.


For possible solution 1 (do not update any SNP with two or more possible rs numbers), what do you mean by that? How can I avoid updating the chr:position IDs that have more than one rs ID?

0

Hi all! I don't know if this is a more recent thing, but I was not able to download the file using the method described here: the download times out because the file is too big for the Table Browser. I found the file containing all SNPs in the UCSC directory (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz).

[UPDATE] But actually, I used only the common SNPs table.
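
In case it helps, a file like that can be read line by line without unpacking it first. The sketch below assumes the dump is tab-separated with the columns starting bin, chrom, chromStart, chromEnd, name (worth checking against the matching .sql file in the same directory), and it optionally skips contigs with an underscore in the name, such as haplotype or random contigs. The function name and demo rows are made up.

```python
import gzip
import io


def iter_snp_rows(handle, skip_alt_contigs=True):
    """Yield (chrom, chromEnd, name) from a UCSC snp table dump, one row at a time."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: bin, chrom, chromStart, chromEnd, name, ...
        chrom, chrom_end, name = fields[1], fields[3], fields[4]
        if skip_alt_contigs and "_" in chrom:
            continue
        yield chrom, int(chrom_end), name


# In-memory stand-in for the real file; values are placeholders.
fake = (
    "585\tchr1\t693730\t693731\trs111\n"
    "585\tchr6_cox_hap2\t1000\t1001\trs222\n"
)
raw = gzip.compress(fake.encode())
with gzip.open(io.BytesIO(raw), "rt") as handle:
    rows = list(iter_snp_rows(handle))
print(rows)
```

For the real file, `gzip.open("snp151.txt.gz", "rt")` would replace the in-memory stand-in, and the rows can be written out as they are produced instead of being collected in a list.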

0

Thanks a lot! I have a question. Does "position" in "CHR:POSITION" SNP IDs refer to chromosome start or end position?

0

I think it's the end position, but double-check in the UCSC browser by viewing your SNP of interest. Bear in mind that this rule only works for single nucleotide variants; insertions, deletions and other multi-base variants do not follow it.
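
A quick way to apply that rule in a script: UCSC tables store chromStart as 0-based and chromEnd as 1-based, so a single-base variant has chromEnd - chromStart == 1 and its usual 1-based position equals chromEnd. A minimal sketch, with a made-up function name:

```python
def snv_position(chrom_start, chrom_end):
    """Return the 1-based position for a single-base record, else None."""
    # Single-base records span exactly one base in UCSC's half-open coordinates.
    if chrom_end - chrom_start == 1:
        return chrom_end
    # Insertions (start == end) and multi-base records need separate handling.
    return None


print(snv_position(693730, 693731))
print(snv_position(1000, 1000))
```

Records that return None here are the ones where chr:pos naming is not straightforward, so it may be safer to leave those out of the mapping file.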

0

Where should I put the file with chr:pos and rsID?