Question: Updating dbSNP rs ids from old SNP data
devenvyas (University of Florida) wrote 2.1 years ago:

I have downloaded some SNP data sets published in 2012 (http://www.biologiaevolutiva.org/dcomas/north-african-affy-6-0-data-henn-et-al-submitted/, http://mega.bioanth.cam.ac.uk/data/Ethiopia/).

I am trying to merge these data with more recent data, but they appear to use an older genome build (I am assuming hg18), and the rs numbers don't match my newer, existing data as well as I would expect.

For example, here is the same site as it appears in those two 2012 map files:

1    rs7519837    1.06103    1500664

1    rs7519837    0    1500664

but when you search for it in dbSNP, the coordinates are different.

How can I update the coordinates and rs IDs in these map files? Thanks!

Tags: snp, dbsnp
modified 22 months ago by kevin.wyatt.mcmahon • written 2.1 years ago by devenvyas

Detailed post on tools for converting coordinates between genome builds: Converting Genome Coordinates From One Genome Version To Another (Ucsc Liftover, Ncbi Remap, Ensembl Api)

written 17 days ago by Malachi Griffith
h.mon (Brazil) wrote 2.1 years ago:

You can use liftOver or CrossMap.

modified 2.1 years ago • written 2.1 years ago by h.mon

Those do not support PLINK format files.

written 2.1 years ago by devenvyas

You can easily convert the PLINK bim/map file into the BED format required by liftOver, then convert the result back to the PLINK format.

written 2.1 years ago by Sam

Any suggestion on how I would do that?

written 2.1 years ago by devenvyas

Perl (my choice) or awk (a general favorite around Biostars). If the file is not too large relative to your computer's memory, you can even do this in Excel or LibreOffice Calc.

written 2.1 years ago by h.mon
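
As a rough illustration of the MAP-to-BED step discussed above, here is a minimal Python sketch (Python is used only for this example; Perl or awk would work equally well). It assumes a four-column PLINK .map file (chromosome, SNP ID, genetic distance, base-pair position) and produces four-column BED lines with the SNP ID in the name column. The function names and the chromosome-code table are illustrative and are not taken from any script mentioned in this thread; mapping PLINK's XY code to chrX is a simplification.

```python
# PLINK codes for non-autosomal chromosomes -> UCSC-style names.
# Assumption: XY (pseudo-autosomal, code 25) is treated as chrX here.
PLINK_TO_UCSC = {
    "23": "chrX", "24": "chrY", "25": "chrX", "26": "chrM",
    "X": "chrX", "Y": "chrY", "XY": "chrX", "MT": "chrM",
}


def map_line_to_bed(line):
    """Convert one PLINK .map line into a 4-column BED line."""
    chrom, snp_id, _cm, pos = line.split()
    pos = int(pos)
    ucsc_chrom = PLINK_TO_UCSC.get(chrom, "chr" + chrom)
    # BED is 0-based and half-open: a SNP at 1-based position p is [p-1, p).
    return f"{ucsc_chrom}\t{pos - 1}\t{pos}\t{snp_id}"


def map_to_bed(map_path, bed_path):
    """Write a BED file for every non-empty line of a .map file."""
    with open(map_path) as src, open(bed_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(map_line_to_bed(line) + "\n")
```

Applied to the example line from the question, `map_line_to_bed("1    rs7519837    1.06103    1500664")` gives `chr1`, `1500663`, `1500664`, `rs7519837` separated by tabs. Keeping the rs ID in the name column is what makes it possible to match lifted records back to the MAP rows afterwards.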

I found a script that converts PLINK MAP to UCSC BED.

However, there is still a big problem with liftOver. It drops a number of records, which means I can't simply convert the output back to MAP, because the row counts no longer match.

Successfully converted 274294 records

Conversion failed on 5237 records.

written 2.1 years ago by devenvyas
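
One way to cope with the mismatched row counts is to match records by SNP ID instead of by row. The sketch below is a minimal example under that assumption: the lifted BED keeps the rs ID in its fourth column, so each MAP row can be looked up by ID, and SNPs that failed conversion are collected in a separate list. The function name and return shape are hypothetical. Note that the updated MAP rows only line up with the genotype file once the dropped SNPs have also been removed from it; PLINK's `--exclude` and `--update-map` options are the usual route for that, but check the documentation for your PLINK version.

```python
def apply_lifted_positions(map_lines, lifted_bed_lines):
    """Return (updated_map_lines, dropped_ids), matching records by SNP ID."""
    lifted = {}
    for line in lifted_bed_lines:
        if not line.strip():
            continue
        chrom, _start, end, snp_id = line.split()[:4]
        # For a single-base BED interval, the end equals the 1-based position.
        lifted[snp_id] = (chrom, end)

    updated, dropped = [], []
    for line in map_lines:
        if not line.strip():
            continue
        _old_chrom, snp_id, cm, _old_pos = line.split()
        if snp_id in lifted:
            chrom, pos = lifted[snp_id]
            # Strip the UCSC "chr" prefix; further renaming (e.g. M vs MT)
            # may be needed depending on what your PLINK version accepts.
            if chrom.startswith("chr"):
                chrom = chrom[3:]
            updated.append(f"{chrom}\t{snp_id}\t{cm}\t{pos}")
        else:
            dropped.append(snp_id)  # failed liftOver: exclude from the dataset
    return updated, dropped
```

Writing `dropped` to a text file, one ID per line, gives a list that can be used to remove those SNPs from the genotype data before the new positions are applied.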

@h.mon, liftOver doesn't reflect the rs number changes, right?

written 19 months ago by dora
devenvyas (University of Florida) wrote 2.1 years ago:

I found a script that does the coordinate update for you. I still need to find out how to update the dbSNP rs IDs.

http://genome.sph.umich.edu/wiki/LiftMap.py

written 2.1 years ago by devenvyas

It's an ID. What can you change?

written 2.1 years ago by karl.stamm

dbSNP rs IDs get updated over time. For example, when two SNPs later turn out to be the same SNP, dbSNP retires one of the rs IDs as a synonym of the other. Old PLINK data, however, are stuck with the old rs ID.

written 2.1 years ago by devenvyas
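
The merge behaviour described above can be sketched as a lookup from retired rs IDs to their current replacements. This is a minimal example that assumes you have already loaded such a mapping into a dictionary; dbSNP has published merge-history tables (for example RsMergeArch in its older dumps), but how to parse one into this form depends on the file layout and is not assumed here. The function names and the rs IDs used in the usage note are hypothetical.

```python
def resolve_rsid(rsid, merged_into):
    """Follow old -> new merge links until an ID has no further replacement."""
    seen = set()
    # An ID can be merged more than once, so follow the chain; the `seen`
    # set guards against accidental cycles in the mapping.
    while rsid in merged_into and rsid not in seen:
        seen.add(rsid)
        rsid = merged_into[rsid]
    return rsid


def update_map_ids(map_lines, merged_into):
    """Rewrite the SNP ID column of PLINK .map lines using the merge table."""
    updated = []
    for line in map_lines:
        if not line.strip():
            continue
        chrom, snp_id, cm, pos = line.split()
        new_id = resolve_rsid(snp_id, merged_into)
        updated.append("\t".join([chrom, new_id, cm, pos]))
    return updated
```

For instance, with the made-up mapping `{"rs3": "rs2", "rs2": "rs1"}`, `resolve_rsid("rs3", ...)` returns `"rs1"`, while an ID that is absent from the mapping is returned unchanged. After such an update, two rows may end up sharing one rs ID, so it is worth checking for duplicates before merging datasets.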

That's true. Did you find a way to solve this?

written 19 months ago by dora
kevin.wyatt.mcmahon (United States) wrote 22 months ago:

I have a question related to this, so I thought I might just stay on the same thread:

I have a VCF file of variants (what else?) from hg18-aligned sequences. I need to convert these variants to hg19.

My question is: do I need to be concerned about the difference in sequence between the two genome versions?  

For example:

One of my variants is at chr10 8365, and the hg18 reference sequence is a T, but our BAM file found a C at that spot.

In this case, hg19 also has a T at that spot, but could it possibly be different? And if so, are there any tools to account for sequence position differences?

Thanks in advance!


Wyatt

written 22 months ago by kevin.wyatt.mcmahon
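
The reference base at a lifted position can in principle differ between builds, so a simple sanity check after conversion is to compare each variant's REF allele with the base found at its new coordinate. The sketch below assumes you have already looked up that base (for example from a FASTA of the target build); the function name and the category labels are hypothetical, and it only handles single-nucleotide variants.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def check_ref_allele(ref, alt, new_ref_base):
    """Classify a lifted-over SNV against the base in the target build."""
    ref, alt, new_ref_base = ref.upper(), alt.upper(), new_ref_base.upper()
    if new_ref_base == ref:
        return "ok"  # reference base unchanged between builds
    if new_ref_base == alt:
        return "ref_alt_swapped"  # the new build carries the former ALT base
    if new_ref_base == ref.translate(COMPLEMENT):
        # The region may sit on the opposite strand in the new build.
        # A/T and C/G variants are ambiguous and are reported as swapped above.
        return "possible_strand_flip"
    return "mismatch"  # needs manual inspection
```

For the variant in the question (REF T, ALT C), a T at the new position would be classified as `"ok"`, whereas a C there would be reported as `"ref_alt_swapped"`, meaning the sample matches the new reference at that site.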